In my Rook Ceph series, I first gave an introduction to Rook Ceph, then shared the procedure for extending a disk. With this blog post, I'd like to show how your files can be accessed in such a cluster. We are using the Ceph File System (CephFS), and if you are new to it, here are some useful tips and tricks to know.

We saw in the introduction of Rook Ceph that our files are converted to objects in order to be manipulated, replicated and stored on physical hard disks. So you may wonder if there is a way to see those files directly in the Rook Ceph cluster and how they are organised. Read on to find out!

Storage in the Kubernetes cluster

When Rook Ceph is installed in a Kubernetes cluster, two storage classes are available in the cluster: RBD (block storage) and CephFS (shared filesystem). You can check them as shown below:

$ kubectl get sc
NAME                        PROVISIONER                            RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ceph-block                  rookceph.rbd.csi.ceph.com      Delete          Immediate           true                   20d
ceph-filesystem (default)   rookceph.cephfs.csi.ceph.com   Delete          Immediate           true                   20d

Rook Ceph automatically creates a Persistent Volume (PV) for each Persistent Volume Claim (PVC) request. You can view them with the usual kubectl get pv and kubectl get pvc commands. If you want to increase the size of a PV, increase the request of its PVC by editing it, for example with the following command:

$ kubectl edit pvc pvc-35e9b837-1cc1-4668-9486-070fe2ac1e39 -n rookceph
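As an alternative to interactive editing, the storage request can be patched directly. A minimal sketch, assuming the same PVC name and namespace as above; the target size of 250Gi is an illustrative value:

```shell
# Illustrative example: raise the PVC storage request to 250Gi
# (PVC name, namespace and size are the examples from this post)
kubectl patch pvc pvc-35e9b837-1cc1-4668-9486-070fe2ac1e39 -n rookceph \
  --type merge -p '{"spec":{"resources":{"requests":{"storage":"250Gi"}}}}'
```

Note that the storage class must have ALLOWVOLUMEEXPANSION set to true (as shown in the storage class listing above) for the resize to take effect.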

Connect to CephFS through a Pod

The first step is to get the direct-mount deployment and apply it in the Kubernetes cluster. This file needs to be modified first with the Rook Ceph namespace you are currently using in the Kubernetes cluster. For this example we will use the namespace rookceph. You can then check the created Pod and connect to it:

$ kubectl get pods -n rookceph | grep direct-mount
rook-direct-mount-786f8fb967-tjmjd                                1/1     Running     0          3h43m

$ kubectl exec -it rook-direct-mount-786f8fb967-tjmjd -n rookceph -- bash

There are then several steps to perform in this Pod to access the files in CephFS:

# Create the directory
$ mkdir /tmp/rookmount

# Detect the monitor (MON) endpoints and the admin user secret for the connection
$ mon_endpoints=$(grep mon_host /etc/ceph/ceph.conf | awk '{print $3}')
$ my_secret=$(grep key /etc/ceph/keyring | awk '{print $3}')

# Mount the filesystem
$ mount -t ceph -o name=admin,secret=$my_secret $mon_endpoints:/ /tmp/rookmount
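The two grep/awk extractions above simply take the third whitespace-separated field of the matching line. You can try this outside the Pod on a sample ceph.conf; the file content below is illustrative, not from a real cluster:

```shell
# Illustrative copy of what /etc/ceph/ceph.conf looks like inside the Pod
cat > /tmp/ceph.conf <<'EOF'
[global]
mon_host = 10.43.12.1:6789,10.43.12.2:6789
EOF

# Same extraction as in the Pod: "mon_host" is field 1, "=" field 2, the value field 3
mon_endpoints=$(grep mon_host /tmp/ceph.conf | awk '{print $3}')
echo "$mon_endpoints"   # → 10.43.12.1:6789,10.43.12.2:6789
```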

Files organisation in CephFS

From this Pod we now have a view of all files stored in this CephFS. They are located under the following folder:

# pwd
/tmp/rookmount/volumes/csi

=> Under this folder there is one “csi-vol-…” folder per PV created in the Kubernetes cluster.
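Schematically, the layout under /tmp/rookmount/volumes/csi can be reproduced locally to visualise it; the directory names below are illustrative, following the csi-vol naming seen in the cluster:

```shell
# Illustrative reconstruction of the CephFS layout: one csi-vol-* folder per PV
mkdir -p /tmp/demo/volumes/csi/csi-vol-73d78a31-9641-11ed-9175-be56cced9151
mkdir -p /tmp/demo/volumes/csi/csi-vol-36eb3462-9640-11ed-9175-be56cced9151
ls /tmp/demo/volumes/csi
```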

In another terminal window, connect to this Kubernetes cluster and check the description of one PV. We can then see the link between this PV and the folder visible in our Pod that mounts the CephFS. For example, let's have a look at a PV used by Nexus:

$ kubectl describe pv pvc-e84861f0-fe9a-409f-8198-de0b3ee5f7d0
Name:            pvc-e84861f0-fe9a-409f-8198-de0b3ee5f7d0
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: rookceph.cephfs.csi.ceph.com
                 volume.kubernetes.io/provisioner-deletion-secret-name: rook-csi-cephfs-provisioner
                 volume.kubernetes.io/provisioner-deletion-secret-namespace: rookceph
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    ceph-filesystem
Status:          Bound
Claim:           nexus/nexus-nexus-repository-manager-data
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        200Gi
Node Affinity:   <none>
Message:
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            rookceph.cephfs.csi.ceph.com
    FSType:            ext4
    VolumeHandle:      0001-0010-rookceph-0000000000000001-73d78a31-9641-11ed-9175-be56cced9151
    ReadOnly:          false
    VolumeAttributes:      clusterID=rookceph
                           fsName=ceph-filesystem
                           pool=ceph-filesystem-data0
                           storage.kubernetes.io/csiProvisionerIdentity=1673943489984-8081-rookceph.cephfs.csi.ceph.com
                           subvolumeName=csi-vol-73d78a31-9641-11ed-9175-be56cced9151
                           subvolumePath=/volumes/csi/csi-vol-73d78a31-9641-11ed-9175-be56cced9151/9b3f9bda-0710-4db6-bf04-e48d038ab6bf
Events:                <none>

=> The value of the subvolumeName parameter of this PV matches the name of the folder we can see in the mount directory of our Pod.
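This mapping can also be scripted. On a live cluster the subvolume name can be read with kubectl and a jsonpath expression (PV name from above); below, the same extraction is sketched on an illustrative JSON fragment so it can be tried without a cluster:

```shell
# On a live cluster (PV name is the example from above):
#   kubectl get pv pvc-e84861f0-fe9a-409f-8198-de0b3ee5f7d0 \
#     -o jsonpath='{.spec.csi.volumeAttributes.subvolumeName}'

# Illustrative fragment of the same PV as returned by `kubectl get pv -o json`
cat > /tmp/pv.json <<'EOF'
{"spec":{"csi":{"volumeAttributes":{"subvolumeName":"csi-vol-73d78a31-9641-11ed-9175-be56cced9151"}}}}
EOF

# Extract the subvolume name, i.e. the folder name under /tmp/rookmount/volumes/csi
grep -o '"subvolumeName":"[^"]*"' /tmp/pv.json | cut -d'"' -f4
```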

So the corresponding folder of this PV is:

# ls | grep 73d78a31
csi-vol-73d78a31-9641-11ed-9175-be56cced9151

If you know the name of the file you are looking for, the easiest way to find it is to search from the /tmp/rookmount/volumes/csi folder:

# find . -name dump.rdb
./csi-vol-36eb3462-9640-11ed-9175-be56cced9151/288b364c-340f-4d2f-bec6-b9a4b7daa2b0/dump.rdb
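A file located this way can then be copied out of the Pod from another terminal, for example with kubectl cp; the namespace, Pod name and path below are the examples used in this post:

```shell
# Copy dump.rdb from the direct-mount Pod to the local machine
# (namespace, Pod name and file path are the examples from above)
kubectl cp rookceph/rook-direct-mount-786f8fb967-tjmjd:/tmp/rookmount/volumes/csi/csi-vol-36eb3462-9640-11ed-9175-be56cced9151/288b364c-340f-4d2f-bec6-b9a4b7daa2b0/dump.rdb ./dump.rdb
```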

When you are done exploring the files, do not forget to unmount the filesystem in the Pod before exiting it!
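For completeness, the cleanup inside the Pod looks like this, using the mount point created earlier:

```shell
# Unmount CephFS and remove the temporary mount point before exiting the Pod
umount /tmp/rookmount
rmdir /tmp/rookmount
```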

Conclusion

By accessing files directly in the Ceph File System, you can not only view them but also make a copy to transfer them elsewhere, for example. It can also help when doing some troubleshooting. I hope you have learned something useful you can use in your environment!