4.3. Getting Started

Our recommended way to install a production-ready EOS instance is using the RPM-based configuration. A Kubernetes-based demonstrator is available for testing purposes.

EOS installation with Kubernetes and Helm for test / demonstration

The EOS Charts repository provides Helm charts for the deployment of EOS in Kubernetes for test and demonstration purposes. A working Kubernetes cluster (v1.20.15 or newer) and the Helm package manager (v3.8.0 or newer) are required.
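
Before installing, you can verify that your tooling meets these requirements with the standard kubectl and Helm commands:

kubectl version
helm version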

The installation is fully automated via the server chart, which deploys 1 MGM, 3 QDB instances in cluster mode, and 4 FSTs.

helm install eos oci://registry.cern.ch/eos/charts/server

The resulting cluster will consist of 8 pods:

kubectl get pods
NAME        READY   STATUS    RESTARTS   AGE
eos-fst-0   1/1     Running   0          4m47s
eos-fst-1   1/1     Running   0          79s
eos-fst-2   1/1     Running   0          69s
eos-fst-3   1/1     Running   0          59s
eos-mgm-0   2/2     Running   0          4m47s
eos-qdb-0   1/1     Running   0          4m47s
eos-qdb-1   1/1     Running   0          2m21s
eos-qdb-2   1/1     Running   0          2m6s

…and will be configured with relevant defaults to make it a fully-working instance:

eos ns
# ------------------------------------------------------------------------------------
# Namespace Statistics
# ------------------------------------------------------------------------------------
ALL      Files                            5 [booted] (0s)
ALL      Directories                      11
ALL      Total boot time                  0 s
# ------------------------------------------------------------------------------------
ALL      Compactification                 status=off waitstart=0 interval=0 ratio-file=0.0:1 ratio-dir=0.0:1
# ------------------------------------------------------------------------------------
ALL      Replication                      mode=master-rw state=master-rw master=eos-mgm-0.eos-mgm.default.svc.cluster.local configdir=/var/eos/config/ config=default
# ------------------------------------------------------------------------------------
{...cut...}

eos fs ls
┌───────────────────────────────────────────┬────┬──────┬────────────────────────────────┬────────────────┬────────────────┬────────────┬──────────────┬────────────┬────────┬────────────────┐
│host                                       │port│    id│                            path│      schedgroup│          geotag│        boot│  configstatus│       drain│  active│          health│
└───────────────────────────────────────────┴────┴──────┴────────────────────────────────┴────────────────┴────────────────┴────────────┴──────────────┴────────────┴────────┴────────────────┘
 eos-fst-0.eos-fst.default.svc.cluster.local 1095      1                     /fst_storage        default.0      docker::k8s       booted             rw      nodrain   online              N/A
 eos-fst-1.eos-fst.default.svc.cluster.local 1095      2                     /fst_storage        default.1      docker::k8s       booted             rw      nodrain   online              N/A
 eos-fst-2.eos-fst.default.svc.cluster.local 1095      3                     /fst_storage        default.2      docker::k8s       booted             rw      nodrain   online              N/A
 eos-fst-3.eos-fst.default.svc.cluster.local 1095      4                     /fst_storage        default.3      docker::k8s       booted             rw      nodrain   online              N/A
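
When you are done testing, the demonstrator can be removed with the standard Helm command (the release name eos matches the install command above):

helm uninstall eos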

EOS up in a few minutes

We will start by setting up a complete EOS installation on a single physical machine using the EOS5 configuration method. Later we demonstrate how to add high availability to MGMs/QDBs and how to scale out FST nodes. All commands have to be issued using the root account!

Grab a machine, preferably with Alma9 (or Alma8 or CentOS7). Only the repository setup differs between these platforms (shown is the Alma9 installation; just replace 9 with 7 or 8 in the URLs):

Installation

yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside/tag/testing/el-9/x86_64/"
yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-9/x86_64/"
yum install -y eos-server eos-quarkdb eos-fusex jemalloc-devel --nogpgcheck

Unique Instance Shared Secret

Every instance should run with a unique instance-private shared secret. This can be easily created using:

eos daemon sss recreate

The command creates a local file /etc/eos.keytab storing the instance-specific shared secret needed by the MGM, FST and MQ services (and additionally a shared secret /etc/eos/fuse.sss.keytab useful for clients doing FUSE mounts in combination with EOS tokens).
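
As a quick sanity check you can verify that both keytab files were created and inspect their ownership and permissions:

ls -l /etc/eos.keytab /etc/eos/fuse.sss.keytab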

Start Services

We will start the services manually to get a better understanding of the procedure and the configuration used.

To shorten the setup we disable the firewall for the moment. The ports to open in the firewall are explained later.

systemctl stop firewalld

Note

After each daemon run the shell will hang with the daemon in the foreground. If the startup fails, the process exits. If the startup is successful, use Control-Z and type `bg` to put the process in the background, then continue with the next service until all services have been started.

# start QuarkDB on this host
eos daemon run qdb
# start MGM on this host
eos daemon run mgm
# allow this host to connect as an FST
eos node set `hostname -f`:1095 on
# start FST on this host
eos daemon run fst

Note

Each command prints the commands executed during the daemon initialization phase and the XRootD configuration file used. Under the hood, each EOS service is an XRootD server process with a dedicated plug-in and configuration. The init phases are designed so that a service can start without any customized configuration, relying on good defaults.

You should be able to see the running daemons with:

ps aux | grep eos

The production way to do this is to run

# start QuarkDB
systemctl start eos5-qdb@qdb
# start MGM
systemctl start eos5-mgm@mgm
# allow this host to connect as an FST
eos node set `hostname -f`:1095 on
# start FST on this host
systemctl start eos5-fst@fst

and to enable the services in the boot procedure:

systemctl enable eos5-qdb@qdb
systemctl enable eos5-mgm@mgm
systemctl enable eos5-fst@fst
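
To confirm that the units are running and enabled at boot, standard systemd tooling is sufficient:

systemctl status eos5-qdb@qdb eos5-mgm@mgm eos5-fst@fst
systemctl is-enabled eos5-qdb@qdb eos5-mgm@mgm eos5-fst@fst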

Using the CLI

Your EOS installation is now up and running. We now start the CLI to inspect and configure our EOS instance:

[root@vm root]# eos version
EOS_INSTANCE=eosdev
EOS_SERVER_VERSION=5.2.5 EOS_SERVER_RELEASE=5.2.5
EOS_CLIENT_VERSION=5.2.5 EOS_CLIENT_RELEASE=5.2.5

[root@vm root]# eos whoami
Virtual Identity: uid=0 (0,3,99) gid=0 (0,4,99) [authz:sss] sudo* host=localhost domain=localdomain

You can navigate the namespace using well known commands:

[root@vm root]# eos ls -la /eos/
drwxrwx--x   1 root     root            23249 Jan  1  1970 .
drwxr-x--x   1 root     root                0 Jan  1  1970 ..
drwxrwx--x   1 root     root            23249 Aug 18 17:28 dev

The default EOS instance name is eosdev, and in every EOS instance you will find a pre-created directory structure like the one shown:

[root@vm root]# eos find -d /eos/
path=/eos/
path=/eos/dev/
path=/eos/dev/proc/
path=/eos/dev/proc/archive/
path=/eos/dev/proc/clone/
path=/eos/dev/proc/conversion/
path=/eos/dev/proc/recycle/
path=/eos/dev/proc/tape-rest-api/
path=/eos/dev/proc/tape-rest-api/bulkrequests/
path=/eos/dev/proc/tape-rest-api/bulkrequests/evict/
path=/eos/dev/proc/tape-rest-api/bulkrequests/stage/
path=/eos/dev/proc/token/
path=/eos/dev/proc/tracker/
path=/eos/dev/proc/workflow/

All EOS instance names have to start with the eos prefix (e.g. eosxyz). If you configure your EOS instance with the name eosfoo, an automatic structure created during MGM startup will look like this:

/eos/
/eos/foo/
...

Adding Storage Space

The first thing we do is to create the default space, which will host all our filesystems:

eos space define default

Now we want to attach local disk space to our EOS instance in the default space. In this example we register six filesystems. The filesystems can reside on a single partition or on individual partitions.

# create six directories to be used as separate EOS filesystems and own them with the `daemon` account
for name in 01 02 03 04 05 06; do
    mkdir -p /data/fst/$name
    chown daemon:daemon /data/fst/$name
done
# register all sub-directories under /data/fst as EOS filesystems
eosfstregister -r localhost /data/fst/ default:6

The eosfstregister command lists all directories under /data/fst/ and registers 6 filesystems to the default space, as indicated by the parameter default:6, with the MGM running on localhost (see eosfstregister -h for the command syntax). Before filesystems are usable, they have to be owned by the daemon account.

We now do one additional step. By default EOS places each filesystem of the same node into a separate placement group, so it creates 6 scheduling groups default.0, default.1, …, default.5 and places filesystem 1 in default.0, filesystem 2 in default.1, and so on. To write a file EOS selects a group and places the file within that single group. If you want to write files with two replicas you need at least 2 filesystems per group; if you want to use erasure coding, e.g. RAID6, you need at least 6 filesystems per group. Therefore we now move all disks into the default.0 group (disk 1 is already in default.0):

for name in 2 3 4 5 6; do eos fs mv --force $name default.0; done

Exploring EOS Views

Now you are ready to check out the four views EOS provides:

eos space ls
eos node ls
eos group ls
eos fs ls

All these commands take additional output options to provide more information, e.g. eos space ls -l or eos space ls --io. You will notice that in all these views you either see active=0 or offline. This is because we have registered the filesystems but have not enabled them yet.
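
The extended listing options mentioned above are invoked like this (output omitted):

eos space ls -l
eos space ls --io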

Enabling EOS Space

The last step before using our storage setup is to enable the default space:

eos space set default on

Enabling the space means to enable all nodes, groups and filesystems in that space.

Now you can see everything as online and active in the four views.

Read and Write using CLI

We can now upload and download our first file to our storage system. We will create a new directory and define a storage policy to store files as single-replica files (one copy):

eos mkdir /eos/dev/test/                            #create directory
eos attr set default=replica /eos/dev/test/         #define default replication policy
eos attr set sys.forced.nstripes=1 /eos/dev/test/   #define to have one replica only
eos chmod 777 /eos/dev/test/                        #allow everybody to write here
eos cp /etc/passwd /eos/dev/test/                   #upload a test file
eos cp /eos/dev/test/passwd /tmp/passwd             #download the test file
diff /etc/passwd /tmp/passwd                        #compare with original file

You can list the directory where the file was stored:

eos ls -l /eos/dev/test/

and you can find out a lot of information about this file, e.g. the adler32 checksum, which was configured automatically by eos attr set default=replica /eos/dev/test/, and the location of our file (on which filesystem the file has been stored):

eos file info /eos/dev/test/passwd

Read and Write using /eos/ mounts

We can FUSE mount our EOS instance on the same node by just doing:

mkdir -p /eos/
# put your host.domain name in the command
eosxd -ofsname=host.domain:/eos/ /eos/

An alternative to running the eosxd executable is to use the FUSE mount type:

mount -t fuse eosxd -ofsname=host.domain:/eos/ /eos/

Either way, you should be able to see the mount and the configured space using df:

df /eos/

All the usual shell commands will now also work on the FUSE mount.

Note

Be aware that the default FUSE mount does not map the current uid/gid to the same uid/gid inside EOS. Moreover root access is always squashed to uid,gid=99 (nobody).

In summary, on this FUSE mount with the default configuration on localhost you are mapped to the user nobody inside EOS. If you copy a file via this FUSE mount to /eos/dev/test/, the file will be owned by 99/99.
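
To illustrate both points (a small sketch; the copied file name is arbitrary and the mapping assumes the default configuration described above):

# regular shell tools work on the mount
ls -l /eos/dev/test/
# a file copied by root via the mount is squashed to nobody (uid/gid 99/99)
cp /etc/hosts /eos/dev/test/hosts.copy
ls -ln /eos/dev/test/hosts.copy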

Firewall Configuration for external Access

To make your instance accessible from outside you have to make sure that all the relevant ports are open for incoming traffic.

Here is a list of ports used by the various services:

Service          Port
---------------  ----
MGM (XRootD)     1094
MGM (FUSE ZMQ)   1100
FST (XRootD)     1095
QDB (REDIS)      7777

If port 1100 is not open, FUSE access still works, but FUSE clients are not displayed as online and do not receive callbacks for meta-data changes, i.e. changes made on another client are not immediately visible.

systemctl start firewalld
for port in 1094 1095 1097 1100 7777; do
  firewall-cmd --zone=public --permanent --add-port=$port/tcp
done
# permanent rules become active only after a reload
firewall-cmd --reload

Single Node Quick Setup Code Snippet

yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside/tag/testing/el-9/x86_64/"
yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-9/x86_64/"
yum install -y eos-server eos-quarkdb eos-fusex jemalloc-devel --nogpgcheck

systemctl start firewalld
for port in 1094 1095 1100 7777; do
  firewall-cmd --zone=public --permanent --add-port=$port/tcp
done
firewall-cmd --reload

eos daemon sss recreate

systemctl start eos5-qdb@qdb
systemctl start eos5-mgm@mgm
eos node set `hostname -f`:1095 on
systemctl start eos5-fst@fst

sleep 30

for name in 01 02 03 04 05 06; do
  mkdir -p /data/fst/$name;
  chown daemon:daemon /data/fst/$name
done

eos space define default

eosfstregister -r localhost /data/fst/ default:6

for name in 2 3 4 5 6; do eos fs mv --force $name default.0; done

eos space set default on

eos mkdir /eos/dev/rep-2/
eos mkdir /eos/dev/ec-42/
eos attr set default=replica /eos/dev/rep-2/
eos attr set default=raid6 /eos/dev/ec-42/
eos chmod 777 /eos/dev/rep-2/
eos chmod 777 /eos/dev/ec-42/

mkdir -p /eos/
eosxd -ofsname=`hostname -f`:/eos/ /eos/
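
As a quick end-to-end check of the freshly configured instance, you can upload a file into one of the new directories and inspect its placement, using the commands introduced earlier in this section:

eos cp /etc/passwd /eos/dev/rep-2/passwd
eos file info /eos/dev/rep-2/passwd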

Adding FSTs to a single node setup

yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside/tag/testing/el-9/x86_64/"
yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-9/x86_64/"
yum install -y eos-server jemalloc-devel --nogpgcheck

systemctl start firewalld
for port in 1095; do
  firewall-cmd --zone=public --permanent --add-port=$port/tcp
done
firewall-cmd --reload

# On the FST node configure the MGM node in /etc/eos/config/generic/all
SERVER_HOST=mgmnode.domain

# Copy /etc/eos.keytab from the MGM node to /etc/eos.keytab on the new FST node
scp root@mgmnode.domain:/etc/eos.keytab /etc/eos.keytab

# On the MGM, allow the new node to connect as an FST
@mgm: eos node set fstnode.domain:1095 on

# Start FST service
systemctl start eos5-fst@fst
systemctl enable eos5-fst@fst

# Verify Node online
@mgm: eos node ls
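
The new node only contributes capacity once filesystems are registered on it. A sketch mirroring the single-node example (the number of filesystems, the paths and the target space are placeholders to adapt to your layout; the MGM host replaces the localhost argument used earlier):

# on the new FST node: prepare the mount points and register them with the MGM
for name in 01 02; do
  mkdir -p /data/fst/$name
  chown daemon:daemon /data/fst/$name
done
eosfstregister -r mgmnode.domain /data/fst/ default:2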

Expanding single node MGM/QDB setup to HA cluster

In a production environment the QDB and MGM services need to be highly available. We show here how to configure three co-located QDB+MGM nodes. In this example the three nodes are called node1.domain, node2.domain and node3.domain. We assume your running MGM is node1 and the new nodes are node2 and node3.

yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside/tag/testing/el-9/x86_64/"
yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-9/x86_64/"
yum install -y eos-server eos-quarkdb eos-fusex jemalloc-devel --nogpgcheck

systemctl start firewalld
for port in 1094 1100 7777; do
  firewall-cmd --zone=public --permanent --add-port=$port/tcp
done
firewall-cmd --reload

# Copy /etc/eos.keytab from the MGM node to /etc/eos.keytab on the new MGM nodes
scp root@node1:/etc/eos.keytab /etc/eos.keytab

# Create observer QDB nodes on node2 and node3
eos daemon config qdb qdb new observer

# Start QDB on node2 and node3
systemctl start eos5-qdb@qdb
systemctl enable eos5-qdb@qdb

# Add node2 & node3 as observers on node1
@node1: redis-cli -p 7777
@node1: 127.0.0.1:7777> raft-add-observer node2.domain:7777
@node1: 127.0.0.1:7777> raft-add-observer node3.domain:7777

( this is equivalent to 'eos daemon config qdb qdb add node2.domain:7777' but broken in the release version )

# node2 & node3 get contacted by node1 and start syncing the raft log

# Promote node2 and node3 as full members
@node1: redis-cli -p 7777
@node1: 127.0.0.1:7777> raft-promote-observer node2.domain:7777
@node1: 127.0.0.1:7777> raft-promote-observer node3.domain:7777

( this is equivalent to 'eos daemon config qdb qdb promote node2.domain:7777' )

# Verify RAFT status on any QDB node
redis-cli -p 7777
127.0.0.1:7777> raft-info

( this is equivalent to 'eos daemon config qdb qdb info' )

# Startup MGM services
@node2: systemctl start eos5-mgm@mgm
@node3: systemctl start eos5-mgm@mgm

# You can connect on each node using the eos command to the local MGM
@node1:  eos ns | grep master
ALL      Replication                      is_master=true master_id=node1.domain:1094
@node2:  eos ns | grep master
ALL      Replication                      is_master=false master_id=node1.domain:1094
@node3:  eos ns | grep master
ALL      Replication                      is_master=false master_id=node1.domain:1094

# You can force the QDB leader to a given node e.g.
@node2: eos daemon config qdb qdb coup

# you can force the active MGM to run on a given node by running on the current active MGM:
@node1: eos ns master node2.domain:1094
success: current master will step down

Three Node Quick Setup Code Snippet

You can also setup a three node cluster from scratch right from the beginning, which is shown here:

# on all three nodes do
killall -9 xrootd     # make sure no daemons are running
rm -rf /var/lib/qdb/  # wipe previous QDB database

yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside/tag/testing/el-9/x86_64/"
yum-config-manager --add-repo "https://storage-ci.web.cern.ch/storage-ci/eos/diopside-depend/el-9/x86_64/"
yum install -y eos-server eos-quarkdb eos-fusex jemalloc-devel --nogpgcheck

systemctl start firewalld
for port in 1094 1095 1097 1100 7777; do
  firewall-cmd --zone=public --permanent --add-port=$port/tcp
done
firewall-cmd --reload

Now edit /etc/eos/config/qdb/qdb on all three nodes and change the QDB_NODES variable definition from:

QDB_NODES=${QDB_HOST}:${QDB_PORT}

to

QDB_NODES=node1.cern.ch:7777,node2.cern.ch:7777,node3.cern.ch:7777
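
If you prefer to script this step, the same change can be applied non-interactively on each node (a sketch; adjust the hostnames to your setup):

sed -i 's|^QDB_NODES=.*|QDB_NODES=node1.cern.ch:7777,node2.cern.ch:7777,node3.cern.ch:7777|' /etc/eos/config/qdb/qdb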

Create new instance sss keys on one node and copy them to the other two nodes:

# node 1
eos daemon sss recreate
# copy to node2,node3
scp /etc/eos.keytab root@node2:/etc/eos.keytab
scp /etc/eos/fuse.sss.keytab root@node2:/etc/eos/fuse.sss.keytab
scp /etc/eos.keytab root@node3:/etc/eos.keytab
scp /etc/eos/fuse.sss.keytab root@node3:/etc/eos/fuse.sss.keytab
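
To make sure the keys are really identical on all three nodes, a simple checksum comparison is sufficient:

md5sum /etc/eos.keytab
ssh root@node2 md5sum /etc/eos.keytab
ssh root@node3 md5sum /etc/eos.keytab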

Now start QDB on all three nodes:

systemctl start eos5-qdb@qdb

Now you can inspect the RAFT state on all QDBs:

eos daemon config qdb qdb info

1) TERM 1
2) LOG-START 0
3) LOG-SIZE 2
4) LEADER node2.cern.ch:7777
5) CLUSTER-ID eosdev
6) COMMIT-INDEX 1
7) LAST-APPLIED 1
8) BLOCKED-WRITES 0
9) LAST-STATE-CHANGE 293 (4 minutes, 53 seconds)
10) ----------
11) MYSELF node2.cern.ch:7777
12) VERSION 5.1..5.1.3
13) STATUS LEADER
14) NODE-HEALTH GREEN
15) JOURNAL-FSYNC-POLICY sync-important-updates
16) ----------
17) MEMBERSHIP-EPOCH 0
18) NODES node1.cern.ch:7777,node2.cern.ch:7777,node3.cern.ch:7777
19) OBSERVERS
20) QUORUM-SIZE 2

Here you see that the current LEADER is node2.cern.ch. If you want to force node1.cern.ch to become the leader, you can type on node1:

[root@node1 ] eos daemon config qdb qdb coup

and verify using

eos daemon config qdb qdb info

who the new LEADER is.

Now we start the MGM on all three nodes:

[root@node1] systemctl start eos5-mgm@mgm
[root@node2] systemctl start eos5-mgm@mgm
[root@node3] systemctl start eos5-mgm@mgm

You can connect to the MGM on each node.

[root@node1] eos ns | grep Replication
ALL      Replication                      is_master=true master_id=node1.cern.ch:1094
[root@node2] eos ns | grep Replication
ALL      Replication                      is_master=false master_id=node1.cern.ch:1094
[root@node3] eos ns | grep Replication
ALL      Replication                      is_master=false master_id=node1.cern.ch:1094

The three MGMs use a lease mechanism to acquire the active role. If you want to manually push the active role from node1 to node2, you do:

[root@node1 ] eos ns master node2.cern.ch

Once the current lease has expired, the master changes:

[root@node1] eos ns | grep Replication
ALL      Replication                      is_master=false master_id=node2.cern.ch:1094
[root@node2] eos ns | grep Replication
ALL      Replication                      is_master=true master_id=node2.cern.ch:1094
[root@node3] eos ns | grep Replication
ALL      Replication                      is_master=false master_id=node2.cern.ch:1094

Note

Sometimes you might observe master changes under heavy load. If this happens too frequently you can increase the lease time by modifying the sysconfig variable of the MGM daemon, e.g. to set a 120s lease time (instead of the default 60s) you define:

EOS_QDB_MASTER_LEASE_MS="120000"
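
For example, assuming the MGM sysconfig file follows the same layout as the QDB one used above (/etc/eos/config/mgm/mgm is an assumption; adjust to wherever your MGM reads its environment):

# assumed sysconfig location - verify for your installation
echo 'EOS_QDB_MASTER_LEASE_MS="120000"' >> /etc/eos/config/mgm/mgm
systemctl restart eos5-mgm@mgm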

Joining a node to a QDB raft cluster

The procedure to add the node foo.bar:7777 to a QDB cluster is straightforward. The QDB_PATH directory must not exist and the QDB service must be down when running this command on the new/unconfigured node:

eos daemon config qdb qdb new observer

To get the node to join as a member you run three commands. On the new node:

1 [ @newnode ] : systemctl start eos5-qdb@qdb

=> eos daemon config qdb qdb info will not yet show the new node as an observer. The QDB logfile /var/log/eos/qdb/xrdlog.qdb will say that this new QDB node is still in limbo state and needs to be added!

On the elected leader node:

2 [ @leader  ] : eos daemon config qdb qdb add foo.bar:7777

=> eos daemon config qdb qdb info will show the new node as replicating and LAGGING until the synchronization is complete, when its status becomes UP-TO-DATE. The new node is not yet a member of the cluster quorum.

On the elected leader node:

3 [ @leader  ] : eos daemon config qdb qdb promote foo.bar:7777

=> eos daemon config qdb qdb info will show the new node as a member of the cluster under NODES.

Replacing/Removing a node in a 3 node QDB setup

To replace or remove the node foo.bar:7777 one needs only two to three steps.

Shut down QDB on that node:

1 [ @drainnode ] : systemctl stop eos5-qdb@qdb

Remove that node on the leader from the membership:

2 [ @leader    ] : eos daemon config qdb qdb remove foo.bar:7777

Optionally delete QDB database files on foo.bar:7777 (don’t run this on the LEADER !!!!):

3 [ @drainnode ] : rm -rf /var/lib/qdb/

Now run the join procedure from the previous section on the node which should replace the decommissioned member!

Back up your QDB database

eos daemon config qdb qdb backup