NFS

Network File System (NFS) is a networking protocol for distributed file sharing. A file system defines how data, in the form of files, is stored on and retrieved from storage devices such as hard disk drives, solid-state drives, and tape drives. NFS extends this idea across the network: it defines how files are stored on and retrieved from storage devices located on remote machines.
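
For context, mounting an NFS export on a Linux client typically looks like the following. This is a generic illustration of the protocol, not an Omnia step; the server IP and paths are placeholders:

mkdir -p /mnt/nfs_client
mount -t nfs 198.51.100.10:/mnt/share /mnt/nfs_client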

Note

NFS is a mandatory feature for all clusters set up by Omnia.

Prerequisites

  • NFS is set up on Omnia clusters based on the inputs provided in input/storage_config.yml.

    Parameter: nfs_client_params (JSON list, required)

    Details:

    • This JSON list contains all parameters required to set up NFS.

    • For a bolt-on setup where there is a pre-existing NFS server, set nfs_server to false.

    • When nfs_server is set to true, an NFS share is created on a server IP in the cluster for access by all other cluster nodes.

    • Ensure that the value of share_path in input/omnia_config.yml matches at least one of the client_share_path values in the JSON list provided (see the omnia_config.yml sketch after the samples below).

    • For more information on the different kinds of configuration available, see the flowchart below.

    [Flowchart: NFS configuration options (nfs_flowchart.png)]
    • The fields listed in nfs_client_params are:

      • server_ip: IP of the intended NFS server. To set up an NFS server on the control plane, use the value localhost. For any other server, provide its IP address.

      • server_share_path: Folder exported by the NFS server.

      • client_share_path: Target directory for the NFS mount on the client. If left empty, the corresponding server_share_path value is used as the client_share_path.

      • nfs_server: Indicates whether an external NFS server is available (false) or an NFS server will need to be created (true).

      • slurm_share: Indicates that the target cluster uses Slurm.

      • k8s_share: Indicates that the target cluster uses Kubernetes.
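
    Taken together, a single annotated entry might look like the following sketch (values are illustrative only):

    - { server_ip: localhost,                        # localhost = NFS server on the control plane
        server_share_path: "/mnt/share",             # folder exported by the server
        client_share_path: "/mnt/share",             # mount point on the clients; defaults to server_share_path if empty
        client_mount_options: "nosuid,rw,sync,hard", # omit to use the Omnia defaults
        nfs_server: true,                            # true = Omnia creates the export, false = pre-existing server
        slurm_share: true,                           # this share is used by Slurm
        k8s_share: false }                           # this share is not used by Kubernetes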

    Note

    To install benchmarking software such as UCX or OpenMPI, at least one of slurm_share or k8s_share must be set to true. If both are set to true, slurm_share takes precedence.

    To configure all cluster nodes to access a single external NFS server export, use the below sample:

    - { server_ip: 10.5.0.101, server_share_path: "/mnt/share", client_share_path: "/home", client_mount_options: "nosuid,rw,sync,hard", nfs_server: false, slurm_share: true, k8s_share: true }
    

    To configure the cluster nodes to access a new NFS server on the control plane as well as an external NFS server, use the below example:

    - { server_ip: localhost, server_share_path: "/mnt/share1", client_share_path: "/home", client_mount_options: "nosuid,rw,sync,hard", nfs_server: true, slurm_share: true, k8s_share: true }
    - { server_ip: 198.168.0.1, server_share_path: "/mnt/share2", client_share_path: "/mnt/mount2", client_mount_options: "nosuid,rw,sync,hard", nfs_server: false, slurm_share: true, k8s_share: true }
    

    To configure the cluster nodes to access new NFS server exports on the cluster nodes, use the below sample:

    - { server_ip: 198.168.0.1, server_share_path: "/mnt/share1", client_share_path: "/mnt/mount1", client_mount_options: "nosuid,rw,sync,hard", nfs_server: true, slurm_share: true, k8s_share: true }
    - { server_ip: 198.168.0.2, server_share_path: "/mnt/share2", client_share_path: "/mnt/mount2", client_mount_options: "nosuid,rw,sync,hard", nfs_server: true, slurm_share: true, k8s_share: true }
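
    As noted above, the share_path value in input/omnia_config.yml must match one of the client_share_path values. A minimal sketch of the relevant omnia_config.yml lines, with values matching the first two samples (illustrative only):

    share_path: "/home"                    # matches the client_share_path "/home" used above
    slurm_installation_type: "nfs_share"   # required when the intended cluster runs Slurm (see below)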
    
  • Ensure that an NFS local repository is created by including {"name": "nfs"} in input/software_config.json. For more information, refer to the Omnia documentation on local repositories.

  • If the intended cluster will run Slurm, set the value of slurm_installation_type in input/omnia_config.yml to nfs_share.

  • If an external NFS share is used, ensure that /etc/exports on the NFS server is populated with the same paths listed as server_share_path in the nfs_client_params in input/storage_config.yml (a sample entry is shown after this list).

  • Omnia supports all NFS mount options. Without user input, the default mount options are nosuid,rw,sync,hard,intr.
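
As an illustration of the external NFS share prerequisite above, an /etc/exports entry on the external server from the second sample (198.168.0.1, exporting /mnt/share2) might look like the following sketch. The export options and the wildcard client list are assumptions; adjust them to your site's policy:

# /etc/exports on the external NFS server; the path matches server_share_path
/mnt/share2 *(rw,sync,no_root_squash)

After editing /etc/exports, run exportfs -r on the server to re-export the shares.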

Running the playbook

Run the storage.yml playbook:

cd storage
ansible-playbook storage.yml -i inventory

Use the cluster inventory file for the above playbook.

Post configuration, enable the following services using firewall-cmd --permanent --add-service=<service name>, and then reload the firewall using firewall-cmd --reload. The full sequence is shown after this list.

  • nfs

  • rpc-bind

  • mountd
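
Combining the command template with the three service names, the complete sequence is:

firewall-cmd --permanent --add-service=nfs
firewall-cmd --permanent --add-service=rpc-bind
firewall-cmd --permanent --add-service=mountd
firewall-cmd --reload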

Caution

  • After an NFS client is configured, if the NFS server is rebooted, the client may not be able to reach the server. In that case, restart the NFS services on the server using the below commands:

    systemctl disable nfs-server
    systemctl enable nfs-server
    systemctl restart nfs-server
    
  • When nfs_server is false, enable the following services after configuration using firewall-cmd --permanent --add-service=<service name>, and then reload the firewall using firewall-cmd --reload (the same sequence shown above).

    • nfs

    • rpc-bind

    • mountd

If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.