Quick Installation Guide
Choose a server outside your intended cluster to function as your control plane.
Ensure that the control plane server meets the following space requirements:
- For all available software packages that Omnia supports: 50GB
- For the complete set of software images (in /var): 500GB
- For storing offline repositories (the file path should be specified in repo_store_path in input/local_repo_config.yml): 50GB
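The space requirements above can be checked before starting. The following is a minimal sketch, not part of Omnia: the helper names (avail_gb, has_space, check_path) and the REPO_STORE_PATH environment variable are assumptions made for illustration, and sizes are reported in units of 1024^3 bytes. It only reads filesystem statistics and changes nothing.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helpers, not shipped with Omnia): compare the free
# space on a path's filesystem against a required minimum.

# Print the free space, in whole GB (1024^3 bytes), of the filesystem holding $1.
avail_gb() {
    # -P gives one line per filesystem; -k reports 1024-byte blocks; column 4 is "Available".
    df -Pk "$1" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Succeed if the required size ($1) does not exceed the available size ($2).
has_space() {
    [ "$2" -ge "$1" ]
}

# Report whether path $1 has at least $2 GB free.
check_path() {
    local path="$1" need="$2" have
    have="$(avail_gb "$path")"
    if has_space "$need" "$have"; then
        echo "OK: $path has ${have}GB free (need ${need}GB)"
    else
        echo "LOW: $path has ${have}GB free (need ${need}GB)"
    fi
}

# Software images are stored under /var.
check_path /var 500

# Assumption: REPO_STORE_PATH holds the same value as repo_store_path
# in input/local_repo_config.yml. Skipped when unset or missing.
if [ -n "${REPO_STORE_PATH:-}" ] && [ -d "$REPO_STORE_PATH" ]; then
    check_path "$REPO_STORE_PATH" 50
fi
```

If several of these locations share one filesystem, the requirements add up, so compare the reported free space against their sum.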
The control plane must have internet access and Git installed. It must also run a full-featured operating system.
Note
Omnia can be run on control planes running RHEL, Rocky Linux, and Ubuntu. For a complete list of versions supported, check out the Support Matrix.
To install Git on RHEL and Rocky Linux installations, use the following command:
dnf install git -y
To install Git on Ubuntu installations, use the following command:
apt install git -y
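The choice between the two Git commands above can be sketched as a small check. This is an illustration under assumptions: the function names are hypothetical, and the distribution is identified through the ID field of /etc/os-release (rhel, rocky, ubuntu). The script only prints the command to run; it installs nothing.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helpers): pick the Git install command from this guide
# that matches the local distribution.

# Map an os-release ID to the install command quoted in this guide.
git_install_cmd() {
    case "$1" in
        rhel|rocky) echo "dnf install git -y" ;;
        ubuntu)     echo "apt install git -y" ;;
        *)          return 1 ;;   # distribution not covered here
    esac
}

# Print the ID field from /etc/os-release, if the file is readable.
detect_os_id() {
    [ -r /etc/os-release ] || return 1
    # Source in a subshell so the file's variables do not leak into this script.
    ( . /etc/os-release && echo "${ID:-}" )
}

if command -v git >/dev/null 2>&1; then
    echo "git already installed: $(git --version)"
elif os_id="$(detect_os_id)" && cmd="$(git_install_cmd "$os_id")"; then
    echo "git not found; run as root: $cmd"
else
    echo "git not found; distribution not covered by this guide"
fi
```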
Note
Optionally, if the control plane has an InfiniBand NIC installed on RHEL or Rocky Linux, run the following command:
yum groupinstall "Infiniband Support" -y
Clone the Omnia repository from GitHub onto the control plane, using the following command:
git clone https://github.com/dell/omnia.git
Once the cloning process is complete, change directory to omnia and run the prereq.sh script to verify that the system is ready for Omnia deployment, using the following commands:
cd omnia
./prereq.sh
Note
The permissions on the Omnia directory are set to 0755 by default. Do not change these values.
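To confirm that the permissions have not been altered, the directory mode can be read back. The sketch below assumes that the note refers to the top-level omnia directory created by the clone, that the script is run from the directory containing that clone, and that GNU stat is available (as on RHEL, Rocky Linux, and Ubuntu). The helper names and the OMNIA_DIR variable are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helpers): report whether a directory still has mode 755.

# Print the octal permission bits of a path (GNU stat syntax).
dir_mode() {
    stat -c '%a' "$1"
}

# Succeed only if the path's mode is exactly 755.
has_default_mode() {
    [ "$(dir_mode "$1")" = "755" ]
}

# Assumption: the clone lives in ./omnia unless OMNIA_DIR says otherwise.
OMNIA_DIR="${OMNIA_DIR:-omnia}"
if [ -d "$OMNIA_DIR" ]; then
    if has_default_mode "$OMNIA_DIR"; then
        echo "$OMNIA_DIR: mode 755, unchanged"
    else
        echo "$OMNIA_DIR: mode $(dir_mode "$OMNIA_DIR"), expected 755"
    fi
fi
```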
- Running prereq.sh
- Local repositories for the cluster
- Installing the provision tool
- Creating node inventory
- Configuring the cluster
- Input parameters for the cluster
- Before you build clusters
- Building clusters
- Install Kubernetes
- Kubernetes plugin for RoCE NIC
- Install Slurm
- Configuring UCX and OpenMPI on the cluster
- Centralized authentication on the cluster
- Granting Kubernetes access
- BeeGFS bolt on
- NFS
- Install the ROCm platform for AMD GPUs
- Installing AI tools
- Adding new nodes
- Re-provisioning the cluster
- Configuring switches
- Configuring PowerVault
- Running HPC benchmarks on Omnia clusters
- Download custom packages/images to the cluster
- Remove Slurm/K8s configuration from a node
- Soft reset the cluster
- Delete provisioned node
- Uninstalling the provision tool
If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.