Alternate method to install the Intel Gaudi Software Stack and Driver
The accelerator role allows users to set up the Intel Gaudi Software Stack and Driver. These tools let users make full use of the Intel Gaudi accelerators installed in the cluster.
Prerequisites
The Intel Gaudi local repositories must be configured using the local_repo.yml playbook.
The input/software_config.json file must contain a valid intelgaudi version. See input parameters for more information.
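As a quick pre-flight check for the second prerequisite, the following minimal sketch looks for an intelgaudi entry and returns its version. The JSON layout it assumes (a top-level `softwares` list of objects with `name` and `version` keys) is an assumption and is not confirmed by this page; the function name `find_intelgaudi_version` and the `X.Y.Z` version string are hypothetical placeholders. Consult the input parameters documentation for the actual schema.

```python
import json


def find_intelgaudi_version(config_text):
    """Return the version of the intelgaudi entry, or None if it is absent."""
    config = json.loads(config_text)
    # ASSUMPTION: entries live in a top-level "softwares" list of
    # {"name": ..., "version": ...} objects. Adjust to the real schema.
    for entry in config.get("softwares", []):
        if entry.get("name") == "intelgaudi":
            return entry.get("version")
    return None


# Hypothetical sample; "X.Y.Z" is a placeholder, not a real release number.
sample = '{"softwares": [{"name": "intelgaudi", "version": "X.Y.Z"}]}'
version = find_intelgaudi_version(sample)
print("intelgaudi version:", version)
```

In practice the text would be read from input/software_config.json, and a result of `None` would indicate that the prerequisite is not met.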
Note
The Intel Gaudi platform is supported only on Ubuntu 22.04 or 24.04 clusters containing Intel Gaudi accelerators.
Playbook configurations
The following configurations take place while running the accelerator.yml playbook:
Servers with Intel Gaudi accelerators are identified and the latest drivers and software stack are downloaded and installed.
Servers with no accelerator are skipped.
Note
If the input/software_config.json file lists both intelgaudi and bcm_roce, only the Intel Gaudi software and drivers are installed. The BCM RoCE drivers are not installed on the nodes.
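The per-node behavior described above can be modeled in a few lines. This is an illustrative sketch only, not the playbook's actual implementation: the function `plan_node` and its parameters are hypothetical, and treating a missing intelgaudi entry as an error is an assumption based on the prerequisites.

```python
def plan_node(has_gaudi, requested):
    """Return the software this sketch would install on a single node."""
    if not has_gaudi:
        # Servers with no accelerator are skipped.
        return []
    if "intelgaudi" not in requested:
        # ASSUMPTION: the prerequisites require an intelgaudi entry, so this
        # sketch treats its absence as an error.
        raise ValueError("intelgaudi is missing from the requested software")
    # Even if bcm_roce is also requested, only Intel Gaudi is installed.
    return ["intelgaudi"]


print(plan_node(True, ["intelgaudi", "bcm_roce"]))
print(plan_node(False, ["intelgaudi", "bcm_roce"]))
```

The first call models a Gaudi node where both entries are requested; the second models a node without an accelerator.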
Executing the playbook
To install all the latest drivers and toolkits, run:
cd accelerator
ansible-playbook accelerator.yml -i inventory
Note
While the accelerator.yml playbook executes for Intel Gaudi nodes, a cron job runs that brings up the Intel Gaudi scale-out network interfaces.
If a node with an Intel Gaudi accelerator has internet access during provisioning, the user must install the Gaudi driver using the accelerator.yml playbook.
If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.