Run Cerebras Model Zoo on a GPU#

You can also run models from the Cerebras Model Zoo on GPUs. However, specific packages must be installed to run the model code on a GPU. Make sure to install these packages in a virtual environment (virtualenv) or a conda environment.

CUDA Requirements#

The CUDA libraries, including the CUDA toolkit and the cuDNN library, must be installed on the system to run a model on a GPU.

To install these packages, follow the instructions on the CUDA website, and also install the cuDNN library.
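
If you are not sure whether these components are already present, a quick check such as the following can help. This is a minimal sketch; the exact paths depend on how CUDA was installed on your system.

$ # Check that the NVIDIA driver is installed and can see the GPUs.
$ nvidia-smi

$ # Check the installed CUDA toolkit version (nvcc ships with the toolkit).
$ nvcc --version

$ # Check that the cuDNN headers and libraries are present (paths may differ on your system).
$ ls /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*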

PyTorch#

Currently, the Cerebras Model Zoo supports only PyTorch version 1.11, which requires CUDA version 10.1 or 10.2.

Once all the CUDA requirements are installed, create a virtualenv on your system with Python 3.8 or newer, activate it, and install PyTorch using the following commands:

$ virtualenv -p python3.8 /path/to/venv_gpu
$ source /path/to/venv_gpu/bin/activate
(venv_gpu) $ pip install -r requirements_gpu_pt.txt

The requirements_gpu_pt.txt file is located in the Cerebras Model Zoo.
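
Alternatively, if you only need the GPU build of PyTorch itself rather than all the pinned GPU packages, you can install it directly from the official PyTorch wheel index. This is a sketch only, not part of the Model Zoo instructions; the wheel tag (`cu102` here) must match your installed CUDA version.

(venv_gpu) $ # Sketch only: install the PyTorch 1.11 wheel built against CUDA 10.2.
(venv_gpu) $ pip install torch==1.11.0 --extra-index-url https://download.pytorch.org/whl/cu102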

To verify that PyTorch can access the GPU, start a Python session and run the following commands:

>>> import torch
>>> torch.__version__
'1.11.0'
>>> torch.cuda.is_available()      # should return True
True
>>> torch.cuda.device_count()      # number of GPUs present
1
>>> torch.cuda.get_device_name(0)  # should return the name of your GPU
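
As a further check, you can confirm which CUDA version the installed wheel was built against and force a small computation onto the GPU. This is a minimal sketch continuing the same Python session; the exact version string depends on the wheel you installed.

>>> torch.version.cuda                         # CUDA version the wheel was built against, e.g. '10.2'
>>> x = torch.rand(1024, 1024, device="cuda")  # allocate a tensor directly on the GPU
>>> (x @ x).sum().item()                       # run a matrix multiply on the GPU and read the result back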

CUDA Troubleshooting#

If the verification steps above do not report a GPU, check that all the CUDA libraries are loading correctly; the error output usually indicates which CUDA libraries failed to load.

Note that some methods of installing CUDA 10.1/10.2 install the cuBLAS library from CUDA 10.2 while the rest of the CUDA libraries come from version 10.1. In that case, add the lib64 directories of both installations to the LD_LIBRARY_PATH variable:

$ export CUDA_VERSION=cuda-10.1

$ # Add /usr/local/${CUDA_VERSION}/lib64 to your LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=/usr/local/${CUDA_VERSION}/lib64:$LD_LIBRARY_PATH

$ # Add /usr/local/${CUDA_VERSION}/extras/CUPTI/lib64 to your LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=/usr/local/${CUDA_VERSION}/extras/CUPTI/lib64:$LD_LIBRARY_PATH

$ # Add /usr/local/cuda-10.2/lib64/ to your LD_LIBRARY_PATH.
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.2/lib64/
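
After updating LD_LIBRARY_PATH, you can verify that the libraries now resolve. This is a minimal sketch, assuming the virtualenv from the installation step is still active; any library that still fails to load is named in the import error.

$ # Confirm that the CUDA directories are on the library search path.
$ echo $LD_LIBRARY_PATH

$ # Re-run the PyTorch GPU check; a remaining missing library shows up in the error message.
(venv_gpu) $ python -c "import torch; print(torch.cuda.is_available())"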