We use Slurm for job submissions. Slurm is widely used on compute clusters, so extensive documentation is available online. For example, if you would like to run parallel jobs on multiple nodes, see Multi-parallel-jobs for details. This section covers the most common Slurm usage for job submission, along with some details specific to the GAIVI cluster. Please contact the admins for any assistance or questions.
First, you need to create a bash script like this:
$ cat sample_script.sh
#!/bin/bash -l
#SBATCH -o std_out
#SBATCH -e std_err

srun python some_file.py
srun sh some_file.sh
Then run this to submit the job:
$ sbatch sample_script.sh
The lines that start with #SBATCH are options for sbatch. Here, the -o option specifies that the script's standard output be written to the std_out file, and the -e option specifies that the script's standard error be written to the std_err file. Other common sbatch options are:
-D [path]: change the working directory to [path].
-w [node_name]: request a specific compute node.
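For example, here is a minimal sketch using both options (the directory path and the node name GPU6 are placeholders; adjust them to your own job):
$ cat sample_script.sh
#!/bin/bash -l
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH -D /home/<NetID>/my_project   ### run from this directory (placeholder path)
#SBATCH -w GPU6                       ### request the node named GPU6 (placeholder)

srun python some_file.py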
The computing processes should start with srun, followed by your command-line program, such as python, bash, java, or a compiled binary. The only commands that you don't have to prefix with srun are environment-related commands, such as conda activate some_environment. Each srun invocation is considered by Slurm to be a "job step" and can make use of any subset of the resources allocated to the job. Without passing any flags to srun, a step consumes the entire allocation of the job while it runs.
For a complete list of #SBATCH options, please visit here.
The script above will allocate the default amount of resources (1 CPU, 16GB RAM) because no resource allocation option was provided. Please read the Resource Allocation section.
In the previous example, the two job steps each launched one instance of their associated command. However, Slurm can also easily be used to run multiple instances of a command in parallel. Here is where "tasks" enter the picture. Many Slurm resource requests can be phrased as per-task, and the allocation is then multiplied by however many tasks the job requests. Each job step will then automatically run multiple instances of the provided command in parallel, one for each task. For example:
$ cat sample_script.sh
#!/bin/bash -l
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=1

srun sh some_file.sh
This script will run two instances of "some_file.sh" in parallel. The ntasks and cpus-per-task options are important here: they tell Slurm to allocate one CPU for each of the two tasks (a total of two CPUs).
In addition to a single step running two tasks in parallel, two steps can be run in parallel as well:
$ cat sample_script.sh
#!/bin/bash
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=1

srun --ntasks=1 --exact python some_file.py &
srun --ntasks=1 --exact sh some_file.sh &
wait
Here the two srun instances are both run in the background, which allows them to start at the same time. Like the previous example, the job overall requests two tasks. However, each job step only requests one task (srun --ntasks=1 ...). Since both steps' allocations can be satisfied at the same time, Slurm allows them both to start running in parallel. Because this job does not request resources on more than one node, the --exact option is needed: without it, an srun step will "round up" to consuming whole nodes from whatever is allocated to the job.
Alternatively, the allocation could be requested in terms of nodes (with, implicitly, one task per node):
$ cat sample_script.sh
#!/bin/bash
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH --nodes=2
#SBATCH --cpus-per-task=1

srun --nodes=1 python some_file.py &
srun --nodes=1 sh some_file.sh &
wait
In this case --exact is not needed, as each step will execute on a separate node in the cluster, so the steps are free to (and will) consume each node's whole allocation to the job.
It is important to specify the amount of CPU, GPU, and RAM required when submitting your job, as Slurm will enforce that only these resources are available to you. Additionally, Slurm will block other jobs from accessing the resources that have been allocated to your job. To specify resource allocation, please add the following options to your sbatch script (a combined example is shown after this list):
--cpus-per-task=[ncpus]: allocate [ncpus] processors per task.
--mem=[size][units]: allocate the given amount of memory per node. For example, --mem=100GB would allocate 100GB.
--gpus=[number]: allocate [number] GPUs for the job. For example, --gpus=8 would allocate 8 GPUs (and thus nodes with fewer than 8 GPUs will not qualify).
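Putting these together, here is a minimal sketch of a job script that requests 4 CPUs, 32GB of RAM, and 1 GPU (the script name and resource amounts are placeholders; adjust them to your workload):
$ cat sample_script.sh
#!/bin/bash -l
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH --cpus-per-task=4   ### 4 CPUs for the single task
#SBATCH --mem=32GB          ### 32GB of RAM on the node
#SBATCH --gpus=1            ### 1 GPU for the job

srun python some_file.py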
Slurm has a notion of "Partitions", which determine (among other things) which GAIVI nodes your job can be submitted to. Each user has access to a specific set of partitions on the cluster. sinfo will show you which partitions you have access to, along with some basic information about those partitions, e.g.
$ sinfo
PARTITION AVAIL TIMELIMIT  NODES STATE NODELIST
ScoreLab  up    infinite       1 idle  GPU6
general*  up    7-00:00:00     2 idle  GPU[6,8]
“TIMELIMIT” refers to the maximum amount of time a job submitted to that partition can execute. “NODES”, “STATE”, and “NODELIST” tell you about the specific GAIVI nodes usable from that partition. A partition will appear on multiple lines if its nodes are in different STATEs, with one line per unique STATE.
Priority levels are also determined by partitions. A job in a higher-priority partition will jump up in the queue and can kill ("preempt") a running job with lower priority so that the higher-priority job runs immediately. When a job is preempted, it is put back in the job queue to be retried later. We recommend writing your jobs to checkpoint regularly and to be ready for restarts, unless you are certain your job is running with the highest priority available on its node.
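As an illustration, here is a minimal sketch of a restart-friendly job script. It assumes your program saves its own checkpoints and accepts a hypothetical --resume flag; --requeue asks Slurm to put the job back in the queue after preemption:
$ cat checkpoint_job.sh
#!/bin/bash -l
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH --requeue            ### allow the job to be requeued after preemption

# Resume from a previous checkpoint if one exists (train.py and its
# --resume flag are placeholders for your own program's interface).
if [ -f checkpoint.pt ]; then
    srun python train.py --resume checkpoint.pt
else
    srun python train.py
fi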
We have five main partitions and a handful of group/lab-specific partitions:
By default, jobs are submitted to the general partition. To submit to an alternate partition, include the "-p" option in your submission. For example, in an SBATCH script:
#SBATCH -p ScoreLab
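The same option can also be passed directly to sbatch on the command line when submitting, for example:
$ sbatch -p ScoreLab sample_script.sh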
From highest to lowest priority:
The following SBATCH options are recommended for a clean work environment and for avoiding errors:
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH -p Quick
#SBATCH --cpus-per-task=32    ### 32 CPUs per task
#SBATCH --mem=100GB           ### 100GB of RAM per node
#SBATCH --gpus-per-task=8     ### 8 GPUs per task
To view pending and running jobs, run:
$ squeue
To view the status of a specific job, run:
$ squeue -j [jobID]
In the output table, the ST column shows the status of the job.
The meaning of each status code is listed here.
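Some other handy variations (these use standard squeue options; the job ID 4384 is a placeholder):
$ squeue -u $USER        ### list only your own jobs
$ squeue -j 4384 -l      ### long-format status for a single job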
To cancel a job submitted through sbatch, do:
$ scancel [jobID]
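You can also cancel all of your own jobs at once by filtering on your username:
$ scancel -u $USER       ### cancel every job you have submitted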
First, open an interactive bash shell on a running job:
$ srun --pty --jobid <jobID> /bin/bash
This will open a bash shell on the compute node. Then run commands such as htop to check CPU/memory consumption, or nvidia-smi to check GPU utilization.
For example,
$ srun --pty --jobid 4384 --interactive /bin/bash
$ htop
$ nvidia-smi
$ exit
Please remember to run the exit command to log out of the compute node.
Anaconda automatically manages Python machine learning libraries. For example, running TensorFlow jobs is simple; your sbatch script file should look like this:
$ cat sample_script.sh
#!/bin/bash -l
conda activate tensorflow_environment
python a_python_script.py
Anaconda and its base environment are located at /apps/anaconda3. If you wish to use specific or older versions of Python packages, you must create a new Anaconda environment, which will isolate your desired package versions just for you. To learn how to use Anaconda environments, please check this manual.
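As a quick illustration, here is a minimal sketch of creating an isolated environment with pinned versions (the environment name, Python version, and packages are placeholders):
$ conda create -n my_env python=3.10 numpy=1.24   ### new environment with pinned versions
$ conda activate my_env
$ python -c "import numpy; print(numpy.__version__)"   ### confirm the pinned version is used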
Even though the head node does not have GPUs, it has CUDA installed. You can set your environment variables and compile your code, and after the binary file is generated, you can write your job submission script.
Note that you can call cudaSetDevice(<deviceID>) in your source code to specify which GPU card will be used by your job. If you don't include this call, your job will, by default, run on the first GPU card (device 0).
If nobody sets devices, all jobs will flood onto device 0 of each GPU node, which may leave many resources unused, since we have at least 4 GPUs on each node.
The following commands may be used for compiling your GPU code:
$ export PATH=$PATH:/share/apps/cuda/cuda-10.2/bin
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/share/apps/cuda/cuda-10.2/lib64/
$ nvcc -o output src.cu
You may submit jobs that run in containerized environments. We have installed Singularity for users who would like to work with containers.
STEP 1: First, you need to create a container image definition on your local machine (outside of GAIVI). Here is an example Dockerfile:
$ cat Dockerfile
# select an image from Docker Hub as base
FROM python:3

# install some packages on the image
RUN pip install numpy
RUN pip install tensorflow-gpu
More examples of building containers can be found online; here is an example with Tensorflow. You can also download an image from Docker Hub if you believe that image has everything you need:
$ singularity pull docker://tensorflow/tensorflow:latest-gpu
If you download an image from Docker Hub, you can skip the build step (step 2 below) and go to step 3 for running code on the container.
STEP 2: Next, you need to build a container image from the Dockerfile. We have provided a build service to simplify this process. You can access the build service from GAIVI:
$ container-builder-client Dockerfile myImage.sif --overwrite
Alternatively, you can build an image on your local computer then transfer it to GAIVI:
$ sudo docker build .
$ scp
STEP 3: Execute the program(s) in the container
$ srun singularity exec --nv <image name> <command>

# For example,
$ srun singularity exec --nv tensorflow_latest-gpu.sif python code.py arg1 arg2
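For a batch job, the singularity command simply goes inside your sbatch script. Here is a minimal sketch, assuming the image tensorflow_latest-gpu.sif and code.py sit in your working directory:
$ cat container_job.sh
#!/bin/bash -l
#SBATCH -o std_out
#SBATCH -e std_err
#SBATCH --gpus=1            ### request 1 GPU for the containerized job

# --nv exposes the allocated GPU(s) inside the container
srun singularity exec --nv tensorflow_latest-gpu.sif python code.py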
Jupyterhub is the most convenient way to create a Jupyter Notebook job. Simply go to Jupyter-Hub-Gaivi in your web browser, log in with your NetID, request resources for your notebook, and launch it.
After launching a notebook, the notebook’s log file will be created in your working directory, named “jupyterhub_slurmspawner_{jobid}.log”.
If you need to customize Slurm beyond what Jupyterhub allows, you can create a notebook job manually as follows. Warning: this method is not fully secure: the connection between the login node and your computer/laptop is secure, but the connection between the compute node and the login node is not. Your Jupyterlab connection can be eavesdropped on by another GAIVI user, though not by an outside attacker.
First, you need to find an idle compute node by running sinfo. For example, suppose you find that GPU14 is idle and would like to run Jupyter Lab on it. Please adjust the following command to your situation:
$ srun --partition Quick -w GPU14 --gpus=2 --mem=8G --cpus-per-task=8 jupyter lab --ip='0.0.0.0' --port=8888 --NotebookApp.token='some-token' --NotebookApp.password='some-password'
This will create a job on GPU14 that runs Jupyter Lab and can be connected to from the login node.
Next, you will need to forward the Jupyter Lab connection to your local computer/laptop through an SSH tunnel. Open a new terminal on your local machine and run:
$ ssh -L8888:GPU14:8888 <YourNetID>@gaivi.cse.usf.edu
Then open a browser on your local machine and navigate to localhost:8888. Enter the token you provided, e.g. 'some-token'.
The connection between the head node and the compute node is not encrypted (http and not https). Other users may be able to snoop this connection to obtain your token/password and then interfere with your Jupyter Lab session. The connection between the head node and your machine is secured.
The default kernel is defined system-wide for all users. We will not be modifying this kernel for your needs. Instead, please register a kernel of your own from a custom conda environment and install packages and libraries on that conda environment. The commands are as follows:
$ conda create -n newEnv
$ conda activate newEnv
$ conda install -c conda-forge jupyterlab
$ python -m ipykernel install --user --name {kernelName} --display-name {kernelName}
TBA