Working with GPUs
Practical guide to GPU usage on DS01.
Requesting GPUs
NB: GPU quotas vary by group — run check-limits to see yours. Researchers and faculty have larger GPU-equivalent quotas than the student default.
# Request 1 GPU (default)
container-deploy my-project
# Request multiple GPU-slots (during container-create)
container-create my-project --num-gpus 2
# Prefer a full GPU over a MIG slice (only meaningful when MIG is enabled)
container-create my-project --prefer-full
Note: GPU allocation options (--num-gpus, --prefer-full) are set during container-create. The container-deploy orchestrator uses your default allocation settings.
Monitoring GPU Usage
Inside the container:
# Basic info
nvidia-smi
# Continuous monitoring
watch -n 1 nvidia-smi
# Memory usage
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
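To log memory use from a script rather than reading it by eye, the same query can be run from Python and parsed. The following is a minimal sketch, not a DS01 tool: the function names are hypothetical, and it assumes nvidia-smi is on the PATH inside the container. It adds the standard noheader and nounits format options so the values parse as plain numbers, and it skips rows that do not contain numbers (for example "[N/A]").

```python
import csv
import io
import shutil
import subprocess

# Same query as above, with header and units removed for easier parsing
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Return a list of (used_mib, total_mib, fraction_used), one per GPU row."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if len(record) != 2:
            continue  # blank or malformed line
        try:
            used, total = (float(field.strip()) for field in record)
        except ValueError:
            continue  # non-numeric value such as "[N/A]"
        rows.append((used, total, used / total if total else 0.0))
    return rows


def gpu_memory_usage():
    """Run nvidia-smi and parse its output; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []
    return parse_memory_csv(result.stdout)


if __name__ == "__main__":
    for index, (used, total, fraction) in enumerate(gpu_memory_usage()):
        print(f"GPU {index}: {used:.0f}/{total:.0f} MiB ({fraction:.0%})")
```

Calling gpu_memory_usage() periodically from a training loop gives a rough record of how close each visible GPU is to its memory limit.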
Using GPUs in Code
PyTorch:
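import torch
# Fall back to the CPU when no GPU is visible in the container
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
# model and data stand for your own nn.Module and input tensor; both must be on the same device
model = model.to(device)
data = data.to(device)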
TensorFlow:
import tensorflow as tf
# List GPUs
print(tf.config.list_physical_devices('GPU'))
# Use GPU
with tf.device('/GPU:0'):
    # Your code goes here; this placeholder op is placed on the first GPU
    x = tf.constant([1.0, 2.0]) * 2.0
Optimising GPU Usage
Use mixed precision:
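from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()
with autocast():
    output = model(input)
    loss = loss_fn(output, target)  # loss_fn, target, optimizer: placeholders for your own objects
scaler.scale(loss).backward()  # scale the loss so float16 gradients do not underflow
scaler.step(optimizer)
scaler.update()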
Batch size tuning:
- Start small, then increase until GPU memory is about 80% full
- Monitor with nvidia-smi
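
The tuning loop above can be automated. The following is a minimal sketch under stated assumptions: find_batch_size and its fits argument are hypothetical names, and fits is a callable you supply that runs one training step at the given batch size and returns False once memory use passes your target (or an out-of-memory error occurs). The batch size is doubled until the check fails.

```python
def find_batch_size(fits, start=8, limit=4096):
    """Double the batch size while fits(batch_size) returns True.

    fits is a user-supplied (hypothetical) callable: it should try one
    training step at that batch size and return False when memory use
    exceeds the target. Returns the largest size that fit, or None.
    """
    best = None
    size = start
    while size <= limit and fits(size):
        best = size
        size *= 2
    return best


if __name__ == "__main__":
    # Stand-in check for illustration: pretend anything up to 100 samples fits
    print(find_batch_size(lambda batch_size: batch_size <= 100))
```

With PyTorch, one way to write fits is to run a step and compare torch.cuda.max_memory_allocated() against 80% of torch.cuda.get_device_properties(0).total_memory, catching out-of-memory errors and returning False.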