Common Errors

Note: In the examples below, replace <project-name> with your actual project name. The shell expands $(whoami) to your username automatically.


File Issues

Can't Find My Files

Symptoms:

  • Files missing from container
  • Workspace appears empty

Solutions:

  1. Check both locations:

    # On host
    ls ~/workspace/<project-name>/

    # In container
    docker exec <project-name>._.$(whoami) ls /workspace/
  2. Remember the mapping:

    Host: ~/workspace/<project-name>/
    Container: /workspace/
  3. Verify workspace mount:

    docker inspect <project-name>._.$(whoami) | grep -A 5 "Mounts"
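
The three checks above can be combined into one helper. This is a minimal sketch, not part of the DS01 tooling: the function name check_workspace is hypothetical, and it assumes the container naming and workspace paths shown above.

```shell
# Hypothetical helper (not a DS01 command): compare the host and container
# views of a project's workspace.
# Assumes the <project-name>._.<username> container naming used above.
# Usage: check_workspace <project-name>
check_workspace() {
    project="$1"
    container="${project}._.$(whoami)"
    host_dir="$HOME/workspace/${project}"

    if [ -d "$host_dir" ]; then
        echo "host: $host_dir exists"
    else
        echo "host: $host_dir is missing"
    fi

    # Print each mount as "source -> destination", one per line
    docker inspect \
        --format '{{range .Mounts}}{{println .Source "->" .Destination}}{{end}}' \
        "$container"
}
```

If the host directory exists but no mount ends in /workspace, the container was probably created without the workspace mount.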

Permission Denied on Files

Symptoms:

$ touch /workspace/file.txt
Permission denied

Solutions:

  1. Check ownership:

    ls -ld ~/workspace/<project-name>/
    # Should be owned by you
  2. Fix permissions (on host):

    sudo chown -R $(whoami):$(whoami) ~/workspace/<project-name>/
  3. Check disk space:

    df -h | grep home
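
The ownership check in step 1 can be scripted so the result is explicit rather than read off the ls output. A minimal sketch, assuming GNU stat (standard on Linux); owned_by_me is a hypothetical name, not a DS01 command.

```shell
# Hypothetical helper: report whether a path is owned by the current user.
# Uses GNU stat (-c '%U' prints the owner's user name).
# Usage: owned_by_me ~/workspace/<project-name>
owned_by_me() {
    # Return 2 if the path cannot be inspected at all
    owner="$(stat -c '%U' "$1")" || return 2
    if [ "$owner" = "$(id -un)" ]; then
        echo "ok: $1 is owned by $owner"
    else
        echo "mismatch: $1 is owned by $owner, not $(id -un)"
        return 1
    fi
}
```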

Out of Disk Space

Symptoms:

No space left on device

Solutions:

The best first step is to notify the DSL admin by raising an issue ticket in the ds01-hub repo. Most user permissions are restricted, so you cannot do a full clean of Docker or the disk; you can only remove the files and Docker resources that belong to you.

  1. Check usage:

    # Workspace
    du -sh ~/workspace/*

    # Docker
    docker system df
  2. Clean up:

    # Remove old projects
    rm -rf ~/workspace/old-project/

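    # Clean Docker (both commands ask for confirmation first)
    # image prune removes dangling images only
    docker image prune
    # system prune also removes stopped containers, unused networks, and build cache
    docker system prune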

    # Remove old checkpoints
    find ~/workspace -name "checkpoint-*.pt" -mtime +30 -delete
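
Because find with -delete is irreversible, it can help to list what would be removed first. The sketch below uses the same pattern and age threshold as the command above but only prints the matches with their sizes; list_old_checkpoints is a hypothetical name, and it assumes GNU sort for the -h flag.

```shell
# Hypothetical helper: list checkpoint files older than N days with their
# sizes, largest first, without deleting anything.
# Usage: list_old_checkpoints [directory] [days]
list_old_checkpoints() {
    dir="${1:-$HOME/workspace}"
    days="${2:-30}"
    find "$dir" -name "checkpoint-*.pt" -mtime +"$days" -exec du -h {} + | sort -rh
}
```

Once the list looks right, rerun the original find command with -delete.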

Permission Issues

Docker Permission Denied

Symptoms:

$ docker ps
Permission denied while trying to connect to the Docker daemon socket

Cause: Not in docker group

Solution:

# Check groups
groups | grep docker

If you are not in the docker group, ask the DSL admin to add you, then log out and back in so the new group membership takes effect.
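
Note that grep docker also matches any group whose name merely contains "docker". A minimal sketch of an exact-match check follows; in_group is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the current user belongs to a group
# with exactly the given name.
# Usage: in_group docker && echo "in docker group"
in_group() {
    # id -nG prints the user's group names separated by spaces
    id -nG | tr ' ' '\n' | grep -qx "$1"
}
```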

Commands Not Found

Symptoms:

$ container-deploy my-project
bash: container-deploy: command not found

Solutions:

  1. Check PATH:

    echo $PATH | grep ds01
  2. Use full path:

    /opt/ds01-infra/scripts/user/orchestrators/container-deploy my-project
  3. Fix PATH:

    shell-setup
    source ~/.bashrc
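
To confirm that the fix took effect, you can test for an exact PATH entry instead of a substring match. This is a sketch: path_has_dir is a hypothetical name, and the directory in the usage line is taken from the full path in step 2; whether shell-setup adds exactly that directory is an assumption.

```shell
# Hypothetical helper: succeed if a directory is an exact entry in PATH.
# Usage (directory is an assumption based on step 2):
#   path_has_dir /opt/ds01-infra/scripts/user/orchestrators && echo "on PATH"
path_has_dir() {
    # Wrap PATH in colons so the first and last entries match the same pattern
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```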

Network Issues

Can't Access Jupyter

Symptoms:

  • Jupyter running but can't access in browser

Solutions:

  1. Check Jupyter is running:

    docker exec <project-name>._.$(whoami) ps aux | grep jupyter
  2. Check port:

    docker port <project-name>._.$(whoami)
  3. Set up SSH tunnel:

    # On your laptop
    ssh -L 8888:localhost:8888 <user-id>@ds01

    # Then access: http://localhost:8888
  4. Start Jupyter so it listens on all interfaces (inside the container):

    jupyter lab --ip=0.0.0.0 --port=8888 --no-browser
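
If port 8888 is already in use on your laptop, the tunnel in step 3 can forward a different local port to the same remote port. The sketch below only prints the ssh command so you can inspect it before running it; tunnel_cmd is a hypothetical name, and the host name ds01 is taken from step 3.

```shell
# Hypothetical helper: print the ssh command for a tunnel, letting the local
# port differ from the remote one.
# -N tells ssh not to run a remote command (forwarding only).
# Usage: tunnel_cmd <user-id> [remote_port] [local_port]
tunnel_cmd() {
    user="$1"
    remote_port="${2:-8888}"
    local_port="${3:-$remote_port}"
    echo "ssh -N -L ${local_port}:localhost:${remote_port} ${user}@ds01"
}
```

With a local port of 9999, you would then browse to http://localhost:9999 instead.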

Git Issues

Can't Push to GitHub

Symptoms:

$ git push
Permission denied (publickey)

Solutions:

  1. Check SSH key:

    ls ~/.ssh/
    cat ~/.ssh/id_ed25519.pub
  2. Add key to GitHub:

    • Copy public key
    • GitHub → Settings → SSH and GPG keys → New SSH key
  3. Test connection:

    ssh -T git@github.com
  4. Use HTTPS instead:

    git remote set-url origin https://github.com/user/repo.git
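
Step 1 assumes a key already exists. If ~/.ssh/ has no id_ed25519 file, one can be generated first. A minimal sketch: ensure_ssh_key is a hypothetical name, and the comment label passed to -C is an arbitrary choice, not a DS01 requirement.

```shell
# Hypothetical helper: create an ed25519 key only if none exists, then print
# the public half for pasting into GitHub.
# Usage: ensure_ssh_key [path-to-private-key]
ensure_ssh_key() {
    key="${1:-$HOME/.ssh/id_ed25519}"
    if [ ! -f "$key" ]; then
        mkdir -p "$(dirname "$key")"
        # Prompts interactively for an optional passphrase
        ssh-keygen -t ed25519 -f "$key" -C "$(whoami)@ds01"
    fi
    cat "${key}.pub"
}
```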

Resource Limits

Memory Limit Exceeded

Symptoms:

  • Container killed
  • OOMKilled in logs

Solutions:

  1. Check limits:

    check-limits
  2. Reduce memory usage:

    • Process data in chunks
    • Use data generators
    • Clear variables when done
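
To confirm that memory was the cause, and to watch usage before the limit is hit, the standard docker subcommands below can help. This is a sketch: the function names are hypothetical, and it assumes your permissions allow docker inspect and docker stats on your own container.

```shell
# Hypothetical helpers built on standard docker subcommands.
# Usage: was_oom_killed <project-name>._.$(whoami)
#        memory_usage   <project-name>._.$(whoami)

# Prints "true" if the container was stopped by the out-of-memory killer.
was_oom_killed() {
    docker inspect --format '{{.State.OOMKilled}}' "$1"
}

# Prints a one-off snapshot of memory use in the form "used / limit".
memory_usage() {
    docker stats --no-stream --format '{{.MemUsage}}' "$1"
}
```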

Error Message Reference

Error               | Meaning                                 | Solution
--------------------|-----------------------------------------|-------------------------------
No GPUs available   | All GPUs allocated                      | Wait or retire old containers
OOMKilled           | Out of memory                           | Reduce memory usage
Permission denied   | Not in docker group or file permissions | Check groups, fix permissions
Container not found | Container removed or wrong name         | Recreate or check name
Image not found     | Image doesn't exist                     | Build image first
Network unreachable | Network issue                           | Check network, retry
Quota exceeded      | Hit disk quota                          | Clean up old files

See Also