Container Issues
Solutions for container startup, runtime, and lifecycle problems.
Note: In the examples below, replace <project-name> with your actual project name. The $(whoami) part is expanded by the shell to your username.
Container Won't Start
Symptoms:
$ container-start my-project
Error: Container failed to start
Causes:
- GPU no longer exists (was reallocated)
- Resource limits exceeded
- Container configuration issue
Solutions:
- Check container status:
  docker ps -a | grep <project-name>
  docker logs <project-name>._.$(whoami)
- Recreate container:
  container-remove my-project
  container-create my-project
- Check resource limits:
  check-limits
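When the logs are empty, the container's recorded state often says more than the generic error. The sketch below reads it with docker inspect. The function names are hypothetical helpers, not part of the container-* tooling; the only assumption taken from this page is the <project-name>._.<username> naming pattern.

```shell
# Hypothetical helper: build the full Docker container name for a project,
# following the <project-name>._.<username> pattern used on this page.
container_name() {
  printf '%s._.%s\n' "$1" "$(whoami)"
}

# Hypothetical helper: print the status, exit code and error message that
# Docker recorded for the container. .State.Status, .State.ExitCode and
# .State.Error are standard docker inspect fields.
start_failure_summary() {
  docker inspect \
    --format 'status={{.State.Status}} exit={{.State.ExitCode}} error={{.State.Error}}' \
    "$(container_name "$1")"
}
```

Usage: start_failure_summary my-project. A non-empty error field usually names the resource or device Docker could not attach.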
Container Stopped Unexpectedly
Symptoms:
- Container was running, now shows as stopped
- Processes terminated
Causes:
- Idle timeout reached (30min-2h, varies by user)
- Out of memory (OOM killer)
- Max runtime exceeded
- Code crashed
Solutions:
- Check logs:
  docker logs <project-name>._.$(whoami) | tail -100
- Check for OOM:
  docker inspect <project-name>._.$(whoami) | grep OOMKilled
- Prevent idle timeout:
  touch ~/workspace/<project-name>/.keep-alive
  Contact DSL First: The .keep-alive workaround is available but should be a last resort, as it can disrupt the system for other users. Please open an issue on DS01 Hub first to find a better solution together.
- Restart container:
  container-start my-project
  Or recreate:
  container-retire my-project
  container-deploy my-project
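The OOM flag and the exit code together narrow down most of the causes listed above. The following is a minimal sketch with hypothetical function names. The exit-code meanings (137 for SIGKILL, 143 for SIGTERM) are standard; which one an idle timeout or max-runtime stop produces on this system is not documented here, so treat that mapping as something to confirm.

```shell
# Hypothetical helper: turn the OOMKilled flag and exit code into a short diagnosis.
# Usage: classify_stop <true|false> <exit-code>
classify_stop() {
  oom_killed="$1"
  exit_code="$2"
  if [ "$oom_killed" = "true" ]; then
    echo "out of memory (OOMKilled)"
  elif [ "$exit_code" -eq 137 ]; then
    echo "killed with SIGKILL (exit 137)"      # 128 + signal 9
  elif [ "$exit_code" -eq 143 ]; then
    echo "terminated with SIGTERM (exit 143)"  # 128 + signal 15
  elif [ "$exit_code" -eq 0 ]; then
    echo "exited cleanly"
  else
    echo "process failed with exit code $exit_code"
  fi
}

# Hypothetical helper: read both fields from Docker and classify them.
diagnose_stop() {
  name="$1._.$(whoami)"
  state="$(docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' "$name")" || return 1
  # Split "false 137" into two positional arguments.
  set -- $state
  classify_stop "$1" "$2"
}
```

Usage: diagnose_stop my-project. A SIGTERM or SIGKILL result without the OOM flag suggests the container was stopped from outside rather than crashing on its own.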
Can't Find Container
Symptoms:
$ container-run my-project
Error: Container not found
Causes:
- Container was removed
- Wrong project name
- Container never created
Solutions:
- List containers:
  container-list --all
  docker ps -a --filter "name=._.$(whoami)"
- Recreate if needed:
  container-deploy my-project
- Check project name spelling
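To check spelling, it can help to see the bare project names rather than full container names. This sketch strips the ._.<username> suffix from the docker ps output; project_names is a hypothetical helper, and it assumes every container of yours follows that naming pattern.

```shell
# Hypothetical helper: list the project names of all your containers,
# running or stopped, one per line.
project_names() {
  suffix="._.$(whoami)"
  docker ps -a --filter "name=._.$(whoami)" --format '{{.Names}}' |
    while IFS= read -r full; do
      # Remove the trailing ._.<username> to leave only the project name.
      printf '%s\n' "${full%"$suffix"}"
    done
}
```

Usage: project_names | grep -i my-proj, to find near matches of the name you typed.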
Container Performance
Symptoms:
- Container running slowly
- High latency
Solutions:
- Check resource usage:
  container-stats my-project
- Check if swapping:
  docker exec <project-name>._.$(whoami) free -h
- Check GPU utilisation:
  docker exec <project-name>._.$(whoami) nvidia-smi
- Reduce memory pressure:
  - Close unused processes
  - Use smaller batch sizes
  - Process data in chunks
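The three checks above can be combined into a single snapshot. This is a sketch with hypothetical function names; it uses docker stats, free and nvidia-smi directly, and assumes free and nvidia-smi are installed inside the container.

```shell
# Hypothetical helper: read `free -m` output on stdin and print the swap
# currently in use, in MiB (third column of the "Swap:" row).
swap_used_mib() {
  awk '/^Swap:/ { print $3 }'
}

# Hypothetical helper: one-shot view of CPU, memory, swap and GPU load.
performance_snapshot() {
  name="$1._.$(whoami)"
  # --no-stream prints a single sample instead of refreshing continuously.
  docker stats --no-stream \
    --format 'cpu={{.CPUPerc}} mem={{.MemUsage}} ({{.MemPerc}})' "$name"
  echo "swap used (MiB): $(docker exec "$name" free -m | swap_used_mib)"
  docker exec "$name" nvidia-smi \
    --query-gpu=utilization.gpu,memory.used,memory.total \
    --format=csv,noheader
}
```

Usage: performance_snapshot my-project. Non-zero swap use alongside memory near its limit points to memory pressure; low GPU utilisation with high CPU often points to a data-loading bottleneck.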
Container Won't Stop
Symptoms:
- container-stop hangs
- Container stuck in stopping state
Solutions:
- Force stop:
  docker stop -t 1 <project-name>._.$(whoami)
- Force kill:
  docker kill <project-name>._.$(whoami)
- Remove forcefully:
  container-remove my-project --force
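The first two steps can be scripted as an escalation: try a short graceful stop, and kill only if the container is still running. This is a sketch with hypothetical function names; the one-second grace period mirrors the command above, and the final forced removal is left as a manual step.

```shell
# Hypothetical helper: succeed if Docker reports the container as running.
is_running() {
  [ "$(docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Hypothetical helper: stop gracefully, then escalate to kill if needed.
force_stop() {
  name="$1._.$(whoami)"
  docker stop -t 1 "$name" >/dev/null 2>&1
  if is_running "$name"; then
    docker kill "$name" >/dev/null 2>&1
  fi
  if is_running "$name"; then
    echo "still running"
    return 1
  fi
  echo "stopped"
}
```

Usage: force_stop my-project || container-remove my-project --force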
Maximum Containers Reached
Symptoms:
$ container-deploy new-project
Error: Maximum containers reached (3/3)
Solution:
# Check current containers
container-list
# Retire unused containers
container-retire old-project-1
container-retire old-project-2
# Now can deploy new one
container-deploy new-project
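To check before deploying, you can count your existing containers yourself. This sketch uses hypothetical helpers; the default limit of 3 is taken from the example error above and may differ for your account.

```shell
# Hypothetical helper: count all of your containers, running or stopped.
container_count() {
  docker ps -a --filter "name=._.$(whoami)" --format '{{.Names}}' |
    awk 'END { print NR }'
}

# Hypothetical helper: succeed if there is room for one more container.
# The limit defaults to 3 (an assumption based on the error message above).
can_deploy() {
  limit="${1:-3}"
  [ "$(container_count)" -lt "$limit" ]
}
```

Usage: can_deploy && container-deploy new-project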
General Recovery
When in doubt, recreate:
container-retire my-project
container-deploy my-project --open
Your workspace files are always safe: recreating a container does not delete them.