What is Cassian
Cassian is a GPU cloud platform for ML engineers. Users provision GPU instances from their terminal, run training jobs, and pay by the minute. The CLI handles instance lifecycle, file sync, and remote execution.
Architecture
- CLI (`cassian-cli` npm package): all user-facing commands
- Cloud: API gateway, auth, scheduling, billing
- GPU Nodes: container lifecycle, GPU allocation, file sync, exec
- Base image: Ubuntu 22.04, CUDA 12.4, Python 3, PyTorch pre-installed
CLI Commands
| Command | Purpose |
|---|---|
| `cassian login` | Authenticate via browser (OAuth) |
| `cassian init` | Create cassian.yaml interactively |
| `cassian up` | Provision GPU instance |
| `cassian down` | Stop instance (workspace saved first) |
| `cassian ssh` | Open terminal (bidirectional file sync) |
| `cassian exec <cmd>` | Sync files + run command + print output |
| `cassian run <cmd>` | One-shot: up + exec + down |
| `cassian status` | Show running instances, cost, volumes |
| `cassian logs` | View output from background tasks |
| `cassian forward` | Forward ports (stays alive until ctrl+c) |
| `cassian sync` | Pull remote files to local |
| `cassian switch` | Change GPU type interactively |
| `cassian volume create/list/delete` | Manage persistent volumes |
Full Flag Reference
cassian up
| Flag | Effect |
|---|---|
| `--gpu <type>` | Override GPU type from cassian.yaml for this session |
| `--no-sync` | Skip pushing local files after instance starts |
| `--no-setup` | Skip workspace.setup and auto-detected pip install |
cassian down
| Flag | Effect |
|---|---|
| `-f, --force` | Don't wait for confirmation; exit after sending the stop request |
cassian exec
| Flag | Effect |
|---|---|
| `--timeout <seconds>` | Max execution time (default: 3600) |
| `--no-sync` | Skip file push before running |
| `-d, --detach` | Run in background, return task ID immediately |
| `-w, --workdir <path>` | Working directory inside the container (default: /workspace) |
| `-i, --instance <name>` | Target an instance by name; no cassian.yaml needed |
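Scripts that call `cassian exec` usually need to combine several of these flags. The sketch below assembles an argument list from them. The helper name `build_exec_argv` is hypothetical, and it assumes that flags precede the command and that the command is passed as a single argument; neither is confirmed by this reference.

```python
from typing import List, Optional


def build_exec_argv(
    command: str,
    timeout: Optional[int] = None,
    no_sync: bool = False,
    detach: bool = False,
    workdir: Optional[str] = None,
    instance: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `cassian exec` (hypothetical helper).

    Only flags listed in the table above are emitted. Placing flags
    before the command is an assumption, not documented behavior.
    """
    argv = ["cassian", "exec"]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]
    if no_sync:
        argv.append("--no-sync")
    if detach:
        argv.append("--detach")
    if workdir is not None:
        argv += ["--workdir", workdir]
    if instance is not None:
        argv += ["--instance", instance]
    argv.append(command)
    return argv
```

The resulting list can be handed to `subprocess.run` without shell quoting, for example `build_exec_argv("python train.py", detach=True)`.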
cassian ssh
| Flag | Effect |
|---|---|
| `--no-sync` | Skip file push before connecting |
cassian exec).
cassian sync
| Flag | Effect |
|---|---|
| `--path <subdir>` | Only pull a specific subdirectory (e.g. checkpoints/) |
cassian status
| Flag | Effect |
|---|---|
| `--json` | Machine-readable JSON output |
cassian logs
| Flag | Effect |
|---|---|
| `-f, --follow` | Stream output as it arrives |
| `--task <id>` | Show output from a specific detached task |
| `-n <lines>` | Number of lines to show (default: 100) |
cassian.yaml Reference
name and gpu.
Available GPU Types
| Type | VRAM | Slug |
|---|---|---|
| H100 SXM | 80GB | h100-sxm |
| H100 PCIe | 80GB | h100-pcie |
| A100 SXM | 80GB | a100-sxm |
| A100 PCIe | 80GB | a100-pcie |
| L40S | 48GB | l40s |
| L4 | 24GB | l4 |
| A10G | 24GB | a10g |
| RTX 4090 | 24GB | rtx4090 |
| RTX 3090 | 24GB | rtx3090 |
Use the slug exactly as listed: `rtx3090`, not `RTX 3090` or `rtx-3090`.
If the requested type is unavailable, `cassian up --gpu <other-type>` tries a different type without editing cassian.yaml.
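Because slugs must match exactly, a script can check a user-supplied GPU name before calling the CLI. The following is a minimal client-side sketch, not part of Cassian itself. The slug list is copied from the table above, while `suggest_slug` and its matching rule (ignore case, spaces, hyphens, and underscores) are assumptions made for illustration.

```python
import re
from typing import Optional

# Slugs copied from the "Available GPU Types" table.
GPU_SLUGS = (
    "h100-sxm", "h100-pcie", "a100-sxm", "a100-pcie",
    "l40s", "l4", "a10g", "rtx4090", "rtx3090",
)


def _squeeze(text: str) -> str:
    """Lowercase the text and drop spaces, hyphens, and underscores."""
    return re.sub(r"[\s_-]+", "", text.lower())


# Lookup from the squeezed form of each slug back to the real slug.
_SLUG_BY_SQUEEZED = {_squeeze(slug): slug for slug in GPU_SLUGS}


def suggest_slug(name: str) -> Optional[str]:
    """Return the matching slug for a loosely written GPU name, or None."""
    return _SLUG_BY_SQUEEZED.get(_squeeze(name))
```

With this rule, both `"RTX 3090"` and `"rtx-3090"` resolve to `rtx3090`, and an unknown name returns `None` so the caller can stop before provisioning.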
Persistence Model
| Path | Survives down + up | Syncs to local | Notes |
|---|---|---|---|
| `/workspace` (code) | Yes | Yes | Saved to cloud on down, restored on up |
| `/workspace/storage` | Yes | No | Cloud-mounted, unlimited, always live |
| `no_sync` folders | Yes | No | Saved to cloud, not synced locally |
| `exclude` folders | No | No | Ephemeral; gone on down |
| `~/.cache/pip` | No | No | Re-installed via setup on every up |
| `~/.cache/huggingface` | No | No | Use /workspace/storage/hf instead |
| Pip packages | No | No | Re-installed via setup on every up |
Pre-installed Packages
- Python 3, pip, conda
- PyTorch (CUDA 12.4), torchvision, torchaudio
- transformers, accelerate, peft, datasets, bitsandbytes, trl
- numpy, pandas, scipy
- jupyter, wandb
- git, vim, htop, tmux, curl, wget
Common Workflows
Fine-tuning
One-shot run
Background training with monitoring
Check an instance from anywhere
Fast restart (env already installed)
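A fast restart amounts to skipping the sync and setup steps of `cassian up` when nothing has changed. The sketch below picks the flags from two caller-supplied booleans. The helper `build_up_argv` and its parameters are hypothetical, and how a script decides that files or dependencies changed is left to the caller.

```python
from typing import List, Optional


def build_up_argv(
    files_changed: bool,
    deps_changed: bool,
    gpu: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `cassian up` (hypothetical helper).

    --no-sync is added when local files are unchanged, and --no-setup
    when dependencies are unchanged. Both inputs are the caller's call.
    """
    argv = ["cassian", "up"]
    if gpu is not None:
        argv += ["--gpu", gpu]  # one-session override of cassian.yaml
    if not files_changed:
        argv.append("--no-sync")
    if not deps_changed:
        argv.append("--no-setup")
    return argv
```

When both inputs are false the result is `cassian up --no-sync --no-setup`, the fast-restart form of the command.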
Error Messages
| Error | Meaning | Fix |
|---|---|---|
| `Your session has expired` | Token expired | `cassian login` |
| `Instance not found` | No running instance | `cassian up` |
| `An instance with this name already exists` | Name conflict | `cassian down` first |
| `Invalid configuration` | Bad cassian.yaml | Fix the config |
| `No <type> GPUs available. Currently available: ...` | Requested GPU unavailable | `cassian up --gpu <other>` or `cassian switch` |
| `Request timed out` | Command took too long | Use `--detach` for long ops |
| `Something went wrong (ref: xxx)` | Server error | Report with the ref ID |
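An automation script can turn this table into a lookup that proposes the next step when a command fails. The sketch below assumes the error text appears verbatim somewhere in the captured CLI output, which this reference does not guarantee. `suggest_fix` is a hypothetical helper name.

```python
import re
from typing import Optional

# (pattern, suggested fix) pairs taken from the error table above.
_FIXES = [
    (re.compile(r"Your session has expired"), "cassian login"),
    (re.compile(r"Instance not found"), "cassian up"),
    (re.compile(r"An instance with this name already exists"), "cassian down first"),
    (re.compile(r"Invalid configuration"), "fix cassian.yaml"),
    (re.compile(r"No .+? GPUs available"), "cassian up --gpu <other> or cassian switch"),
    (re.compile(r"Request timed out"), "use --detach for long ops"),
    (re.compile(r"Something went wrong \(ref: [^)]*\)"), "report with the ref ID"),
]


def suggest_fix(output: str) -> Optional[str]:
    """Return the documented fix for the first known error in the output."""
    for pattern, fix in _FIXES:
        if pattern.search(output):
            return fix
    return None  # not an error listed in the table
```

Returning `None` for unrecognized output keeps the caller from acting on a guess.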
Idle Auto-Stop
Instances with no GPU or CPU activity for 30 minutes are automatically stopped. The workspace is saved before shutdown.
Cost Estimates
| GPU | Rate |
|---|---|
| H100 SXM | $2.69/hr |
| A100 SXM | $1.39/hr |
| L40S | $0.79/hr |
| L4 | $0.44/hr |
| A10G | $0.38/hr |
| RTX 4090 | $0.34/hr |
| RTX 3090 | $0.22/hr |
cassian status shows elapsed time and estimated cost for running instances.
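Since billing is by the minute, a rough estimate is the hourly rate times minutes divided by 60. The sketch below applies that arithmetic to the rates in the table above, keyed by GPU slug. Rounding to the nearest cent with half-up rounding is an assumption, as Cassian's actual billing rounding is not stated here, and `estimate_cost_usd` is a hypothetical helper.

```python
from decimal import ROUND_HALF_UP, Decimal

# Hourly rates in USD from the cost table, keyed by GPU slug.
HOURLY_RATE_USD = {
    "h100-sxm": Decimal("2.69"),
    "a100-sxm": Decimal("1.39"),
    "l40s": Decimal("0.79"),
    "l4": Decimal("0.44"),
    "a10g": Decimal("0.38"),
    "rtx4090": Decimal("0.34"),
    "rtx3090": Decimal("0.22"),
}


def estimate_cost_usd(slug: str, minutes: int) -> Decimal:
    """Estimate cost as rate * minutes / 60, rounded to cents (assumed)."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    raw = HOURLY_RATE_USD[slug] * Decimal(minutes) / Decimal(60)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For example, 90 minutes on `h100-sxm` is 2.69 × 1.5 = 4.035, which this sketch rounds to 4.04. Slugs with no rate in the table raise `KeyError`.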
Machine-Readable Status
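One way to consume `cassian status --json` from a script is to run it and parse stdout as JSON. The sketch below does only that and makes no assumption about field names, since the output schema is not documented in this section. `read_status` and its injectable `run` parameter are hypothetical.

```python
import json
import subprocess
from typing import Any, Callable


def read_status(run: Callable[..., Any] = subprocess.run) -> Any:
    """Run `cassian status --json` and return the parsed JSON value.

    `run` defaults to subprocess.run and can be swapped for a stub in
    tests. The structure of the returned value is not assumed here.
    """
    completed = run(
        ["cassian", "status", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

Callers should inspect the returned value before relying on any particular key.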
Tips for AI Agents
- Always check `cassian.yaml` exists before running commands
- Use `cassian run` for one-off executions: it provisions, runs, and tears down automatically
- Use `cassian exec --detach` for jobs that outlive the terminal session; follow with `cassian logs -f`
- Large files (models, datasets) belong in `/workspace/storage`, not `/workspace`
- `cassian exec -i <name> --no-sync` targets any instance from any directory; no cassian.yaml needed
- Exit codes pass through from `exec` and `run`; use them in CI
- `cassian down --force` skips confirmation polling; use in cleanup scripts
- `cassian sync --path checkpoints/` pulls just one folder instead of the whole workspace
- If `workspace.setup` is not set and `requirements.txt` exists, pip install runs automatically on `up`
- `cassian up --no-sync --no-setup` is a fast restart when nothing has changed
- `cassian status --json` gives machine-readable instance state
- GPU type slugs: `rtx3090`, `l4`, `a10g`, `a100-sxm`, `h100-sxm` (lowercase, no spaces)