Reference for AI Agents

This page gives an AI agent everything it needs to help a user work with Cassian. It collects all CLI commands, flags, the workspace model, and common patterns in one place.

What is Cassian

Cassian is a GPU cloud platform for ML engineers. Users provision GPU instances from their terminal, run training jobs, and pay by the minute. The CLI handles instance lifecycle, file sync, and remote execution.

Architecture

Local machine (cassian CLI)
    ↓ HTTPS
Cassian Cloud (scheduling, auth, routing)
    ↓
GPU Nodes (containers — your code, GPUs, persistent volumes)

CLI (cassian-cli npm package): all user-facing commands
Cloud: API gateway, auth, scheduling, billing
GPU Nodes: container lifecycle, GPU allocation, file sync, exec
Base image: Ubuntu 22.04, CUDA 12.4, Python 3, PyTorch pre-installed

CLI Commands

Command	Purpose
`cassian login`	Authenticate via browser (OAuth)
`cassian init`	Create `cassian.yaml` interactively
`cassian up`	Provision GPU instance
`cassian down`	Stop instance (workspace saved first)
`cassian ssh`	Open terminal (bidirectional file sync)
`cassian exec <cmd>`	Sync files + run command + print output
`cassian run <cmd>`	One-shot: up + exec + down
`cassian status`	Show running instances, cost, volumes
`cassian logs`	View output from background tasks
`cassian forward`	Forward ports (stays alive until ctrl+c)
`cassian sync`	Pull remote files to local
`cassian switch`	Change GPU type interactively
`cassian volume create/list/delete`	Manage persistent volumes

Full Flag Reference

cassian up

cassian up [--gpu <type>] [--no-sync] [--no-setup]

Flag	Effect
`--gpu <type>`	Override GPU type from cassian.yaml for this session
`--no-sync`	Skip pushing local files after instance starts
`--no-setup`	Skip `workspace.setup` and auto-detected pip install

cassian down

cassian down [name] [-f]

Flag	Effect
`-f, --force`	Don’t wait to confirm — exit after sending the stop request

cassian exec

cassian exec <command> [--timeout N] [--no-sync] [-d] [-w path] [-i name]

Flag	Effect
`--timeout <seconds>`	Max execution time (default: 3600)
`--no-sync`	Skip file push before running
`-d, --detach`	Run in background, return task ID immediately
`-w, --workdir <path>`	Working directory inside the container (default: `/workspace`)
`-i, --instance <name>`	Target instance by name — no cassian.yaml needed
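
An agent that shells out to cassian exec may find it safer to build the argument list programmatically than to concatenate strings. The following is a minimal sketch that uses only the flags in the table above; the helper name build_exec_argv and its keyword arguments are hypothetical and not part of the CLI.

```python
from typing import List, Optional


def build_exec_argv(
    command: str,
    *,
    timeout: Optional[int] = None,
    no_sync: bool = False,
    detach: bool = False,
    workdir: Optional[str] = None,
    instance: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `cassian exec` (hypothetical helper)."""
    argv = ["cassian", "exec"]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]
    if no_sync:
        argv.append("--no-sync")
    if detach:
        argv.append("--detach")
    if workdir is not None:
        argv += ["--workdir", workdir]
    if instance is not None:
        argv += ["--instance", instance]
    # The remote command goes last as a single argument, mirroring the
    # quoted form used in the workflow examples on this page.
    argv.append(command)
    return argv


print(build_exec_argv("nvidia-smi", no_sync=True, instance="my-training-run"))
```

The resulting list can be handed to subprocess.run without shell=True, which avoids a second layer of local shell quoting around the remote command.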

cassian ssh

cassian ssh [command] [--no-sync]

Flag	Effect
`--no-sync`	Skip file push before connecting

With a command argument, cassian ssh runs that command non-interactively (same as cassian exec).

cassian sync

cassian sync [--path subdir]

Flag	Effect
`--path <subdir>`	Only pull a specific subdirectory (e.g. `checkpoints/`)

cassian status

cassian status [--json]

Flag	Effect
`--json`	Machine-readable JSON output

cassian logs

cassian logs [-f] [--task <id>] [-n <lines>]

Flag	Effect
`-f, --follow`	Stream output as it arrives
`--task <id>`	Show output from a specific detached task
`-n <lines>`	Number of lines to show (default: 100)

cassian.yaml Reference

name: my-project          # Instance name (unique per user)

gpu:
  count: 1                # Number of GPUs
  type: rtx3090           # GPU type slug (see table below)

disk: 50G                 # Persistent disk size

storage: true             # Enable cloud storage at /workspace/storage

ports:                    # Ports to forward during cassian ssh
  - "8888:8888"           # Jupyter
  - "6006:6006"           # TensorBoard

workspace:
  setup: "pip install -r requirements.txt"  # Runs after container creation
  no_sync:                # Persists to cloud, not synced to local
    - "checkpoints/"
    - "wandb/"
  exclude:                # Ephemeral — not persisted
    - "node_modules/"
    - ".venv/"
    - "__pycache__/"

Most fields are optional. A minimal config needs only name and gpu.
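
Before calling the CLI, an agent can sanity-check a parsed config for the two required fields. This is a rough sketch, not Cassian's own validation: check_config is a hypothetical helper, it expects the YAML to be already loaded into a dict, and the check that gpu.type is set is an assumption based on the example above.

```python
from typing import Any, Dict, List


def check_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed cassian.yaml mapping."""
    problems: List[str] = []
    name = cfg.get("name")
    if not isinstance(name, str) or not name:
        problems.append("missing required field: name")
    gpu = cfg.get("gpu")
    if not isinstance(gpu, dict):
        problems.append("missing required field: gpu")
    elif not gpu.get("type"):
        # Assumption: a usable gpu block names a type slug.
        problems.append("gpu.type is not set")
    return problems


# A minimal config, expressed as the dict a YAML parser would produce.
minimal = {"name": "my-project", "gpu": {"count": 1, "type": "rtx3090"}}
print(check_config(minimal))
print(check_config({"disk": "50G"}))
```

An empty list means the two required fields are present; the CLI remains the authority on whether the config is actually valid.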

Available GPU Types

Type	VRAM	Slug
H100 SXM	80GB	`h100-sxm`
H100 PCIe	80GB	`h100-pcie`
A100 SXM	80GB	`a100-sxm`
A100 PCIe	80GB	`a100-pcie`
L40S	48GB	`l40s`
L4	24GB	`l4`
A10G	24GB	`a10g`
RTX 4090	24GB	`rtx4090`
RTX 3090	24GB	`rtx3090`

GPU type slugs are lowercase with no spaces and must match the table exactly: rtx3090, not RTX 3090 or rtx-3090. If the requested type is unavailable, cassian up --gpu <other-type> tries a different type without editing the yaml.
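
Users often type a GPU name as it appears in the Type column rather than as a slug. One way an agent might map such input to a slug from the table is sketched below; resolve_gpu_slug is a hypothetical helper, and the matching rule (ignore case, spaces, and punctuation) is an assumption made for illustration.

```python
import re
from typing import Optional

# Slugs copied from the table above.
GPU_SLUGS = [
    "h100-sxm", "h100-pcie", "a100-sxm", "a100-pcie",
    "l40s", "l4", "a10g", "rtx4090", "rtx3090",
]


def _squash(text: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


_BY_SQUASHED = {_squash(slug): slug for slug in GPU_SLUGS}


def resolve_gpu_slug(user_input: str) -> Optional[str]:
    """Return the matching slug, or None if the input is not in the table."""
    return _BY_SQUASHED.get(_squash(user_input))


print(resolve_gpu_slug("RTX 3090"))
print(resolve_gpu_slug("H100 SXM"))
print(resolve_gpu_slug("v100"))
```

A None result is a signal to ask the user which type they meant rather than passing the raw string to --gpu.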

Persistence Model

Path	Survives `down` + `up`	Syncs to local	Notes
`/workspace` (code)	Yes	Yes	Saved to cloud on down, restored on up
`/workspace/storage`	Yes	No	Cloud-mounted, unlimited, always live
`no_sync` folders	Yes	No	Saved to cloud, not synced locally
`exclude` folders	No	No	Ephemeral — gone on down
`~/.cache/pip`	No	No	Cache is discarded; packages are downloaded again when setup runs on up
`~/.cache/huggingface`	No	No	Use `/workspace/storage/hf` instead
Pip packages	No	No	Re-installed via setup on every up
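
The table can be read as a small decision rule for paths under /workspace. The sketch below encodes that rule for illustration only: classify_path is a hypothetical helper, the prefix matching of folder entries and the precedence of exclude over no_sync are assumptions, and the storage branch applies only when storage is enabled in cassian.yaml.

```python
from typing import Dict, Iterable


def classify_path(
    rel_path: str,
    no_sync: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> Dict[str, bool]:
    """Classify a path given relative to /workspace (hypothetical helper)."""
    path = rel_path.strip("/")

    def under(folder: str) -> bool:
        # Assumption: entries match a top-level folder and everything below it.
        folder = folder.strip("/")
        return path == folder or path.startswith(folder + "/")

    if any(under(f) for f in exclude):
        return {"survives_restart": False, "syncs_to_local": False}
    if under("storage") or any(under(f) for f in no_sync):
        return {"survives_restart": True, "syncs_to_local": False}
    return {"survives_restart": True, "syncs_to_local": True}


print(classify_path("train.py"))
print(classify_path("checkpoints/step100.pt", no_sync=["checkpoints/"]))
print(classify_path(".venv/bin/python", exclude=[".venv/"]))
```

Nested matches such as a __pycache__ directory inside a subfolder are not handled here; how Cassian matches those entries is not stated on this page.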

Pre-installed Packages

Python 3, pip, conda
PyTorch (CUDA 12.4), torchvision, torchaudio
transformers, accelerate, peft, datasets, bitsandbytes, trl
numpy, pandas, scipy
jupyter, wandb
git, vim, htop, tmux, curl, wget

Common Workflows

Fine-tuning

cassian up
cassian exec "HF_HOME=/workspace/storage/hf python finetune.py"
cassian sync --path checkpoints/
cassian down

One-shot run

cassian run "python train.py --epochs 500 --output /workspace/checkpoints"
# Instance auto-stops when done

Background training with monitoring

cassian up
cassian exec -d "python train.py --epochs 500"
cassian logs -f
cassian exec --no-sync "nvidia-smi"
cassian down -f

Check an instance from anywhere

cassian exec -i my-training-run --no-sync "tail -n 20 /workspace/loss.log"
cassian exec -i my-training-run --no-sync "nvidia-smi"

Fast restart (env already installed)

cassian up --no-sync --no-setup

Error Messages

Error	Meaning	Fix
`Your session has expired`	Token expired	`cassian login`
`Instance not found`	No running instance	`cassian up`
`An instance with this name already exists`	Name conflict	`cassian down` first
`Invalid configuration`	Bad cassian.yaml	Fix the config
`No <type> GPUs available. Currently available: ...`	Requested GPU unavailable	`cassian up --gpu <other>` or `cassian switch`
`Request timed out`	Command took too long	Use `--detach` for long ops
`Something went wrong (ref: xxx)`	Server error	Report with the ref ID
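
An agent can turn the table above into a lookup that maps CLI output to a suggested next step. This is a sketch under the assumption that the messages appear verbatim somewhere in the command's output; suggest_fix and ERROR_FIXES are hypothetical names.

```python
import re
from typing import Optional

# (pattern, suggested fix) pairs taken from the error table.
ERROR_FIXES = [
    (r"Your session has expired", "cassian login"),
    (r"Instance not found", "cassian up"),
    (r"An instance with this name already exists", "cassian down"),
    (r"Invalid configuration", "fix cassian.yaml"),
    (r"No .+? GPUs available", "cassian up --gpu <other> or cassian switch"),
    (r"Request timed out", "retry with --detach"),
    (r"Something went wrong \(ref: (?P<ref>[^)]+)\)", "report with the ref ID"),
]


def suggest_fix(cli_output: str) -> Optional[str]:
    """Return the suggested fix for the first known error found, else None."""
    for pattern, fix in ERROR_FIXES:
        match = re.search(pattern, cli_output)
        if match:
            # Only the server-error pattern defines a named group "ref".
            ref = match.groupdict().get("ref")
            return f"{fix} ({ref})" if ref else fix
    return None


print(suggest_fix("Error: Your session has expired"))
print(suggest_fix("No h100-sxm GPUs available. Currently available: l4, a10g"))
```

Returning None for unrecognized output lets the caller fall back to showing the raw message to the user.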

Idle Auto-Stop

Instances with no GPU or CPU activity for 30 minutes are automatically stopped. Workspace is saved before shutdown.

Cost Estimates

GPU	Rate
H100 SXM	$2.69/hr
A100 SXM	$1.39/hr
L40S	$0.79/hr
L4	$0.44/hr
A10G	$0.38/hr
RTX 4090	$0.34/hr
RTX 3090	$0.22/hr

Billed per minute. cassian status shows elapsed time and estimated cost for running instances.
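
For a rough budget check before starting a job, the rates above can be applied per minute. The sketch below assumes that partial minutes round up and that the listed rate covers a single-GPU instance; neither is stated on this page, and estimate_cost is a hypothetical helper. cassian status remains the source of truth for actual cost.

```python
import math

# Hourly rates in USD from the table above, keyed by GPU slug.
HOURLY_RATE_USD = {
    "h100-sxm": 2.69,
    "a100-sxm": 1.39,
    "l40s": 0.79,
    "l4": 0.44,
    "a10g": 0.38,
    "rtx4090": 0.34,
    "rtx3090": 0.22,
}


def estimate_cost(slug: str, seconds: float) -> float:
    """Estimate the cost in USD of running one instance for `seconds`."""
    # Assumption: billing rounds partial minutes up to the next whole minute.
    minutes = math.ceil(seconds / 60)
    return round(HOURLY_RATE_USD[slug] * minutes / 60, 2)


# A 90-minute run on an RTX 3090.
print(estimate_cost("rtx3090", 90 * 60))
```

Slugs without a listed rate (h100-pcie, a100-pcie) raise KeyError here, since the table gives no price for them.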

Machine-Readable Status

cassian status --json | python3 -c "
import json, sys
data = json.load(sys.stdin)
running = [i for i in data['instances'] if i['status'] == 'running']
print([i['name'] for i in running])
"

Tips for AI Agents

Check that cassian.yaml exists before running commands, unless you target an instance by name with -i
Use cassian run for one-off executions — provisions, runs, and tears down automatically
Use cassian exec --detach for jobs that outlive the terminal session; follow with cassian logs -f
Large files (models, datasets) belong in /workspace/storage, not /workspace
cassian exec -i <name> --no-sync targets any instance from any directory — no cassian.yaml needed
Exit codes pass through from exec and run — use them in CI
cassian down --force skips confirmation polling — use in cleanup scripts
cassian sync --path checkpoints/ pulls just one folder instead of the whole workspace
If workspace.setup is not set and requirements.txt exists, pip install runs automatically on up
cassian up --no-sync --no-setup is a fast restart when nothing has changed
cassian status --json gives machine-readable instance state
GPU type slugs: rtx3090, l4, a10g, a100-sxm, h100-sxm (lowercase, no spaces)
