CLI Reference

Command-line interface for Hotdata.

Install

Homebrew

brew install hotdata-dev/tap/cli

Shell (macOS, Linux)

curl -fsSL https://github.com/hotdata-dev/hotdata-cli/releases/latest/download/hotdata-cli-installer.sh | sh

Update

hotdata update

Connect

Authenticate via browser:

hotdata auth

This launches a browser window where you can sign in and authorize the CLI. To create a new account:

hotdata auth register

Verify you're logged in:

hotdata auth status

Alternatively, pass an API key directly:

hotdata <command> --api-key <api_key>

Or set the HOTDATA_API_KEY environment variable (also loaded from .env files):

export HOTDATA_API_KEY=<api_key>
hotdata <command>

API key priority (lowest to highest): config file → HOTDATA_API_KEY env var → --api-key flag.
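
The precedence above can be illustrated with a small shell function. This is only a sketch of the documented ordering, not the CLI's actual implementation; `resolve_api_key` and its three positional arguments are hypothetical names chosen for the example.

```shell
# Illustration only: mimics the documented precedence
# (config file < HOTDATA_API_KEY < --api-key). Not the CLI's real code.
# Usage: resolve_api_key <flag_value> <env_value> <config_value>
resolve_api_key() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"   # the --api-key flag wins when given
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"   # otherwise the environment variable
  else
    printf '%s\n' "$3"   # otherwise whatever the config file holds
  fi
}
```

An empty string stands for "not provided", so `resolve_api_key "" "$HOTDATA_API_KEY" "<key from config>"` yields the environment value whenever it is set.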

Commands

Command	Subcommands	Description
`auth`	`login`, `register`, `logout`, `status`	Authenticate (run without subcommand to log in)
`workspaces`	`list`, `set`	Manage workspaces
`connections`	`list`, `create`, `refresh`, `new`	Manage connections
`tables`	`list`	List tables and columns in connected sources
`databases`	`list`, `show`, `create`, `set`, `delete`, `load`, `tables`	Managed databases — create, load parquet, query
`datasets`	`list`, `create`, `update`, `refresh`	Derived views (virtual SQL tables) built from queries
`query`	`status`	Execute a SQL query
`queries`	`list`	Inspect query run history
`sandbox`	`list`, `new`, `set`, `read`, `update`, `run`	Short-lived contexts for exploratory work
`search`		Full-text or vector search across a table column
`indexes`	`list`, `create`, `delete`	Manage indexes on a table
`embedding-providers`	`list`, `get`, `create`, `update`, `delete`	Embedding providers for vector indexes
`context`	`list`, `show`, `pull`, `push`	Sync database context with local Markdown files
`results`	`list`	Retrieve stored query results
`jobs`	`list`	Monitor background jobs
`skills`	`install`, `status`	Manage the hotdata agent skill
`completions`		Generate shell completions
`update`		Update the CLI to the latest release

Global options

Option	Description
`--api-key`	API key (overrides env var and config)
`--no-input`	Disable interactive prompts; error instead
`-v, --version`	Print version
`-h, --help`	Print help

Workspaces

hotdata workspaces list
hotdata workspaces set [<workspace_id>]

list shows all workspaces with a * marker on the active one.
set switches the active workspace. Omit the ID for interactive selection.
The active workspace is used as the default for all commands that accept -w.

Connections

hotdata connections list [-w <id>] [-o table|json|yaml]
hotdata connections <connection_id> [-w <id>] [-o table|json|yaml]
hotdata connections refresh <connection_id> [-w <id>]
hotdata connections new [-w <id>]

list returns id, name, source_type for each connection.
Pass a connection ID to view details.
refresh triggers a schema refresh for a connection.
new launches an interactive connection creation wizard.

Create a connection

# List available connection types
hotdata connections create list

# Inspect schema for a connection type
hotdata connections create list <type_name> --format json

# Create a connection
hotdata connections create \
  --name "my-conn" \
  --type postgres \
  --config '{"host":"...","port":5432,...}'

Tables

hotdata tables list \
  [-w <id>] \
  [--connection-id <id>] \
  [--schema <pattern>] \
  [--table <pattern>] \
  [--limit <n>] \
  [--cursor <token>] \
  [-o table|json|yaml]

Without --connection-id: lists all tables across connections, showing table, synced, and last_sync for each.
With --connection-id: includes column details (column, data_type, nullable).
--schema and --table support SQL % wildcard patterns.
Tables are addressed as <connection>.<schema>.<table> in SQL queries.

Databases

Managed databases are Hotdata-owned catalogs you populate with parquet files. Tables are addressed as default.<schema>.<table> in SQL; use --database with hotdata query to scope the query.

hotdata databases list [-w <id>] [-o table|json|yaml]
hotdata databases show <name_or_id> [-w <id>]
hotdata databases create \
  [--name <label>] \
  [--table <name>]... \
  [--schema <schema>] \
  [--expires-at <duration>] \
  [-w <id>]
hotdata databases set <name_or_id>
hotdata databases delete <name_or_id>

list shows all databases with their ID and description.
show displays ID, description, default connection, and attached catalogs.
create creates a new database. --table (repeatable) declares tables up front. --expires-at accepts a relative duration (24h, 7d) or an RFC 3339 timestamp; it defaults to 24h when omitted.
set marks a database as the default for subsequent commands.
delete removes the database and all its tables.

Load parquet into a table

# Dot notation: database.table or database.schema.table
hotdata databases load <database>.<table> --file <path.parquet>
hotdata databases load <database>.<table> --url <https://...>

The table must have been declared at create time (--table).
--file uploads from a local path; --url downloads a remote parquet file.
Load replaces the table contents on each call.
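
The declare-then-load sequence described above might be wrapped as follows. This is a minimal sketch: `create_and_load` is a hypothetical helper, and it assumes the label passed to --name can be used as the `<database>` part of the load target.

```shell
# Create a managed database with one declared table, then load a parquet
# file into that table. create_and_load is a hypothetical wrapper; the
# flags are the ones listed in this section.
# Assumption: the --name label is accepted as <database> in the load target.
create_and_load() {
  db=$1 table=$2 file=$3
  hotdata databases create --name "$db" --table "$table" &&
    hotdata databases load "$db.$table" --file "$file"
}
```

For example, `create_and_load sales orders ./orders.parquet` would declare the `orders` table at create time and then upload the local file into it. Because load replaces the table contents, calling the load step again with a newer file overwrites the earlier data.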

List and delete tables

hotdata databases tables [<database>] [-o table|json|yaml]
hotdata databases tables delete <table> [--database <name_or_id>]

Query a managed database

Pass --database to scope a query to a specific managed database:

hotdata query \
  "SELECT * FROM default.public.orders LIMIT 10" \
  --database mydb

Datasets

Datasets are derived views — virtual SQL tables built from a query over your data.

hotdata datasets list [-w <id>] [-o table|json|yaml]
hotdata datasets <dataset_id> [-w <id>] [-o table|json|yaml]
hotdata datasets create \
  --name <table_name> \
  [--sql "SELECT ..."] \
  [--query-id <id>] \
  [--description "..."]
hotdata datasets update <dataset_id> [--name <name>] [--description "..."]
hotdata datasets refresh <dataset_id>

Datasets are queryable as datasets.main.<table_name>.
--sql creates the dataset from an inline SQL query; --query-id uses a saved query.
refresh re-runs the source query and creates a new version.
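
As a sketch of how a dataset is created and then addressed in SQL, the helper below chains the two commands. `create_and_preview` is a hypothetical name, and the preview query with `LIMIT 10` is only an illustration.

```shell
# Create a dataset from inline SQL, then preview it through its
# datasets.main.<table_name> address. create_and_preview is a hypothetical
# wrapper; the commands and flags are the ones listed in this section.
create_and_preview() {
  name=$1 sql=$2
  hotdata datasets create --name "$name" --sql "$sql" &&
    hotdata query "SELECT * FROM datasets.main.$name LIMIT 10"
}
```

The second command only runs if the create step succeeds, so a failed create does not produce a misleading "table not found" error from the query.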

Query

hotdata query "<sql>" \
  [-w <id>] \
  [--connection <connection_id>] \
  [--database <name_or_id>] \
  [-o table|json|csv]
hotdata query status <query_run_id> [-o table|json|csv]

Default output is table, which prints results with row count and execution time.
Use --connection to scope to a specific connection.
Use --database to query a managed database (default.<schema>.<table> in SQL).
Long-running queries fall back to async execution and return a query_run_id.
Use hotdata query status <query_run_id> to poll for results.
Exit codes for query status: 0 = succeeded, 1 = failed, 2 = still running.
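
The exit codes above make it straightforward to script a polling loop. The following is a minimal sketch; `wait_for_query` and `POLL_INTERVAL` are hypothetical names, and the two-second default interval is an arbitrary choice.

```shell
# Poll an async query until it finishes, using the documented exit codes
# of `hotdata query status`: 0 = succeeded, 1 = failed, 2 = still running.
# wait_for_query and POLL_INTERVAL are hypothetical names for this sketch.
wait_for_query() {
  while :; do
    rc=0
    hotdata query status "$1" -o json || rc=$?
    case "$rc" in
      0) return 0 ;;                     # succeeded
      2) sleep "${POLL_INTERVAL:-2}" ;;  # still running; try again
      *) return "$rc" ;;                 # failed (1) or an unexpected error
    esac
  done
}
```

Calling `wait_for_query <query_run_id>` returns 0 once the run succeeds and passes any other exit code through to the caller.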

Query Run History

hotdata queries list \
  [--limit <n>] \
  [--cursor <token>] \
  [--status <csv>] \
  [-o table|json|yaml]
hotdata queries <query_run_id> [-o table|json|yaml]

list shows past query executions with status, creation time, duration, row count, and a truncated SQL preview (default limit 20).
--status filters by run status (comma-separated, e.g. --status running,failed).
View a run by ID to see full metadata (timings, result_id, SQL).
Retrieve rows for a completed run with hotdata results <result_id>.

Sandboxes

Sandboxes group exploratory CLI work under one context. Datasets created inside an active sandbox are removed when the sandbox ends.

If HOTDATA_SANDBOX is already set (for example after sandbox run), do not start another sandbox or switch sandboxes — nested sandbox operations will fail.

hotdata sandbox list [-w <id>] [-o table|json|yaml]
hotdata sandbox <sandbox_id> [-w <id>] [-o table|json|yaml]
hotdata sandbox new [--name "Sandbox Name"] [-o table|json|yaml]
hotdata sandbox set [<sandbox_id>]
hotdata sandbox read
hotdata sandbox update \
  [<sandbox_id>] \
  [--name "New Name"] \
  [--markdown "..."] \
  [-o table|json|yaml]
hotdata sandbox run <cmd> [args...]
hotdata sandbox <sandbox_id> run <cmd> [args...]

new creates a sandbox and sets it as active.
set switches the active sandbox; omit the ID to clear.
read prints the markdown body of the current sandbox.
update changes name or markdown. Use --markdown for running notes across steps.
run runs a command with HOTDATA_SANDBOX and HOTDATA_WORKSPACE set for the child process.
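
Scripts that may themselves be launched through sandbox run can guard against nesting by checking HOTDATA_SANDBOX first. A minimal sketch, where `ensure_sandbox` is a hypothetical helper name:

```shell
# Start a sandbox only when no sandbox is already active for this process.
# ensure_sandbox is a hypothetical helper; the check follows the rule that
# nested sandbox operations fail when HOTDATA_SANDBOX is already set.
ensure_sandbox() {
  if [ -n "${HOTDATA_SANDBOX:-}" ]; then
    echo "reusing sandbox $HOTDATA_SANDBOX" >&2
    return 0
  fi
  hotdata sandbox new --name "$1"
}
```

With this guard, `ensure_sandbox "Exploration"` creates a sandbox in a fresh shell and does nothing beyond a note on stderr when the script is already running inside one.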

Search

# Full-text search (requires a BM25 index on the column)
hotdata search "query text" \
  --table <connection.schema.table> \
  [--column <column>] \
  [--select <columns>] \
  [--limit <n>] \
  [-o table|json|csv]

# Vector search (requires a vector index; server resolves model and embedding column)
hotdata search "query text" \
  --table <connection.schema.table> \
  --type vector \
  [--column <source_text_column>] \
  [--limit <n>]

--type is optional when the table has exactly one search index, in which case it is inferred automatically; it is required when multiple indexes exist.
--column is optional when the table has exactly one indexed column of the resolved type.
For --type vector, --column names the source text column; the server resolves the embedding column and model from the index metadata.
Full-text results are ordered by relevance score (descending). Vector results are ordered by distance (ascending).
--select specifies columns to return (comma-separated, defaults to all).

Indexes

# Connection-scoped index using bracket notation
hotdata indexes create 'connection.schema.table[col1,col2]' \
  --type bm25|vector|sorted \
  [--metric l2|cosine|dot] \
  [--embedding-provider-id <id>] \
  [--async]

# Dataset-scoped index
hotdata indexes create \
  --dataset-id <id> \
  --columns <col1,col2> \
  --type bm25|vector|sorted

hotdata indexes list [-w <id>] [-o table|json|yaml]
hotdata indexes delete <index_id> [-w <id>]

Quote the target to prevent shell glob expansion: 'airbnb.listings[description]'.
--type bm25 creates a full-text index; --type vector creates a vector index (requires --metric).
--embedding-provider-id enables automatic embedding generation on a vector index over a text column.
--async submits index creation as a background job.
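
When the table and column names come from variables, the bracket-notation target can be assembled by a small helper and passed in double quotes. This is a sketch only; `index_target` is a hypothetical function, not part of the CLI.

```shell
# Build a connection-scoped index target in bracket notation.
# index_target is a hypothetical helper: the first argument is the table
# path, the remaining arguments are column names joined with commas.
index_target() {
  table=$1
  shift
  cols=$(IFS=,; printf '%s' "$*")   # join the columns with commas
  printf '%s[%s]\n' "$table" "$cols"
}
```

For instance, `hotdata indexes create "$(index_target airbnb.listings description)" --type bm25` passes `airbnb.listings[description]` as a single argument; the double quotes around the command substitution keep the shell from treating the brackets as a glob pattern.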

Embedding providers

Embedding providers power automatic embedding generation for vector indexes.

hotdata embedding-providers list [-w <id>] [-o table|json|yaml]
hotdata embedding-providers get <id> [-w <id>]
hotdata embedding-providers create \
  --name <name> \
  --provider-type local|service \
  [--config '{"model":"..."}'] \
  [--provider-api-key <key>] \
  [-w <id>]
hotdata embedding-providers update <id> \
  [--name <name>] \
  [--config '{"model":"..."}'] \
  [-w <id>]
hotdata embedding-providers delete <id> [-w <id>]

--provider-type local uses a local embedding model; service calls an external API (e.g. OpenAI).
--provider-api-key auto-creates a managed secret for the provider's API key.

Context

Sync named Markdown context files with a managed database. Useful for giving agents persistent notes and schema documentation scoped to a database.

hotdata context list [-w <id>] [-d <database_id>]
hotdata context show <name> [-w <id>] [-d <database_id>]
hotdata context pull <name> [--force] [-w <id>] [-d <database_id>]
hotdata context push <name> [-w <id>] [-d <database_id>]

list shows all named contexts stored in the database.
pull downloads context to ./<NAME>.md. Use --force to overwrite an existing file.
push uploads ./<NAME>.md to the database as named context.
<name> is case-insensitive; a trailing .md is ignored (e.g. USER.md → USER).
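
The name rule above can be mirrored in scripts that need to tell whether two spellings refer to the same context. In this sketch, `context_key` is a hypothetical helper, and lowercasing is just one way to compare case-insensitively; it is not necessarily how the CLI normalizes names internally.

```shell
# Reduce a context name to a comparison key: drop a trailing .md and
# ignore case. context_key is a hypothetical helper; the lowercase form
# is an assumption made for comparison only.
context_key() {
  printf '%s\n' "${1%.md}" | tr '[:upper:]' '[:lower:]'
}
```

Two names refer to the same context when their keys match, for example `[ "$(context_key USER.md)" = "$(context_key user)" ]`.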

Results

hotdata results <result_id> [-w <id>] [-o table|json|csv]
hotdata results list [-w <id>] [--limit <n>] [--offset <n>] [-o table|json|yaml]

Every query result is stored automatically. Use the result ID shown in the table footer to retrieve a result without re-running the query.

Jobs

hotdata jobs list \
  [-w <id>] \
  [--job-type <type>] \
  [--status <status>] \
  [--all] \
  [--limit <n>] \
  [--offset <n>] \
  [-o table|json|yaml]
hotdata jobs <job_id> [-w <id>] [-o table|json|yaml]

list shows only active jobs (pending and running) by default. Use --all to see all jobs.
--job-type accepts: data_refresh_table, data_refresh_connection, create_index.
--status accepts: pending, running, succeeded, partially_succeeded, failed.

Skills

hotdata skills install
hotdata skills status

Installs or refreshes the hotdata agent skill into agent directories (Claude Code, Cursor, etc.). See Quick Start — Agent skills.

Configuration

Config is stored at ~/.hotdata/config.yml, keyed by profile (the default profile is named default).

Variable	Description
`HOTDATA_API_KEY`	API key (overrides config file)
`HOTDATA_WORKSPACE`	Workspace ID for the current process
`HOTDATA_SANDBOX`	Sandbox ID for the current process (set by `sandbox run`)
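
Because these variables apply per process, a single command can be pointed at a different workspace without changing the active one. The sketch below assumes HOTDATA_WORKSPACE selects the workspace in the same way -w does, which the table implies but does not spell out; `in_workspace` is a hypothetical helper name.

```shell
# Run one command with HOTDATA_WORKSPACE set for that command.
# in_workspace is a hypothetical helper.
# Assumption: the CLI honours HOTDATA_WORKSPACE as the workspace to use.
in_workspace() {
  ws=$1
  shift
  HOTDATA_WORKSPACE=$ws "$@"
}
```

For example, `in_workspace <workspace_id> hotdata tables list` sets the variable in the environment of that one `hotdata` process.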