CLI Reference

Command line interface for Hotdata.

Install

Homebrew

brew install hotdata-dev/tap/cli

Shell (macOS, Linux)

curl -fsSL https://github.com/hotdata-dev/hotdata-cli/releases/latest/download/hotdata-cli-installer.sh | sh

Update

hotdata update

Connect

Authenticate via browser:

hotdata auth

This launches a browser window where you can sign in and authorize the CLI. To create a new account:

hotdata auth register

Verify you're logged in:

hotdata auth status

Alternatively, pass an API key directly:

hotdata <command> --api-key <api_key>

Or set the HOTDATA_API_KEY environment variable (also loaded from .env files):

export HOTDATA_API_KEY=<api_key>
hotdata <command>

API key priority (lowest to highest): config file → HOTDATA_API_KEY env var → --api-key flag.
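
The precedence above can be pictured as a fallback chain. The following is a minimal sketch of that rule, not the CLI's actual implementation; resolve_api_key and its three positional arguments are hypothetical names used only for illustration.

```shell
# Illustrative only: mirrors the documented precedence
# (config file < HOTDATA_API_KEY < --api-key).
# resolve_api_key is a hypothetical helper, not part of the hotdata CLI.
resolve_api_key() {
  # $1 = value passed via --api-key
  # $2 = value of HOTDATA_API_KEY
  # $3 = value stored in the config file
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$3"
  fi
}

# No flag given, so the environment value wins over the config file.
resolve_api_key "" "key-from-env" "key-from-config"   # prints key-from-env
```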

Commands

Command              Subcommands                                    Description
auth                 login, register, logout, status                Authenticate (run without a subcommand to log in)
workspaces           list, set                                      Manage workspaces
connections          list, create, refresh, new                     Manage connections
tables               list                                           List tables and columns in connected sources
databases            list, show, create, set, delete, load, tables  Managed databases: create, load parquet, query
datasets             list, create, update, refresh                  Derived views (virtual SQL tables) built from queries
query                status                                         Execute a SQL query
queries              list                                           Inspect query run history
sandbox              list, new, set, read, update, run              Short-lived contexts for exploratory work
search               -                                              Full-text or vector search across a table column
indexes              list, create, delete                           Manage indexes on a table
embedding-providers  list, get, create, update, delete              Embedding providers for vector indexes
context              list, show, pull, push                         Sync database context with local Markdown files
results              list                                           Retrieve stored query results
jobs                 list                                           Monitor background jobs
skills               install, status                                Manage the hotdata agent skill
completions          -                                              Generate shell completions
update               -                                              Update the CLI to the latest release

Global options

Option          Description
--api-key       API key (overrides env var and config)
--no-input      Disable interactive prompts; error instead
-v, --version   Print version
-h, --help      Print help

Workspaces

hotdata workspaces list
hotdata workspaces set [<workspace_id>]
  • list shows all workspaces with a * marker on the active one.
  • set switches the active workspace. Omit the ID for interactive selection.
  • The active workspace is used as the default for all commands that accept -w.

Connections

hotdata connections list [-w <id>] [-o table|json|yaml]
hotdata connections <connection_id> [-w <id>] [-o table|json|yaml]
hotdata connections refresh <connection_id> [-w <id>]
hotdata connections new [-w <id>]
  • list returns id, name, source_type for each connection.
  • Pass a connection ID to view details.
  • refresh triggers a schema refresh for a connection.
  • new launches an interactive connection creation wizard.

Create a connection

# List available connection types
hotdata connections create list

# Inspect schema for a connection type
hotdata connections create list <type_name> --format json

# Create a connection
hotdata connections create \
  --name "my-conn" \
  --type postgres \
  --config '{"host":"...","port":5432,...}'

Tables

hotdata tables list \
  [-w <id>] \
  [--connection-id <id>] \
  [--schema <pattern>] \
  [--table <pattern>] \
  [--limit <n>] \
  [--cursor <token>] \
  [-o table|json|yaml]
  • Without --connection-id: lists all tables across connections with table, synced, last_sync.
  • With --connection-id: includes column details (column, data_type, nullable).
  • --schema and --table support SQL % wildcard patterns.
  • Tables are addressed as <connection>.<schema>.<table> in SQL queries.

Databases

Managed databases are Hotdata-owned catalogs you populate with parquet files. Tables are addressed as default.<schema>.<table> in SQL; use --database with hotdata query to scope the query.

hotdata databases list [-w <id>] [-o table|json|yaml]
hotdata databases show <name_or_id> [-w <id>]
hotdata databases create \
  [--name <label>] \
  [--table <name>]... \
  [--schema <schema>] \
  [--expires-at <duration>] \
  [-w <id>]
hotdata databases set <name_or_id>
hotdata databases delete <name_or_id>
  • list shows all databases with their ID and description.
  • show displays ID, description, default connection, and attached catalogs.
  • create creates a new database. --table (repeatable) declares tables up front. --expires-at accepts a relative duration (24h, 7d) or an RFC 3339 timestamp; defaults to 24h when omitted.
  • set marks a database as the default for subsequent commands.
  • delete removes the database and all its tables.

Load parquet into a table

# Dot notation: database.table or database.schema.table
hotdata databases load <database>.<table> --file <path.parquet>
hotdata databases load <database>.<table> --url <https://...>
  • The table must have been declared at create time (--table).
  • --file uploads from a local path; --url downloads a remote parquet file.
  • Load replaces the table contents on each call.
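
When several parquet files need loading, the documented load command can be wrapped in a loop. This is a sketch under one assumption that does not come from the CLI: each file is named after its target table (orders.parquet loads into <database>.orders). load_parquet_dir is a hypothetical helper name, and every target table must already have been declared with --table at create time.

```shell
# Sketch: load every *.parquet file in a directory into a managed database.
# Assumption: the file name (minus .parquet) is the target table name.
# load_parquet_dir is a hypothetical helper, not a hotdata command.
load_parquet_dir() {
  db=$1
  dir=$2
  for f in "$dir"/*.parquet; do
    [ -e "$f" ] || continue              # skip when the glob matched nothing
    table=$(basename "$f" .parquet)
    # Each load replaces the table's existing contents.
    hotdata databases load "$db.$table" --file "$f" || return 1
  done
}

# Usage: load_parquet_dir <database> <directory>
```

The loop stops at the first failed load so that a missing table declaration is noticed immediately rather than buried in later output.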

List and delete tables

hotdata databases tables [<database>] [-o table|json|yaml]
hotdata databases tables delete <table> [--database <name_or_id>]

Query a managed database

Pass --database to scope a query to a specific managed database:

hotdata query \
  "SELECT * FROM default.public.orders LIMIT 10" \
  --database mydb

Datasets

Datasets are derived views — virtual SQL tables built from a query over your data.

hotdata datasets list [-w <id>] [-o table|json|yaml]
hotdata datasets <dataset_id> [-w <id>] [-o table|json|yaml]
hotdata datasets create \
  --name <table_name> \
  [--sql "SELECT ..."] \
  [--query-id <id>] \
  [--description "..."]
hotdata datasets update <dataset_id> [--name <name>] [--description "..."]
hotdata datasets refresh <dataset_id>
  • Datasets are queryable as datasets.main.<table_name>.
  • --sql creates the dataset from an inline SQL query; --query-id uses a saved query.
  • refresh re-runs the source query and creates a new version.
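
The create-then-query flow described above can be sketched as a small wrapper. create_and_preview_dataset is a hypothetical helper name; it uses only the documented datasets create and query commands, and the dataset name and SQL are supplied by the caller.

```shell
# Sketch: create a dataset from inline SQL, then preview it through its
# datasets.main.<table_name> address.
# create_and_preview_dataset is a hypothetical helper, not a hotdata command.
create_and_preview_dataset() {
  name=$1
  sql=$2
  hotdata datasets create --name "$name" --sql "$sql" || return 1
  hotdata query "SELECT * FROM datasets.main.$name LIMIT 10"
}

# Usage: create_and_preview_dataset <table_name> "SELECT ..."
```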

Query

hotdata query "<sql>" \
  [-w <id>] \
  [--connection <connection_id>] \
  [--database <name_or_id>] \
  [-o table|json|csv]
hotdata query status <query_run_id> [-o table|json|csv]
  • Default output is table, which prints results with row count and execution time.
  • Use --connection to scope to a specific connection.
  • Use --database to query a managed database (default.<schema>.<table> in SQL).
  • Long-running queries fall back to async execution and return a query_run_id.
  • Use hotdata query status <query_run_id> to poll for results.
  • Exit codes for query status: 0 = succeeded, 1 = failed, 2 = still running.
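
The exit codes above make it straightforward to poll from a script. The following is a minimal sketch, assuming a query_run_id is already in hand; wait_for_query and its interval and attempt-limit parameters are hypothetical and not part of the CLI.

```shell
# Sketch: poll an async query until it finishes, using the documented exit
# codes of `hotdata query status` (0 = succeeded, 1 = failed, 2 = running).
# wait_for_query is a hypothetical helper, not a hotdata command.
wait_for_query() {
  run_id=$1
  interval=${2:-5}         # seconds between polls
  attempts=${3:-60}        # give up after this many polls
  while [ "$attempts" -gt 0 ]; do
    rc=0
    hotdata query status "$run_id" || rc=$?
    if [ "$rc" -ne 2 ]; then
      return "$rc"         # 0 on success, 1 on failure
    fi
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "query $run_id still running after polling limit" >&2
  return 2
}

# Usage: wait_for_query <query_run_id> [interval_seconds] [max_attempts]
```

The helper returns the same codes as query status, so callers can branch on 0, 1, or 2 without parsing any output.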

Query Run History

hotdata queries list \
  [--limit <n>] \
  [--cursor <token>] \
  [--status <csv>] \
  [-o table|json|yaml]
hotdata queries <query_run_id> [-o table|json|yaml]
  • list shows past query executions with status, creation time, duration, row count, and a truncated SQL preview (default limit 20).
  • --status filters by run status (comma-separated, e.g. --status running,failed).
  • View a run by ID to see full metadata (timings, result_id, SQL).
  • Retrieve rows for a completed run with hotdata results <result_id>.

Sandboxes

Sandboxes group exploratory CLI work under one context. Datasets created inside an active sandbox are removed when the sandbox ends.

If HOTDATA_SANDBOX is already set (for example after sandbox run), do not start another sandbox or switch sandboxes — nested sandbox operations will fail.

hotdata sandbox list [-w <id>] [-o table|json|yaml]
hotdata sandbox <sandbox_id> [-w <id>] [-o table|json|yaml]
hotdata sandbox new [--name "Sandbox Name"] [-o table|json|yaml]
hotdata sandbox set [<sandbox_id>]
hotdata sandbox read
hotdata sandbox update \
  [<sandbox_id>] \
  [--name "New Name"] \
  [--markdown "..."] \
  [-o table|json|yaml]
hotdata sandbox run <cmd> [args...]
hotdata sandbox <sandbox_id> run <cmd> [args...]
  • new creates a sandbox and sets it as active.
  • set switches the active sandbox; omit the ID to clear.
  • read prints the markdown body of the current sandbox.
  • update changes name or markdown. Use --markdown for running notes across steps.
  • run runs a command with HOTDATA_SANDBOX and HOTDATA_WORKSPACE set for the child process.
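
Because nested sandbox operations fail, a script can check HOTDATA_SANDBOX before calling sandbox run. This is a sketch of such a guard; sandbox_run_once is a hypothetical wrapper name, and the error message is illustrative.

```shell
# Sketch: refuse to start a sandboxed command when this process is already
# inside a sandbox, as signalled by HOTDATA_SANDBOX.
# sandbox_run_once is a hypothetical wrapper around `hotdata sandbox run`.
sandbox_run_once() {
  if [ -n "${HOTDATA_SANDBOX:-}" ]; then
    echo "already inside sandbox $HOTDATA_SANDBOX; not starting another" >&2
    return 1
  fi
  hotdata sandbox run "$@"
}

# Usage: sandbox_run_once <cmd> [args...]
```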

Search

# Full-text search (requires a BM25 index on the column)
hotdata search "query text" \
  --table <connection.schema.table> \
  [--column <column>] \
  [--select <columns>] \
  [--limit <n>] \
  [-o table|json|csv]

# Vector search (requires a vector index; server resolves model and embedding column)
hotdata search "query text" \
  --table <connection.schema.table> \
  --type vector \
  [--column <source_text_column>] \
  [--limit <n>]
  • --type is optional when the table has exactly one search index; it is then inferred automatically. It is required when multiple indexes exist.
  • --column is optional when the table has exactly one indexed column of the resolved type.
  • For --type vector, --column names the source text column; the server resolves the embedding column and model from the index metadata.
  • Full-text results are ordered by relevance score (descending). Vector results are ordered by distance (ascending).
  • --select specifies columns to return (comma-separated, defaults to all).

Indexes

# Connection-scoped index using bracket notation
hotdata indexes create 'connection.schema.table[col1,col2]' \
  --type bm25|vector|sorted \
  [--metric l2|cosine|dot] \
  [--embedding-provider-id <id>] \
  [--async]

# Dataset-scoped index
hotdata indexes create \
  --dataset-id <id> \
  --columns <col1,col2> \
  --type bm25|vector|sorted

hotdata indexes list [-w <id>] [-o table|json|yaml]
hotdata indexes delete <index_id> [-w <id>]
  • Quote the target to prevent shell glob expansion: 'airbnb.listings[description]'.
  • --type bm25 creates a full-text index; --type vector creates a vector index (requires --metric).
  • --embedding-provider-id enables automatic embedding generation on a vector index over a text column.
  • --async submits index creation as a background job.
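
To see why the quoting matters, the following self-contained sketch runs in a temporary directory and makes no hotdata calls. The file name airbnb.listingsd is made up for the demonstration: it happens to match the unquoted pattern, because the shell reads [description] as a character class.

```shell
# Demonstrates shell glob expansion on an unquoted bracket target.
# Runs in a throwaway directory; no hotdata command is invoked.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
: > airbnb.listingsd          # a file that happens to match the pattern

set -- airbnb.listings[description]      # unquoted: the shell expands it
unquoted=$1
set -- 'airbnb.listings[description]'    # quoted: passed through literally
quoted=$1

echo "unquoted: $unquoted"    # prints unquoted: airbnb.listingsd
echo "quoted:   $quoted"      # prints quoted:   airbnb.listings[description]

cd - >/dev/null
rm -rf "$demo_dir"
```

When no file matches, bash and sh pass the pattern through unchanged, while zsh reports an error by default, so quoting keeps the behavior the same across shells.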

Embedding providers

Embedding providers power automatic embedding generation for vector indexes.

hotdata embedding-providers list [-w <id>] [-o table|json|yaml]
hotdata embedding-providers get <id> [-w <id>]
hotdata embedding-providers create \
  --name <name> \
  --provider-type local|service \
  [--config '{"model":"..."}'] \
  [--provider-api-key <key>] \
  [-w <id>]
hotdata embedding-providers update <id> \
  [--name <name>] \
  [--config '{"model":"..."}'] \
  [-w <id>]
hotdata embedding-providers delete <id> [-w <id>]
  • --provider-type local uses a local embedding model; service calls an external API (e.g. OpenAI).
  • --provider-api-key auto-creates a managed secret for the provider's API key.

Context

Sync named Markdown context files with a managed database. Useful for giving agents persistent notes and schema documentation scoped to a database.

hotdata context list [-w <id>] [-d <database_id>]
hotdata context show <name> [-w <id>] [-d <database_id>]
hotdata context pull <name> [--force] [-w <id>] [-d <database_id>]
hotdata context push <name> [-w <id>] [-d <database_id>]
  • list shows all named contexts stored in the database.
  • pull downloads context to ./<NAME>.md. Use --force to overwrite an existing file.
  • push uploads ./<NAME>.md to the database as named context.
  • <name> is case-insensitive; a trailing .md is ignored (e.g. USER.md → USER).

Results

hotdata results <result_id> [-w <id>] [-o table|json|csv]
hotdata results list [-w <id>] [--limit <n>] [--offset <n>] [-o table|json|yaml]
  • Every query result is stored automatically — use result-id from the table footer to retrieve it without re-running.

Jobs

hotdata jobs list \
  [-w <id>] \
  [--job-type <type>] \
  [--status <status>] \
  [--all] \
  [--limit <n>] \
  [--offset <n>] \
  [-o table|json|yaml]
hotdata jobs <job_id> [-w <id>] [-o table|json|yaml]
  • list shows only active jobs (pending and running) by default. Use --all to see all jobs.
  • --job-type accepts: data_refresh_table, data_refresh_connection, create_index.
  • --status accepts: pending, running, succeeded, partially_succeeded, failed.

Skills

hotdata skills install
hotdata skills status

Installs or refreshes the hotdata agent skill into agent directories (Claude Code, Cursor, etc.). See Quick Start — Agent skills.

Configuration

Config is stored at ~/.hotdata/config.yml, keyed by profile (the default profile is named default).

Variable            Description
HOTDATA_API_KEY     API key (overrides config file)
HOTDATA_WORKSPACE   Workspace ID for the current process
HOTDATA_SANDBOX     Sandbox ID for the current process (set by sandbox run)
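
Since HOTDATA_WORKSPACE applies to the current process, it can be set for a single command rather than exported for the whole shell session. The sketch below assumes the CLI reads the variable as described in the table; with_workspace is a hypothetical helper name.

```shell
# Sketch: run one command with HOTDATA_WORKSPACE set only for that child
# process, leaving the calling shell's environment untouched.
# with_workspace is a hypothetical helper, not a hotdata command.
with_workspace() {
  ws=$1
  shift
  env HOTDATA_WORKSPACE="$ws" "$@"
}

# Usage: with_workspace <workspace_id> hotdata tables list
```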

See also