CLI Reference
Command line interface for Hotdata.
Install
Homebrew
brew install hotdata-dev/tap/cli
Shell (macOS, Linux)
curl -fsSL https://github.com/hotdata-dev/hotdata-cli/releases/latest/download/hotdata-cli-installer.sh | sh
Update
hotdata update
Connect
Authenticate via browser:
hotdata auth
This launches a browser window where you can sign in and authorize the CLI. To create a new account:
hotdata auth register
Verify you're logged in:
hotdata auth status
Alternatively, pass an API key directly:
hotdata <command> --api-key <api_key>
Or set the HOTDATA_API_KEY environment variable (also loaded from .env files):
export HOTDATA_API_KEY=<api_key>
hotdata <command>
API key priority (lowest to highest): config file → HOTDATA_API_KEY env var → --api-key flag.
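The precedence above can be sketched as a small shell function. This is only an illustration of the documented order, not the CLI's actual implementation; `resolve_api_key` and its three positional arguments are hypothetical names.

```shell
# Hypothetical helper mirroring the documented precedence; not part of the CLI.
# Usage: resolve_api_key <flag_value> <env_value> <config_value>
resolve_api_key() {
  flag_value=$1
  env_value=$2
  config_value=$3
  if [ -n "$flag_value" ]; then
    printf '%s\n' "$flag_value"     # --api-key flag has the highest priority
  elif [ -n "$env_value" ]; then
    printf '%s\n' "$env_value"      # then the HOTDATA_API_KEY env var
  else
    printf '%s\n' "$config_value"   # the config file is the fallback
  fi
}
```

In practice this means a key passed with `--api-key` is used even when `HOTDATA_API_KEY` is exported in the same shell.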
Commands
| Command | Subcommands | Description |
|---|---|---|
| `auth` | `login`, `register`, `logout`, `status` | Authenticate (run without subcommand to log in) |
| `workspaces` | `list`, `set` | Manage workspaces |
| `connections` | `list`, `create`, `refresh`, `new` | Manage connections |
| `tables` | `list` | List tables and columns in connected sources |
| `databases` | `list`, `show`, `create`, `set`, `delete`, `load`, `tables` | Managed databases: create, load parquet, query |
| `datasets` | `list`, `create`, `update`, `refresh` | Derived views (virtual SQL tables) built from queries |
| `query` | `status` | Execute a SQL query |
| `queries` | `list` | Inspect query run history |
| `sandbox` | `list`, `new`, `set`, `read`, `update`, `run` | Short-lived contexts for exploratory work |
| `search` | | Full-text or vector search across a table column |
| `indexes` | `list`, `create`, `delete` | Manage indexes on a table |
| `embedding-providers` | `list`, `get`, `create`, `update`, `delete` | Embedding providers for vector indexes |
| `context` | `list`, `show`, `pull`, `push` | Sync database context with local Markdown files |
| `results` | `list` | Retrieve stored query results |
| `jobs` | `list` | Monitor background jobs |
| `skills` | `install`, `status` | Manage the hotdata agent skill |
| `completions` | | Generate shell completions |
| `update` | | Update the CLI to the latest release |
Global options
| Option | Description |
|---|---|
| `--api-key` | API key (overrides env var and config) |
| `--no-input` | Disable interactive prompts; error instead |
| `-v`, `--version` | Print version |
| `-h`, `--help` | Print help |
Workspaces
hotdata workspaces list
hotdata workspaces set [<workspace_id>]
- `list` shows all workspaces with a `*` marker on the active one.
- `set` switches the active workspace. Omit the ID for interactive selection.
- The active workspace is used as the default for all commands that accept `-w`.
Connections
hotdata connections list [-w <id>] [-o table|json|yaml]
hotdata connections <connection_id> [-w <id>] [-o table|json|yaml]
hotdata connections refresh <connection_id> [-w <id>]
hotdata connections new [-w <id>]
- `list` returns `id`, `name`, `source_type` for each connection.
- Pass a connection ID to view details.
- `refresh` triggers a schema refresh for a connection.
- `new` launches an interactive connection creation wizard.
Create a connection
# List available connection types
hotdata connections create list
# Inspect schema for a connection type
hotdata connections create list <type_name> --format json
# Create a connection
hotdata connections create \
--name "my-conn" \
--type postgres \
--config '{"host":"...","port":5432,...}'
Tables
hotdata tables list \
[-w <id>] \
[--connection-id <id>] \
[--schema <pattern>] \
[--table <pattern>] \
[--limit <n>] \
[--cursor <token>] \
[-o table|json|yaml]
- Without `--connection-id`: lists all tables across connections with `table`, `synced`, `last_sync`.
- With `--connection-id`: includes column details (`column`, `data_type`, `nullable`).
- `--schema` and `--table` support SQL `%` wildcard patterns.
- Tables are addressed as `<connection>.<schema>.<table>` in SQL queries.
Databases
Managed databases are Hotdata-owned catalogs you populate with parquet files. Tables are addressed as default.<schema>.<table> in SQL; use --database with hotdata query to scope the query.
hotdata databases list [-w <id>] [-o table|json|yaml]
hotdata databases show <name_or_id> [-w <id>]
hotdata databases create \
[--name <label>] \
[--table <name>]... \
[--schema <schema>] \
[--expires-at <duration>] \
[-w <id>]
hotdata databases set <name_or_id>
hotdata databases delete <name_or_id>
- `list` shows all databases with their ID and description.
- `show` displays ID, description, default connection, and attached catalogs.
- `create` creates a new database.
- `--table` (repeatable) declares tables up front.
- `--expires-at` accepts a relative duration (`24h`, `7d`) or an RFC 3339 timestamp; defaults to 24 h when omitted.
- `set` marks a database as the default for subsequent commands.
- `delete` removes the database and all its tables.
Load parquet into a table
# Dot notation: database.table or database.schema.table
hotdata databases load <database>.<table> --file <path.parquet>
hotdata databases load <database>.<table> --url <https://...>
- The table must have been declared at create time (`--table`).
- `--file` uploads from a local path; `--url` downloads a remote parquet file.
- Load replaces the table contents on each call.
List and delete tables
hotdata databases tables [<database>] [-o table|json|yaml]
hotdata databases tables delete <table> [--database <name_or_id>]
Query a managed database
Pass --database to scope a query to a specific managed database:
hotdata query \
"SELECT * FROM default.public.orders LIMIT 10" \
--database mydb
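Put together, the create, load, and query steps might look like the sketch below. The wrapper name `load_and_preview` is hypothetical, and the sketch assumes that the `--name` label can be used wherever `<database>` or `--database` expects a name, which should be checked against your setup.

```shell
# Hypothetical wrapper: declare a table, load a parquet file into it, preview rows.
# Assumes the --name label is accepted as <database> by later commands.
# Usage: load_and_preview <database> <schema> <table> <path.parquet>
load_and_preview() {
  db=$1
  schema=$2
  table=$3
  file=$4
  # The table has to be declared at create time before it can be loaded.
  hotdata databases create --name "$db" --table "$table" --schema "$schema" &&
  # Dot notation database.schema.table; each load replaces the table contents.
  hotdata databases load "$db.$schema.$table" --file "$file" &&
  # Inside a managed database, tables are addressed as default.<schema>.<table>.
  hotdata query "SELECT * FROM default.$schema.$table LIMIT 10" --database "$db"
}

# Example: load_and_preview mydb public orders ./orders.parquet
```

Chaining with `&&` stops at the first failing step, so a failed load never leads to a query against stale data.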
Datasets
Datasets are derived views — virtual SQL tables built from a query over your data.
hotdata datasets list [-w <id>] [-o table|json|yaml]
hotdata datasets <dataset_id> [-w <id>] [-o table|json|yaml]
hotdata datasets create \
--name <table_name> \
[--sql "SELECT ..."] \
[--query-id <id>] \
[--description "..."]
hotdata datasets update <dataset_id> [--name <name>] [--description "..."]
hotdata datasets refresh <dataset_id>
- Datasets are queryable as `datasets.main.<table_name>`.
- `--sql` creates the dataset from an inline SQL query; `--query-id` uses a saved query.
- `refresh` re-runs the source query and creates a new version.
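As a minimal sketch of the flow described above, the function below creates a dataset from inline SQL and then reads it back through its `datasets.main.<table_name>` address. The function name is hypothetical, and the sketch assumes the dataset is queryable as soon as `datasets create` returns.

```shell
# Hypothetical wrapper: create a dataset from inline SQL, then preview it.
# Assumes the dataset can be queried immediately after creation.
# Usage: create_and_query_dataset <table_name> "<sql>"
create_and_query_dataset() {
  name=$1
  sql=$2
  hotdata datasets create --name "$name" --sql "$sql" &&
  hotdata query "SELECT * FROM datasets.main.$name LIMIT 10"
}

# Example: create_and_query_dataset recent_orders "SELECT * FROM default.public.orders"
```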
Query
hotdata query "<sql>" \
[-w <id>] \
[--connection <connection_id>] \
[--database <name_or_id>] \
[-o table|json|csv]
hotdata query status <query_run_id> [-o table|json|csv]
- Default output is `table`, which prints results with row count and execution time.
- Use `--connection` to scope to a specific connection.
- Use `--database` to query a managed database (`default.<schema>.<table>` in SQL).
- Long-running queries fall back to async execution and return a `query_run_id`.
- Use `hotdata query status <query_run_id>` to poll for results.
- Exit codes for `query status`: `0` = succeeded, `1` = failed, `2` = still running.
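The exit codes make `query status` straightforward to script. The polling loop below is a sketch rather than a built-in feature; `wait_for_query` and its default 5-second interval are illustrative choices.

```shell
# Hypothetical polling helper built on the documented exit codes.
# Usage: wait_for_query <query_run_id> [interval_seconds]
wait_for_query() {
  run_id=$1
  interval=${2:-5}
  while :; do
    rc=0
    hotdata query status "$run_id" -o json || rc=$?
    case $rc in
      0) return 0 ;;             # succeeded: results were printed above
      2) sleep "$interval" ;;    # still running: wait and poll again
      *) return "$rc" ;;         # failed (1) or any unexpected code
    esac
  done
}

# Example: wait_for_query <query_run_id> 10
```

Capturing the status with `|| rc=$?` keeps the loop usable in scripts that run under `set -e`.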
Query Run History
hotdata queries list \
[--limit <n>] \
[--cursor <token>] \
[--status <csv>] \
[-o table|json|yaml]
hotdata queries <query_run_id> [-o table|json|yaml]
- `list` shows past query executions with status, creation time, duration, row count, and a truncated SQL preview (default limit 20).
- `--status` filters by run status (comma-separated, e.g. `--status running,failed`).
- View a run by ID to see full metadata (timings, `result_id`, SQL).
- Retrieve rows for a completed run with `hotdata results <result_id>`.
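Going from a run ID to its rows can be chained in a script, as sketched below. The exact JSON layout is not documented here, so this assumes the `-o json` output contains a string field named `result_id` on a single line; the helper name is hypothetical, and a JSON tool such as `jq` would be more robust than `sed` if it is available.

```shell
# Hypothetical helper: look up a run's result_id, then fetch its rows as CSV.
# Assumes the JSON output includes "result_id": "<value>" on one line.
# Usage: results_for_run <query_run_id>
results_for_run() {
  run_id=$1
  result_id=$(hotdata queries "$run_id" -o json |
    sed -n 's/.*"result_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1)
  if [ -z "$result_id" ]; then
    echo "no result_id found for run $run_id" >&2
    return 1
  fi
  hotdata results "$result_id" -o csv
}
```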
Sandboxes
Sandboxes group exploratory CLI work under one context. Datasets created inside an active sandbox are removed when the sandbox ends.
If HOTDATA_SANDBOX is already set (for example after sandbox run), do not start another sandbox or switch sandboxes — nested sandbox operations will fail.
hotdata sandbox list [-w <id>] [-o table|json|yaml]
hotdata sandbox <sandbox_id> [-w <id>] [-o table|json|yaml]
hotdata sandbox new [--name "Sandbox Name"] [-o table|json|yaml]
hotdata sandbox set [<sandbox_id>]
hotdata sandbox read
hotdata sandbox update \
[<sandbox_id>] \
[--name "New Name"] \
[--markdown "..."] \
[-o table|json|yaml]
hotdata sandbox run <cmd> [args...]
hotdata sandbox <sandbox_id> run <cmd> [args...]
- `new` creates a sandbox and sets it as active.
- `set` switches the active sandbox; omit the ID to clear.
- `read` prints the markdown body of the current sandbox.
- `update` changes name or markdown. Use `--markdown` for running notes across steps.
- `run` runs a command with `HOTDATA_SANDBOX` and `HOTDATA_WORKSPACE` set for the child process.
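Because nested sandbox operations fail, a script that may itself be started through `sandbox run` can check `HOTDATA_SANDBOX` first. The guard below is one possible sketch; `sandbox_run_once` is a hypothetical name, not a CLI command.

```shell
# Hypothetical guard: only wrap the command in `sandbox run` when no sandbox
# is active for this process; otherwise run the command directly.
# Usage: sandbox_run_once <cmd> [args...]
sandbox_run_once() {
  if [ -n "${HOTDATA_SANDBOX:-}" ]; then
    echo "already inside sandbox $HOTDATA_SANDBOX; running command directly" >&2
    "$@"
  else
    hotdata sandbox run "$@"
  fi
}

# Example: sandbox_run_once hotdata query "SELECT 1"
```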
Search
# Full-text search (requires a BM25 index on the column)
hotdata search "query text" \
--table <connection.schema.table> \
[--column <column>] \
[--select <columns>] \
[--limit <n>] \
[-o table|json|csv]
# Vector search (requires a vector index; server resolves model and embedding column)
hotdata search "query text" \
--table <connection.schema.table> \
--type vector \
[--column <source_text_column>] \
[--limit <n>]
- `--type` is optional when the table has exactly one search index; it is then inferred automatically. Required when multiple indexes exist.
- `--column` is optional when the table has exactly one indexed column of the resolved type.
- For `--type vector`, `--column` names the source text column; the server resolves the embedding column and model from the index metadata.
- Full-text results are ordered by relevance score (descending). Vector results are ordered by distance (ascending).
- `--select` specifies columns to return (comma-separated, defaults to all).
Indexes
# Connection-scoped index using bracket notation
hotdata indexes create 'connection.schema.table[col1,col2]' \
--type bm25|vector|sorted \
[--metric l2|cosine|dot] \
[--embedding-provider-id <id>] \
[--async]
# Dataset-scoped index
hotdata indexes create \
--dataset-id <id> \
--columns <col1,col2> \
--type bm25|vector|sorted
hotdata indexes list [-w <id>] [-o table|json|yaml]
hotdata indexes delete <index_id> [-w <id>]
- Quote the target to prevent shell glob expansion: `'airbnb.listings[description]'`.
- `--type bm25` creates a full-text index; `--type vector` creates a vector index (requires `--metric`).
- `--embedding-provider-id` enables automatic embedding generation on a vector index over a text column.
- `--async` submits index creation as a background job.
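Index creation and search are typically used together. The sketch below builds a BM25 index on one column and then runs a full-text search against it; the wrapper name is hypothetical, and it assumes that without `--async` the index is ready when `indexes create` returns.

```shell
# Hypothetical wrapper: create a BM25 index on one column, then search it.
# Assumes synchronous index creation (no --async).
# Usage: index_and_search <connection.schema.table> <column> "<query text>"
index_and_search() {
  table=$1
  column=$2
  text=$3
  # Double quotes keep the [column] suffix from being treated as a glob pattern.
  hotdata indexes create "${table}[${column}]" --type bm25 &&
  hotdata search "$text" --table "$table" --column "$column" --limit 5
}
```

Quoting the bracketed target matters because an unquoted `[description]` is a character-class pattern to the shell and could match files in the current directory.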
Embedding providers
Embedding providers power automatic embedding generation for vector indexes.
hotdata embedding-providers list [-w <id>] [-o table|json|yaml]
hotdata embedding-providers get <id> [-w <id>]
hotdata embedding-providers create \
--name <name> \
--provider-type local|service \
[--config '{"model":"..."}'] \
[--provider-api-key <key>] \
[-w <id>]
hotdata embedding-providers update <id> \
[--name <name>] \
[--config '{"model":"..."}'] \
[-w <id>]
hotdata embedding-providers delete <id> [-w <id>]
- `--provider-type local` uses a local embedding model; `service` calls an external API (e.g. OpenAI).
- `--provider-api-key` auto-creates a managed secret for the provider's API key.
Context
Sync named Markdown context files with a managed database. Useful for giving agents persistent notes and schema documentation scoped to a database.
hotdata context list [-w <id>] [-d <database_id>]
hotdata context show <name> [-w <id>] [-d <database_id>]
hotdata context pull <name> [--force] [-w <id>] [-d <database_id>]
hotdata context push <name> [-w <id>] [-d <database_id>]
- `list` shows all named contexts stored in the database.
- `pull` downloads context to `./<NAME>.md`. Use `--force` to overwrite an existing file.
- `push` uploads `./<NAME>.md` to the database as named context.
- `<name>` is case-insensitive; a trailing `.md` is ignored (e.g. `USER.md` → `USER`).
Results
hotdata results <result_id> [-w <id>] [-o table|json|csv]
hotdata results list [-w <id>] [--limit <n>] [--offset <n>] [-o table|json|yaml]
- Every query result is stored automatically; use the `result-id` from the table footer to retrieve it without re-running.
Jobs
hotdata jobs list \
[-w <id>] \
[--job-type <type>] \
[--status <status>] \
[--all] \
[--limit <n>] \
[--offset <n>] \
[-o table|json|yaml]
hotdata jobs <job_id> [-w <id>] [-o table|json|yaml]
- `list` shows only active jobs (`pending` and `running`) by default. Use `--all` to see all jobs.
- `--job-type` accepts: `data_refresh_table`, `data_refresh_connection`, `create_index`.
- `--status` accepts: `pending`, `running`, `succeeded`, `partially_succeeded`, `failed`.
Skills
hotdata skills install
hotdata skills status
Installs or refreshes the hotdata agent skill into agent directories (Claude Code, Cursor, etc.). See Quick Start — Agent skills.
Configuration
Config is stored at ~/.hotdata/config.yml keyed by profile (default: default).
| Variable | Description |
|---|---|
| `HOTDATA_API_KEY` | API key (overrides config file) |
| `HOTDATA_WORKSPACE` | Workspace ID for the current process |
| `HOTDATA_SANDBOX` | Sandbox ID for the current process (set by `sandbox run`) |
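Since these variables apply per process, one command can be pointed at a different workspace without changing the active one. The helper below is a sketch; `in_workspace` is a hypothetical name, and it assumes `HOTDATA_WORKSPACE` takes effect for commands run with it set.

```shell
# Hypothetical helper: run one command with HOTDATA_WORKSPACE set, without
# changing the calling shell's environment.
# Usage: in_workspace <workspace_id> <cmd> [args...]
in_workspace() {
  ws=$1
  shift
  # The subshell keeps the export from leaking into the caller.
  (
    HOTDATA_WORKSPACE=$ws
    export HOTDATA_WORKSPACE
    "$@"
  )
}

# Example: in_workspace <workspace_id> hotdata connections list
```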
See also
- Quick Start — Install, authenticate, and run your first query
- API Reference — Full HTTP API documentation
- Data Sources — Supported connection types