CLI Reference

Command line interface for Hotdata.

Install

Homebrew

brew install hotdata-dev/tap/cli

Shell (macOS, Linux)

curl -fsSL https://github.com/hotdata-dev/hotdata-cli/releases/latest/download/hotdata-cli-installer.sh | sh

Connect

Run the following command to authenticate:

hotdata auth

This launches a browser window where you can authorize the CLI to access your Hotdata account.

Alternatively, authenticate with an API key using the --api-key flag:

hotdata <command> --api-key <api_key>

Or set the HOTDATA_API_KEY environment variable (also loaded from .env files):

export HOTDATA_API_KEY=<api_key>
hotdata <command>

API key priority (lowest to highest): config file → HOTDATA_API_KEY env var → --api-key flag.
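
The precedence above can be sketched as a small shell function. This is only an illustration of the documented ordering, not how the CLI resolves keys internally; resolve_api_key_source is a hypothetical helper name, and the three arguments stand in for the values the CLI would see from each source.

```shell
# Hypothetical helper, not part of the hotdata CLI.
# Reports which source wins under the documented precedence:
#   config file < HOTDATA_API_KEY env var < --api-key flag.
# Arguments: $1 = value passed via --api-key (may be empty)
#            $2 = value of HOTDATA_API_KEY (may be empty)
#            $3 = value stored in the config file (may be empty)
resolve_api_key_source() {
  if [ -n "$1" ]; then
    echo "flag"
  elif [ -n "$2" ]; then
    echo "env"
  elif [ -n "$3" ]; then
    echo "config"
  else
    # No key available from any source.
    echo "none"
    return 1
  fi
}
```

For example, resolve_api_key_source "" "$HOTDATA_API_KEY" "" prints env whenever the environment variable is set and no flag value is given.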

Commands

| Command | Subcommands | Description |
|---|---|---|
| auth | status, logout | Authenticate (run without subcommand to log in) |
| workspaces | list, set | Manage workspaces |
| connections | list, create, refresh, new | Manage connections |
| tables | list | List tables and columns |
| datasets | list, create | Manage uploaded datasets |
| query | | Execute a SQL query |
| queries | list | Inspect query run history |
| sandbox | list, new, set, read, update, run | Short-lived contexts for exploratory CLI work |
| search | | Full-text search across a table column |
| indexes | list, create | Manage indexes on a table |
| results | list | Retrieve stored query results |
| jobs | list | Manage background jobs |
| skills | install, status | Manage the hotdata-cli agent skill |

Global options

| Option | Description | Type | Default |
|---|---|---|---|
| --api-key | API key (overrides env var and config) | string | |
| -v, --version | Print version | boolean | |
| -h, --help | Print help | boolean | |

Workspaces

hotdata workspaces list [--format table|json|yaml]
hotdata workspaces set [<workspace_id>]
  • list shows all workspaces with a * marker on the active one.
  • set switches the active workspace. Omit the ID for interactive selection.
  • The active workspace is used as the default for all commands that accept --workspace-id.

Connections

hotdata connections list [-w <id>] [-o table|json|yaml]
hotdata connections <connection_id> [-w <id>] [-o table|json|yaml]
hotdata connections refresh <connection_id> [-w <id>]
hotdata connections new [-w <id>]
  • list returns id, name, source_type for each connection.
  • Pass a connection ID to view details (id, name, source type, table counts).
  • refresh triggers a schema refresh for a connection.
  • new launches an interactive connection creation wizard.

Create a connection

# List available connection types
hotdata connections create list [--format table|json|yaml]

# Inspect schema for a connection type
hotdata connections create list <type_name> --format json

# Create a connection
hotdata connections create --name "my-conn" --type postgres --config '{"host":"...","port":5432,...}'

Tables

hotdata tables list [--workspace-id <id>] [--connection-id <id>] [--schema <pattern>] [--table <pattern>] [--limit <n>] [--cursor <token>] [--format table|json|yaml]
  • Without --connection-id: lists all tables with table, synced, last_sync.
  • With --connection-id: includes column details (column, data_type, nullable).
  • --schema and --table support SQL % wildcard patterns.
  • Tables are displayed as <connection>.<schema>.<table> — use this format in SQL queries.
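
As a rough sketch of how the <connection>.<schema>.<table> format feeds into a query, the helpers below build the qualified name and pass it to hotdata query. Both function names are hypothetical, and the sketch assumes plain identifiers that need no quoting.

```shell
# qualified_table and count_rows are hypothetical helpers, not hotdata commands.

# Build the <connection>.<schema>.<table> name used in SQL queries.
qualified_table() {
  printf '%s.%s.%s\n' "$1" "$2" "$3"
}

# Count the rows of one table. Identifier quoting is not handled here; this
# assumes names without spaces or special characters.
count_rows() {
  hotdata query "SELECT COUNT(*) FROM $(qualified_table "$1" "$2" "$3")" -o json
}
```

A call such as count_rows my_conn public orders would run the count against my_conn.public.orders (placeholder names).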

Datasets

hotdata datasets list [--workspace-id <id>] [--limit <n>] [--offset <n>] [--format table|json|yaml]
hotdata datasets <dataset_id> [--workspace-id <id>] [--format table|json|yaml]
hotdata datasets create --file data.csv [--label "My Dataset"] [--table-name my_dataset]
hotdata datasets create --sql "SELECT ..." --label "My Dataset"
hotdata datasets create --url "https://example.com/data.parquet" --label "My Dataset"
  • Datasets are queryable as datasets.main.<table_name>.
  • --file, --sql, --query-id, and --url are mutually exclusive.
  • --url imports data directly from a URL (supports csv, json, parquet).
  • Format is auto-detected from file extension or content.
  • Piped stdin is supported: cat data.csv | hotdata datasets create --label "My Dataset"
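
The piped-stdin form is convenient when the data is generated by a script rather than stored in a file. The sketch below assumes a hypothetical helper, upload_csv_rows, that streams a header and rows straight into hotdata datasets create; only the --label flag documented above is used.

```shell
# upload_csv_rows is a hypothetical helper, not part of the hotdata CLI.
# It writes a CSV header plus the given rows to the stdin of
# `hotdata datasets create`, using the piped-stdin form.
# Arguments: $1 = dataset label, $2 = CSV header line, remaining = CSV rows.
upload_csv_rows() {
  local label="$1" header="$2"
  shift 2
  {
    printf '%s\n' "$header"
    # With no rows left, only the header is sent.
    if [ "$#" -gt 0 ]; then
      printf '%s\n' "$@"
    fi
  } | hotdata datasets create --label "$label"
}
```

For instance, upload_csv_rows "My Dataset" "id,name" "1,a" "2,b" sends a three-line CSV document to the CLI.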

Query

hotdata query "<sql>" [-w <id>] [--connection <connection_id>] [-o table|json|csv]
hotdata query status <query_run_id> [-o table|json|csv]
  • Default output is table, which prints results with row count and execution time.
  • Use --connection to scope the query to a specific connection.
  • Long-running queries automatically fall back to async execution and return a query_run_id.
  • Use hotdata query status <query_run_id> to poll for results.
  • Exit codes for query status: 0 = succeeded, 1 = failed, 2 = still running (poll again).
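
The exit codes above lend themselves to a simple polling loop. The following is a minimal sketch, assuming a hypothetical helper named wait_for_query; the poll interval and the maximum number of polls are illustrative defaults, not values defined by the CLI.

```shell
# wait_for_query is a hypothetical helper built on the documented exit codes of
# `hotdata query status`: 0 = succeeded, 1 = failed, 2 = still running.
# Arguments: $1 = query_run_id
#            $2 = seconds between polls (default 2)
#            $3 = maximum number of polls (default 30)
wait_for_query() {
  local run_id="$1" interval="${2:-2}" max_polls="${3:-30}"
  local attempt=0 status output
  while [ "$attempt" -lt "$max_polls" ]; do
    # Capture the exit code without tripping `set -e` in a calling script.
    output="$(hotdata query status "$run_id" -o json)" && status=0 || status=$?
    if [ "$status" -ne 2 ]; then
      # 0 (succeeded) or 1 (failed): print what the CLI returned and stop.
      printf '%s\n' "$output"
      return "$status"
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "query $run_id still running after $max_polls polls" >&2
  return 2
}
```

The function returns the same codes as the CLI, so a caller can branch on success, failure, or timeout in the same way.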

Query Run History

hotdata queries list [--limit <n>] [--cursor <token>] [--status <csv>] [-o table|json|yaml]
hotdata queries <query_run_id> [-o table|json|yaml]
  • list shows past query executions with status, creation time, duration, row count, and a truncated SQL preview (default limit 20).
  • --status filters by run status (comma-separated, e.g. --status running,failed).
  • View a run by ID to see full metadata (timings, result_id, snapshot, hashes) and the formatted, syntax-highlighted SQL.
  • If a run has a result_id, fetch its rows with hotdata results <result_id>.

Sandboxes

Sandboxes group exploratory CLI work under one context. Datasets created inside an active sandbox are tied to that sandbox and are removed when the sandbox ends. Create datasets outside a sandbox when they must persist.

If HOTDATA_SANDBOX is already set (for example, in a process started by sandbox run), you are inside that sandbox. Do not start another sandbox, switch sandboxes, or clear the variable: nested or conflicting sandbox operations will fail.

hotdata sandbox list [-w <id>] [-o table|json|yaml]
hotdata sandbox <sandbox_id> [-w <id>] [-o table|json|yaml]
hotdata sandbox new [--name "Sandbox Name"] [-o table|json|yaml]
hotdata sandbox set [<sandbox_id>]
hotdata sandbox read
hotdata sandbox update [<sandbox_id>] [--name "New Name"] [--markdown "..."] [-o table|json|yaml]
hotdata sandbox run <cmd> [args...]
hotdata sandbox <sandbox_id> run <cmd> [args...]
  • list shows all sandboxes with a * marker on the active one.
  • Pass a sandbox ID (without further subcommands) to view details for that sandbox.
  • new creates a sandbox and sets it as active. Not allowed while already inside a sandbox.
  • set switches the active sandbox; omit the ID to clear. Not allowed while inside a sandbox.
  • read prints the markdown body of the current sandbox.
  • update changes name or markdown for a sandbox (defaults to the active sandbox if no ID is given). Use --markdown for running notes and context across steps.
  • run runs a command with HOTDATA_SANDBOX and HOTDATA_WORKSPACE set for the child process. With sandbox <id> run, uses that sandbox; otherwise creates a new sandbox first. Not allowed while already inside a sandbox.
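
Because sandbox run is not allowed inside an existing sandbox, scripts that may be invoked either way can check HOTDATA_SANDBOX first. The sketch below uses two hypothetical helpers, in_sandbox and sandbox_run_once; it relies only on the environment variable and the sandbox run form described above.

```shell
# in_sandbox and sandbox_run_once are hypothetical helpers, not hotdata commands.

# True when the current process is already inside a sandbox.
in_sandbox() {
  [ -n "${HOTDATA_SANDBOX:-}" ]
}

# Run a command in a sandbox without nesting. Inside a sandbox the command runs
# directly (it already inherits HOTDATA_SANDBOX); otherwise it is wrapped in
# `hotdata sandbox run`, which creates a new sandbox first.
sandbox_run_once() {
  if in_sandbox; then
    "$@"
  else
    hotdata sandbox run "$@"
  fi
}
```

For example, sandbox_run_once ./explore.sh (a placeholder script name) behaves the same whether or not the caller is already inside a sandbox.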

Search

# Full text search
hotdata search "query text" --table <connection.schema.table> --column <column> [--select <columns>] [--limit <n>] [-o table|json|csv]

# Vector search with --model (calls OpenAI to embed the query)
hotdata search "query text" --table <table> --column <vector_column> --model text-embedding-3-small [--limit <n>]

# Vector search with piped embedding
echo '[0.1, -0.2, ...]' | hotdata search --table <table> --column <vector_column> [--limit <n>]
  • Without --model and with query text: full text search. Requires a full-text index (--type bm25) on the target column.
  • With --model: generates an embedding via OpenAI and performs vector search using l2_distance. Requires OPENAI_API_KEY env var.
  • Without query text and with piped stdin: reads a vector (raw JSON array or OpenAI embedding response) and performs vector search.
  • Full text search results are ordered by relevance score (descending). Vector results are ordered by distance (ascending).
  • --select specifies which columns to return (comma-separated, defaults to all).
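
When the embedding comes from your own code rather than from --model, it has to reach the CLI as a raw JSON array on stdin. A minimal sketch, assuming two hypothetical helpers (to_json_vector and vector_search) and an embedding supplied as plain numeric arguments:

```shell
# to_json_vector and vector_search are hypothetical helpers, not hotdata commands.

# Join numeric arguments into a raw JSON array, e.g. [0.1,-0.2,0.3].
# The subshell keeps the IFS change from leaking into the caller.
to_json_vector() (
  IFS=,
  printf '[%s]\n' "$*"
)

# Pipe the vector into `hotdata search` for a vector search.
# Arguments: $1 = table, $2 = vector column, remaining = vector components.
vector_search() {
  local table="$1" column="$2"
  shift 2
  to_json_vector "$@" | hotdata search --table "$table" --column "$column"
}
```

No input validation is done here; the components are assumed to be valid JSON numbers.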

Indexes

hotdata indexes list --connection-id <id> --schema <schema> --table <table> [--workspace-id <id>] [--format table|json|yaml]
hotdata indexes create --connection-id <id> --schema <schema> --table <table> --name <name> --columns <cols> [--type sorted|bm25|vector] [--metric l2|cosine|dot] [--async]
  • list shows indexes on a table with name, type, columns, status, and creation date.
  • create creates an index. Use --type bm25 for full-text search, --type vector for vector search (requires --metric).
  • --async submits index creation as a background job.
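
Full-text search requires a bm25 index on the target column, so a setup script might create one up front. The wrapper below is a sketch; create_bm25_index is a hypothetical name, and the index naming scheme (<table>_<column>_bm25) is an arbitrary choice for illustration.

```shell
# create_bm25_index is a hypothetical wrapper, not part of the hotdata CLI.
# It creates a full-text (bm25) index on a single column.
# Arguments: $1 = connection ID, $2 = schema, $3 = table, $4 = column.
create_bm25_index() {
  local connection_id="$1" schema="$2" table="$3" column="$4"
  # The index name is derived from table and column; any unique name would do.
  hotdata indexes create \
    --connection-id "$connection_id" --schema "$schema" --table "$table" \
    --name "${table}_${column}_bm25" --columns "$column" --type bm25
}
```

Once the index is ready, hotdata search with --table and --column pointing at the same column can run a full-text query against it.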

Results

hotdata results <result_id> [--workspace-id <id>] [--format table|json|csv]
hotdata results list [--workspace-id <id>] [--limit <n>] [--offset <n>] [--format table|json|yaml]
  • Query results include a result-id in the table footer — use it to retrieve past results without re-running queries.
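
Stored results can also be exported to a file without re-running the query. The sketch below assumes a hypothetical helper, export_result_csv, and writes through a temporary file so that a failed fetch does not leave a partial or empty output file behind.

```shell
# export_result_csv is a hypothetical helper, not part of the hotdata CLI.
# It saves a stored result as CSV.
# Arguments: $1 = result ID, $2 = output file path.
export_result_csv() {
  local result_id="$1" out_file="$2" tmp_file
  tmp_file="$(mktemp)" || return 1
  if hotdata results "$result_id" --format csv > "$tmp_file"; then
    # Only replace the target once the fetch has succeeded.
    mv "$tmp_file" "$out_file"
  else
    rm -f "$tmp_file"
    return 1
  fi
}
```

The workspace is left to the active default here; pass --workspace-id inside the function if a specific workspace is needed.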

Jobs

hotdata jobs list [--workspace-id <id>] [--job-type <type>] [--status <status>] [--all] [--limit <n>] [--offset <n>] [--format table|json|yaml]
hotdata jobs <job_id> [--workspace-id <id>] [--format table|json|yaml]
  • list shows only active jobs (pending and running) by default. Use --all to see all jobs.
  • --job-type accepts: data_refresh_table, data_refresh_connection, create_index.
  • --status accepts: pending, running, succeeded, partially_succeeded, failed.

Configuration

Configuration is stored in ~/.hotdata/config.yml, keyed by profile. The default profile is named default.

| Variable | Description | Default |
|---|---|---|
| HOTDATA_API_KEY | API key (overrides config file) | |
| HOTDATA_SANDBOX | Sandbox ID for the current process (set by sandbox run; associates API requests with that sandbox) | |
| HOTDATA_WORKSPACE | Workspace ID for the current process (set alongside HOTDATA_SANDBOX by sandbox run) | |
