List models
Programmatic discovery of the served models, their capabilities, pricing, and deprecation dates.
GET
List models
Lists the models currently served by the API, with capabilities, limits, pricing, and lifecycle metadata. Use it to discover model IDs at runtime and to detect upcoming removals via deprecation_date instead of hardcoding assumptions.
Returns a list object whose data array contains one object per model.
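As a rough illustration of detecting upcoming removals, the sketch below scans an already-parsed response for models whose deprecation_date falls within a chosen window. It makes several assumptions that this page does not confirm: that each model object exposes its ID under the key id, that deprecation_date is an ISO 8601 date or timestamp string, and that a 90-day warning window is useful. The helper name models_nearing_removal is hypothetical.

```python
from datetime import date, timedelta


def models_nearing_removal(listing: dict, today: date, within_days: int = 90) -> list[str]:
    """Return IDs of models scheduled for removal on or before today + within_days.

    `listing` is the parsed JSON body of the list-models response.
    """
    cutoff = today + timedelta(days=within_days)
    flagged = []
    for model in listing.get("data", []):
        raw = model.get("deprecation_date")
        if raw is None:
            # null means no removal is scheduled for this model.
            continue
        # Assumption: ISO 8601 string; only the leading YYYY-MM-DD part is used.
        removal = date.fromisoformat(raw[:10])
        if removal <= cutoff:
            # Assumption: the model ID is stored under the key "id".
            flagged.append(model["id"])
    return flagged
```

Passing today explicitly, rather than calling date.today() inside the function, keeps the check deterministic and easy to run in a scheduled job or a test.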
Response
The model ID to pass as model in chat completions, e.g. holo3-1-35b-a3b.
Total context window in tokens.
Hard ceiling on output tokens per request: 4,096 for holo3-1-35b-a3b, 32,768 for holo3-122b-a10b.
["text", "image"] for both Holo models.
Capability flags: reasoning on both models; tools (native function calling) on holo3-1-35b-a3b only.
Accepted sampling fields, e.g. temperature, top_p, top_k, max_tokens, stop, frequency_penalty, presence_penalty, seed.
Per-token USD rates as decimal strings: prompt per input token and completion per output token.
Set when the model is scheduled for removal; null otherwise. After removal, requests to the ID fail with model_not_found.
Whether the model is currently serving traffic (also see is_ready).
Examples
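Because the per-token rates are returned as decimal strings, a cost estimate is best computed with exact decimal arithmetic rather than floats. The following is a minimal sketch, assuming the caller has already extracted the pricing object (the one holding the prompt and completion rates) from a model entry. The helper name estimate_cost and the rates in the usage lines are hypothetical and are not actual Holo prices.

```python
from decimal import Decimal


def estimate_cost(pricing: dict, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from per-token rates.

    `pricing` maps "prompt" and "completion" to decimal strings.
    """
    # Decimal parses the strings exactly, avoiding binary float rounding.
    prompt_rate = Decimal(pricing["prompt"])
    completion_rate = Decimal(pricing["completion"])
    return prompt_rate * prompt_tokens + completion_rate * completion_tokens


# Hypothetical rates, for illustration only.
example_pricing = {"prompt": "0.00000025", "completion": "0.000001"}
cost = estimate_cost(example_pricing, prompt_tokens=1000, completion_tokens=500)
print(f"Estimated cost: ${cost}")
```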
Response (truncated)