List models

curl --request GET \
  --url https://api.hcompany.ai/v1/models \
  --header 'Authorization: Bearer <token>'

{
  "data": [
    {
      "id": "<string>",
      "context_length": 123,
      "max_output_length": 123,
      "input_modalities": ["<string>"],
      "supported_features": ["<string>"],
      "supported_sampling_parameters": ["<string>"],
      "pricing": {},
      "deprecation_date": "<string>",
      "is_active": true
    }
  ]
}

GET /v1/models

Lists the models currently served by the API, with capabilities, limits, pricing, and lifecycle metadata. Use it to discover model IDs at runtime and to detect upcoming removals via deprecation_date instead of hardcoding assumptions. Returns a list object whose data array contains one object per model.
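As a rough sketch of the runtime check described above, the following Python helper scans an already-parsed response for models that carry a deprecation_date. It assumes the body has been decoded into a dict and that deprecation_date, when set, is a string. The function name scheduled_for_removal, the second model entry, and its date are illustrative and not part of the API.

```python
from typing import Any


def scheduled_for_removal(payload: dict[str, Any]) -> dict[str, str]:
    """Map model ID -> deprecation_date for models that have one set."""
    return {
        m["id"]: m["deprecation_date"]
        for m in payload.get("data", [])
        if m.get("deprecation_date") is not None
    }


# Hypothetical payload in the shape of the list object; the second entry
# and its date are invented for illustration.
sample = {
    "object": "list",
    "data": [
        {"id": "holo3-1-35b-a3b", "deprecation_date": None, "is_active": True},
        {"id": "example-old-model", "deprecation_date": "2030-01-01", "is_active": True},
    ],
}

for model_id, date in scheduled_for_removal(sample).items():
    print(f"{model_id} is scheduled for removal on {date}")
```

Running a check like this at startup or on a schedule gives a warning before requests to a removed ID start failing with model_not_found.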

Response

data[].id

string

The model ID to pass as model in chat completions, e.g. holo3-1-35b-a3b.

data[].context_length

integer

Total context window in tokens.

data[].max_output_length

integer

Hard ceiling on output tokens per request: 4,096 for holo3-1-35b-a3b, 32,768 for holo3-122b-a10b.

data[].input_modalities

array

Input types the model accepts: ["text", "image"] for both Holo models.

data[].supported_features

array

Capability flags. reasoning on both models; tools (native function calling) on holo3-1-35b-a3b only.

data[].supported_sampling_parameters

array

Accepted sampling fields, e.g. temperature, top_p, top_k, max_tokens, stop, frequency_penalty, presence_penalty, seed.

data[].pricing

object

Per-token rates in USD, given as decimal strings: prompt is the rate per input token and completion is the rate per output token.

data[].deprecation_date

string

Set when the model is scheduled for removal; null otherwise. After removal, requests to the ID fail with model_not_found.

data[].is_active

boolean

Whether the model is currently serving traffic (also see is_ready).
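The capability and limit fields above can drive request construction. The sketch below, which operates on one parsed model entry, picks active models by feature flag, drops sampling fields the model does not list, and caps max_tokens at max_output_length. The helper names models_with_feature and fit_request are hypothetical, the sampling-parameter list in the sample entry is trimmed, and min_p stands in for an arbitrary unsupported field.

```python
from typing import Any


def models_with_feature(models: list[dict[str, Any]], feature: str) -> list[str]:
    """IDs of active models whose supported_features include `feature`."""
    return [
        m["id"]
        for m in models
        if m.get("is_active") and feature in m.get("supported_features", [])
    ]


def fit_request(model: dict[str, Any], params: dict[str, Any]) -> dict[str, Any]:
    """Drop unsupported sampling fields and cap max_tokens at the model ceiling."""
    allowed = set(model.get("supported_sampling_parameters", []))
    fitted = {k: v for k, v in params.items() if k in allowed}
    if "max_tokens" in fitted:
        fitted["max_tokens"] = min(fitted["max_tokens"], model["max_output_length"])
    return fitted


# Entry based on the documented fields; the sampling list is trimmed.
holo = {
    "id": "holo3-1-35b-a3b",
    "max_output_length": 4096,
    "supported_features": ["reasoning", "tools"],
    "supported_sampling_parameters": ["temperature", "top_p", "max_tokens"],
    "is_active": True,
}

print(models_with_feature([holo], "tools"))
# min_p is a hypothetical field the model does not list, so it is dropped.
print(fit_request(holo, {"temperature": 0.2, "max_tokens": 8192, "min_p": 0.05}))
```

Whether silently dropping a field is preferable to raising an error depends on the caller; this sketch drops it for brevity.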

Examples

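from openai import OpenAI  # assumption: an OpenAI-compatible SDK; the source does not name the client

client = OpenAI(base_url="https://api.hcompany.ai/v1", api_key="<token>")
models = client.models.list()
for m in models.data:
    print(m.id)  # ID to pass as `model` in chat completions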

Response (truncated)

{
  "object": "list",
  "data": [
    {
      "id": "holo3-1-35b-a3b",
      "object": "model",
      "name": "Holo3 1 35B A3B",
      "context_length": 65536,
      "max_output_length": 4096,
      "input_modalities": ["text", "image"],
      "supported_features": ["reasoning", "tools"],
      "supported_sampling_parameters": ["temperature", "top_p", "top_k", "max_tokens", "stop", "frequency_penalty", "presence_penalty", "seed"],
      "pricing": {"prompt": "0.00000025", "completion": "0.0000018"},
      "is_active": true,
      "deprecation_date": null
    }
  ]
}
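Because pricing is returned as decimal strings, a cost estimate can be computed without floating-point rounding. The following is a minimal sketch using the rates from the response above; the function name estimate_cost and the token counts are made up for illustration.

```python
from decimal import Decimal


def estimate_cost(pricing: dict[str, str], prompt_tokens: int, completion_tokens: int) -> Decimal:
    """USD cost of one request from per-token rates given as decimal strings."""
    return (
        Decimal(pricing["prompt"]) * prompt_tokens
        + Decimal(pricing["completion"]) * completion_tokens
    )


# Rates taken from the truncated response above; token counts are invented.
pricing = {"prompt": "0.00000025", "completion": "0.0000018"}
cost = estimate_cost(pricing, prompt_tokens=10_000, completion_tokens=1_000)
print(cost)
```

Parsing the strings with Decimal rather than float keeps very small per-token rates exact when they are multiplied by large token counts.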
