API reference - H Tech Hub

The Models API is OpenAI-compatible: point the official OpenAI client (or any compatible library) at H Company’s endpoint. You opt into Holo-specific behavior (structured outputs, reasoning, and the coordinate convention) through a few extra request fields and conventions documented here.

Endpoints

POST /chat/completions

The inference endpoint: parameters, response fields, streaming.

GET /models

Discover served models, limits, pricing, and deprecation dates at runtime.

Endpoint and auth


Base URL	`https://api.hcompany.ai/v1/`
Auth	`Authorization: Bearer $HAI_API_KEY` (handled by the OpenAI client)
Keys	Create one on Portal-H

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.hcompany.ai/v1/",
    api_key=os.environ["HAI_API_KEY"],
)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.hcompany.ai/v1/",
  apiKey: process.env.HAI_API_KEY,
});

Model IDs, per-model limits, pricing, and tiers live on the Models page.

The two response channels

Holo returns two streams on every call:

choices[].message.content

string

The action: the structured JSON object (structured-output mode) or the assistant text.

choices[].message.reasoning

string

The thinking trace, when thinking is enabled. Read it for visibility; do not feed it back into the conversation.

The thinking trace is dropped between turns by the chat template Holo inherits. Anything the model must remember has to flow through content. See the Agent loop for how to carry state forward.

Conventions

Coordinates in [0, 1000]

Holo returns click positions as integers normalized to the image you sent. Scale back to pixels with the image’s own dimensions. Origin is top-left. Send and scale against the same image bytes: any resize, crop, or DPI mismatch will misplace the point.

Image budget

Keep at most the last 3 screenshots in context for best accuracy, even though a request accepts up to 5 images. See the trim helper in the agent loop.

Output formats

Structured outputs work on both models; native function calling (tools / tool_calls) is holo3-1-35b-a3b only. Pick one and stay in it: Agent loop.

Holo-specific fields and the OpenAI SDKs

structured_outputs and chat_template_kwargs are top-level body fields on the wire. The OpenAI SDKs do not know them, so pass them via extra_body (Python) or an untyped spread (TypeScript); the SDK merges them into the request body.

Next steps

Chat completions

Full parameter and response reference.

Models

IDs, limits, pricing, lifecycle.

Agent loop

How to use Holo in your computer-use harness.

Chat completions

​Endpoints

POST /chat/completions

GET /models

​Endpoint and auth

​The two response channels

​Conventions

​Next steps

Chat completions

Models

Agent loop

Endpoints

Endpoint and auth

The two response channels

Conventions

Next steps