Skip to main content
The H Company Models API gives developers access to the Holo3.1 Vision-Language Models. Through a single OpenAI-compatible API, you can send text, images, or both, and receive structured outputs. The API supports web, desktop, and mobile automation, multimodal agents, and UI testing, whether you are integrating into a product, running research experiments, or building automation workflows.
  • Multimodal input: send text and images together in one request.
  • Structured outputs: receive responses you can act on directly for UI automation and navigation.
You can start building with Holo3.1-35B-A3B today. The API gives you hosted, low-latency access to these models, so you do not have to run the infrastructure yourself.

Two ways to use Holo

Agent loop

Multi-turn control for an autonomous agent.

Element localization

Get click coordinates from a screenshot.

Get started

Quickstart

Run Holo in five minutes.

API reference

Endpoint, models, pricing, and parameters.

Model cards and benchmarks

For model specs, weights, and performance, see the model cards on Hugging Face and the launch blog posts.

Hugging Face

Model cards, weights, and quantized builds.

Holo3.1 release

Mobile, function calling, and local inference.

Holo3 benchmarks

78.85% on OSWorld-Verified.
Prefer to try the models without writing code? HoloTab runs Holo directly in your browser, no setup required.