# Glossary

Key terms used across the Models API docs, grouped by theme.

## Models and families

| Term                 | Definition                                                                                                                               |
| :------------------- | :--------------------------------------------------------------------------------------------------------------------------------------- |
| Holo3.1              | Latest-generation Vision-Language Model (VLM) family for GUI agents that interact with real digital environments (web, desktop, mobile). |
| Holo3.1 family       | Model sizes from 0.8B to 35B-A3B, spanning on-device to server deployments.                                                              |
| Holo3.1-35B-A3B      | Open-source (Apache 2.0) model variant, available in BF16, FP8, NVFP4, and Q4 GGUF for cloud and local inference.                        |
| Holo3                | Prior generation that Holo3.1 builds on.                                                                                                 |
| Holo2                | Earlier-generation model that Holo3 improved upon.                                                                                       |
| Qwen/Qwen3.5-35B-A3B | Base model used for fine-tuning Holo3.1-35B-A3B.                                                                                         |
| Surfer-H             | Example computer-use agent built on the Holo model family.                                                                               |

## Capabilities and tasks

| Term                             | Definition                                                                                                                                                                                 |
| :------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Vision-Language Model (VLM)      | A model that understands both visual inputs (like UI screens) and text, so it can interpret interfaces and perform actions.                                                                |
| GUI Agents                       | AI agents that operate graphical user interfaces by observing screens, reasoning about them, and executing actions.                                                                        |
| Computer Use (CU)                | The ability of an AI system to perform tasks on a computer, such as navigating interfaces and executing commands.                                                                          |
| Navigation (in AI agents)        | The process of completing tasks through multi-step reasoning and actions across interfaces.                                                                                                |
| Element Localization             | Single-turn vision task: given a screenshot and a text description of a target UI element, return click coordinates. A grounding primitive that can be used inside larger agent harnesses. |
| Action Grounding                 | Connecting model decisions to actual executable actions in an environment.                                                                                                                 |
| Cross-environment Generalization | Ability to perform well across different platforms (web, desktop, mobile), including unseen environments.                                                                                  |
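
To make the Element Localization entry concrete: the task's output is a single `(x, y)` click point, and a harness usually has to map that point onto the real screen before acting on it. The sketch below assumes the model reports pixel coordinates in a downscaled copy of the screenshot; that coordinate convention is an assumption for illustration, not something these docs state, and `Click` and `rescale_click` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Click:
    """A click point in pixel coordinates (hypothetical helper type)."""
    x: int
    y: int


def rescale_click(
    click: Click,
    model_size: tuple[int, int],
    screen_size: tuple[int, int],
) -> Click:
    """Map a click from the image the model saw to the real screen.

    Assumes (for illustration only) that the model returns pixel
    coordinates relative to a resized screenshot of `model_size`.
    """
    model_w, model_h = model_size
    screen_w, screen_h = screen_size
    if min(model_w, model_h, screen_w, screen_h) <= 0:
        raise ValueError("image and screen dimensions must be positive")

    # Scale each axis independently, then clamp to the visible screen.
    x = round(click.x * screen_w / model_w)
    y = round(click.y * screen_h / model_h)
    x = min(max(x, 0), screen_w - 1)
    y = min(max(y, 0), screen_h - 1)
    return Click(x, y)


if __name__ == "__main__":
    # A point at the centre of a 1280x720 image, mapped to a 2560x1440 screen.
    print(rescale_click(Click(640, 360), (1280, 720), (2560, 1440)))
```

Clamping keeps a slightly out-of-range prediction on screen instead of raising, which is one reasonable choice for an agent loop; a stricter harness might reject such points instead.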

## Benchmarks

| Term                  | Definition                                                            |
| :-------------------- | :-------------------------------------------------------------------- |
| OSWorld               | Benchmark evaluating performance in real Ubuntu desktop environments. |
| WebVoyager / WebArena | Benchmarks for testing web navigation and task completion abilities.  |
| AndroidWorld          | Benchmark for evaluating performance on mobile environments.          |

## Training and methods

| Term                          | Definition                                                                            |
| :---------------------------- | :------------------------------------------------------------------------------------ |
| Policy Learning               | Training method where the model learns which actions to take in different situations. |
| Supervised Fine-Tuning (SFT)  | Training stage where the model learns from labeled examples.                          |
| Reinforcement Learning (GRPO) | Training method where the model improves from reward feedback on its actions. GRPO stands for Group Relative Policy Optimization. |
| Synthetic Data                | Artificially generated data used to supplement training.                              |
| Human-Annotated Data          | Data labeled by humans to improve model accuracy.                                     |
| State-of-the-Art (SOTA)       | Performance that is among the best currently available.                               |
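
As a rough illustration of the GRPO entry above: Group Relative Policy Optimization samples several responses for the same prompt and scores each one relative to the rest of its group, rather than against a learned value function. The sketch below shows only that group-relative advantage step, using the population standard deviation (implementations differ on this detail). The function name and `eps` parameter are hypothetical, and nothing here is meant to describe how the Holo models were actually trained.

```python
from statistics import fmean, pstdev


def group_relative_advantages(
    rewards: list[float], eps: float = 1e-6
) -> list[float]:
    """Normalize rewards within one group of sampled responses.

    Each advantage is (reward - group mean) / (group std + eps), so
    responses that beat their group's average get a positive value.
    """
    if not rewards:
        raise ValueError("rewards must contain at least one value")
    mean = fmean(rewards)
    std = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Four sampled responses to one prompt: two succeeded, two failed.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

When every response in a group receives the same reward, all advantages are zero, so that group contributes no learning signal in this formulation.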