This guide provides an overview of how to deploy a Holo model. You must deploy a Holo model before launching the Surfer-H agent via the Surfer-H-CLI.

Methods

There are several methods and contexts in which to deploy Holo1:

| Method | Prerequisites | Notes |
| --- | --- | --- |
| Local vLLM setup | Install vLLM / machine with GPU | Uses vLLM to download Holo1 from Hugging Face. |
| Local Docker container | Install Docker / machine with GPU | Uses the `vllm/vllm-openai:v0.9.1` image. |
| Amazon SageMaker | Subscribe to Holo1 models on AWS Marketplace | Deploys the Holo1 model via a prebuilt notebook; no manual or complicated setup required. |
For more information, check out the H.AI Cookbook.
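The two local methods from the table above can be sketched as follows. This is a minimal sketch, not the project's exact launch commands: the port (8000), host binding, and Hugging Face cache path are assumptions you should adapt to your machine.

```shell
# Option A: serve Holo1 directly with a locally installed vLLM.
# Downloads Hcompany/Holo1-7B from Hugging Face on first run.
vllm serve Hcompany/Holo1-7B --host 0.0.0.0 --port 8000

# Option B: run the same OpenAI-compatible server via Docker,
# using the image from the table and mounting the HF cache so
# the model weights persist across container restarts.
docker run --gpus all --ipc=host -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:v0.9.1 \
  --model Hcompany/Holo1-7B
```

Either way, the server exposes an OpenAI-compatible API on the chosen port, which is what `HAI_MODEL_URL` should point at in the environment setup below.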

Environment setup

Set your environment variables using one of the two methods outlined below.

Option 1: Create a `.env` file at the root of this repo:
HAI_API_KEY=your_hai_api_key_here
HAI_MODEL_URL=https://your-api-endpoint-url/
HAI_MODEL_NAME=hosted model name, for example Hcompany/Holo1-7B
OPENAI_API_KEY=your_openai_api_key_here
Note: Make sure your `.env` file ends with a blank (empty) line. This helps ensure the OPENAI_API_KEY and other variables are correctly loaded by the bash scripts.

Option 2: Export in your shell profile (for global setup)
Add these lines to your .zshrc or .bashrc:
export HAI_API_KEY=...
export HAI_MODEL_URL=...
export HAI_MODEL_NAME=...
export OPENAI_API_KEY=...
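Whichever option you use, you can sanity-check that the variables are actually visible to the shell before launching the agent. This is a hypothetical helper snippet (the `MISSING_VARS` variable is illustrative, not part of the project); HAI_API_KEY is skipped here because it may be deliberately empty for a local vLLM server.

```shell
# Collect any required variables that are missing from the environment.
MISSING_VARS=""
for var in HAI_MODEL_URL HAI_MODEL_NAME OPENAI_API_KEY; do
  [ -n "$(printenv "$var")" ] || MISSING_VARS="$MISSING_VARS $var"
done

# Report what is missing (empty means everything is set).
if [ -n "$MISSING_VARS" ]; then
  echo "Missing:$MISSING_VARS" >&2
fi
```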
When running vLLM locally, you can leave HAI_API_KEY empty (or set it to any value) and set HAI_MODEL_URL to http://localhost:PORT, where PORT is the port your local vLLM instance is listening on.
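For example, a local-vLLM configuration might look like the following. The port (8000) and the `EMPTY` placeholder key are assumptions; use the port your own vLLM instance runs on.

```shell
# Example settings for a vLLM server running locally on port 8000.
# The API key value is a placeholder; local vLLM does not check it.
export HAI_API_KEY=EMPTY
export HAI_MODEL_URL=http://localhost:8000
export HAI_MODEL_NAME=Hcompany/Holo1-7B
```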