This guide provides an overview of how to deploy a Holo model. You must deploy a Holo model before launching the Surfer-H agent via the Surfer-H-CLI.

Methods

There are several methods and contexts in which to deploy Holo1:

| Method | Prerequisites | Notes |
| --- | --- | --- |
| Local vLLM setup | Install vLLM / machine with GPU | Uses vLLM to download Holo1 from Hugging Face. |
| Local Docker container | Install Docker / machine with GPU | Uses the `vllm/vllm-openai:v0.9.1` image. |
| Amazon SageMaker | Subscribe to Holo1 models on AWS Marketplace | Deploys the Holo1 model via a prebuilt notebook; no manual or complicated setup required. |
For more information, check out the H.AI Cookbook.
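The two local methods from the table above can be sketched as follows. This is a minimal sketch, not the project's exact launch commands: the port (8000), host binding, and Hugging Face cache path are assumptions you should adapt to your machine.

```shell
# Option A: serve Holo1 directly with a locally installed vLLM.
# Downloads Hcompany/Holo1-7B from Hugging Face on first run.
vllm serve Hcompany/Holo1-7B --host 0.0.0.0 --port 8000

# Option B: run the same OpenAI-compatible server via Docker,
# using the image from the table and mounting the HF cache so
# the model weights persist across container restarts.
docker run --gpus all --ipc=host -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-openai:v0.9.1 \
  --model Hcompany/Holo1-7B
```

Either way, the server exposes an OpenAI-compatible API on the chosen port, which is what `HAI_MODEL_URL` should point at in the environment setup below.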

Environment setup

Set your environment variables using one of the two methods outlined below.

Option 1: Create a `.env` file at the root of this repo:
HAI_API_KEY=your_hai_api_key_here
HAI_MODEL_URL=https://your-api-endpoint-url/
HAI_MODEL_NAME=hosted model name, for example Hcompany/Holo1-7B
OPENAI_API_KEY=your_openai_api_key_here
Note: Make sure your `.env` file ends with a blank (empty) line. This helps ensure the OPENAI_API_KEY and other variables are correctly loaded by the bash scripts.

Option 2: Export in your shell profile (for global setup)
Add these lines to your .zshrc or .bashrc:
export HAI_API_KEY=...
export HAI_MODEL_URL=...
export HAI_MODEL_NAME=...
export OPENAI_API_KEY=...
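Whichever option you use, you can sanity-check that the variables are actually visible to the shell before launching the agent. This is a hypothetical helper snippet (the `MISSING_VARS` variable is illustrative, not part of the project); HAI_API_KEY is skipped here because it may be deliberately empty for a local vLLM server.

```shell
# Collect any required variables that are missing from the environment.
MISSING_VARS=""
for var in HAI_MODEL_URL HAI_MODEL_NAME OPENAI_API_KEY; do
  [ -n "$(printenv "$var")" ] || MISSING_VARS="$MISSING_VARS $var"
done

# Report what is missing (empty means everything is set).
if [ -n "$MISSING_VARS" ]; then
  echo "Missing:$MISSING_VARS" >&2
fi
```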
When running vLLM locally, you can leave HAI_API_KEY empty (or set it to any value) and set HAI_MODEL_URL to http://localhost:PORT, where PORT is the port your local vLLM instance is listening on.
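For example, a local-vLLM configuration might look like the following. The port (8000) and the `EMPTY` placeholder key are assumptions; use the port your own vLLM instance runs on.

```shell
# Example settings for a vLLM server running locally on port 8000.
# The API key value is a placeholder; local vLLM does not check it.
export HAI_API_KEY=EMPTY
export HAI_MODEL_URL=http://localhost:8000
export HAI_MODEL_NAME=Hcompany/Holo1-7B
```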