If you’re serving your own model using vLLM, start the server, then point the agent at it with the following environment variables:

vllm serve Hcompany/Holo1-7B --port 8082

export HAI_API_KEY=EMPTY
export HAI_MODEL_URL=http://localhost:8082/v1
export HAI_MODEL_NAME=Hcompany/Holo1-7B
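
Once the server is up, you can sanity-check that the endpoint is reachable before launching the agent. This is a quick sketch assuming curl is available; vLLM exposes the standard OpenAI-compatible routes, so /v1/models and /v1/chat/completions should respond:

# List served models (should include Hcompany/Holo1-7B)
curl http://localhost:8082/v1/models

# Minimal chat-completion round trip against the same endpoint
curl http://localhost:8082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Hcompany/Holo1-7B", "messages": [{"role": "user", "content": "ping"}]}'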
You can then run the agent from Surfer-H-CLI with the following command:
./run-on-holo.sh
Here are the available run scripts, depending on where your models are hosted:
  • run-on-holo.sh : Use a remotely hosted Holo1 for navigation and localization.
  • run-on-holo-local.sh : Use one or more locally hosted Holo1 instances (such as the vLLM server above).
  • run-on-holo-val-gpt41.sh : Use a remotely hosted Holo1 for navigation and localization, with GPT-4.1 for validation.
The above scripts call the agent like this, with different configurations for the placeholders:
MODEL="<model name for endpoint>"
TASK="Find a beef Wellington recipe with a rating of 4.7 or higher and at least 200 reviews."
URL="https://www.allrecipes.com"

uv run src/surfer_h_cli/surferh.py \
    --task "$TASK" \
    --url "$URL" \
    --max_n_steps 30 \
    --base_url_localization https://<openai-api-compatible-endpoint-such-as-vllm> \
    --model_name_localization "$MODEL" \
    --temperature_localization 0.0 \
    --base_url_navigation https://<openai-api-compatible-endpoint-such-as-vllm> \
    --model_name_navigation "$MODEL" \
    --temperature_navigation 0.7
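
For instance, with the single local vLLM server started earlier, the placeholders might be filled like this. This is a sketch, not one of the shipped scripts: the endpoint URL and model name are assumptions carried over from the setup above.

#!/usr/bin/env bash
# Sketch: run the agent against one local vLLM endpoint for both roles.
set -euo pipefail

MODEL="Hcompany/Holo1-7B"            # must match the model name served by vLLM
ENDPOINT="http://localhost:8082/v1"  # local vLLM server from the setup above
TASK="Find a beef Wellington recipe with a rating of 4.7 or higher and at least 200 reviews."
URL="https://www.allrecipes.com"

uv run src/surfer_h_cli/surferh.py \
    --task "$TASK" \
    --url "$URL" \
    --max_n_steps 30 \
    --base_url_localization "$ENDPOINT" \
    --model_name_localization "$MODEL" \
    --temperature_localization 0.0 \
    --base_url_navigation "$ENDPOINT" \
    --model_name_navigation "$MODEL" \
    --temperature_navigation 0.7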

Using GPT for Validation

To run run-on-holo-val-gpt41.sh, remember to export your OpenAI API key for validation:
export API_KEY_VALIDATION=${OPENAI_API_KEY}
and point the validation base URL at OpenAI's API:
--base_url_validation https://api.openai.com/v1/
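
Before launching, you can verify that the exported key is accepted. This is a quick check assuming curl is available; /v1/models is OpenAI's standard model-listing endpoint:

# A 200 response with a model list confirms the key works
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"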