Skip to main content

Documentation Index

Fetch the complete documentation index at: https://ahvn.top/llms.txt

Use this file to discover all available pages before exploring further.

Provider chooses where a model is served. Gateway chooses how HeavenBase sends the request.

1. Default route

For new projects, set one OpenRouter key and use presets:
export OPENROUTER_API_KEY="sk-or-v1-..."
import heavenbase as hb

llm = hb.LLM(preset="chat", cache=False)
text = llm.chat("Reply with exactly: hb-ok", max_tokens=8)
preset="chat" resolves to deepseek-v4-flash by default. The bundled default provider is openrouter, and the default gateway is openai, which means HeavenBase uses the OpenAI Python SDK against the provider’s OpenAI-compatible endpoint.

2. Provider examples

Use default_provider when most calls should use one provider:
export OPENROUTER_API_KEY="sk-or-v1-..."
hb cfg set heavenbase.llm.default_provider openrouter
Use an explicit provider for one call when you are comparing routes:
fast = hb.LLM(model="ds-flash", provider="deepseek", cache=False)
claude = hb.LLM(model="haiku", provider="anthropic", cache=False)
glm = hb.LLM(model="glm-flash", provider="openrouter", cache=False)

3. Gateway examples

Use gateway="portkey" when you want routing policy, observability, or gateway-side controls:
export PORTKEY_API_KEY="..."
llm = hb.LLM(preset="chat", gateway="portkey", cache=False)
For short live route checks, this is the recommended cheap route:
llm = hb.LLM(preset="chat", gateway="portkey", cache=False, temperature=0)
Other gateway choices:
GatewayUse when
openaiYou want direct OpenAI-compatible SDK calls. This is the default.
anthropicYou want the official Anthropic SDK and native Messages payloads.
portkeyYou want routing, policy, observability, or hosted gateway behavior.
litellmYou already standardize provider routing through LiteLLM model names.
bifrostYou run a Bifrost-compatible OpenAI endpoint.
mockYou need offline deterministic tests.

4. Anthropic gateway decision

HeavenBase supports two Anthropic routes. Use the OpenAI SDK compatibility endpoint for quick Claude checks and comparisons with other OpenAI-compatible providers:
llm = hb.LLM(model="haiku", provider="anthropic", gateway="openai", cache=False)
print(llm.spec.materialize()["base_url"])
Use the native gateway when you want the official Anthropic Python SDK and the Claude Messages format:
llm = hb.LLM(model="haiku", provider="anthropic", gateway="anthropic", cache=False)
print(llm.spec.materialize()["base_url"])
Do not add a separate endpoint layer unless a real workload needs runtime endpoint policies. Current routing is preset -> model -> provider -> gateway, and the final URL comes from provider config, gateway config, or a runtime base_url=... override. The native gateway converts HeavenBase’s internal OpenAI-style message/tool history into Anthropic Messages payloads, maps canonical think and reasoning_effort controls to Anthropic thinking fields, then normalizes the response back to the usual include fields. Current proxy support fits that model:
  • Portkey supports OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages gateway formats, and its Anthropic integration supports native /messages, the Anthropic SDK, prompt caching, extended thinking, files, and web search.
  • LiteLLM supports Anthropic through the anthropic/ provider route, chat-completions style calls, and /v1/messages passthrough.
  • Bifrost supports Anthropic through OpenAI-compatible chat/responses conversion to /v1/messages and through provider-compatible Anthropic SDK endpoints.
If a workload eventually needs runtime payload-family switching, add a gateway capability such as format="openai" or format="anthropic" before adding a new endpoint abstraction. Use materialize() when you need route evidence without live Claude spend:
runtime = hb.LLM(model="haiku", provider="anthropic", gateway="anthropic").spec.materialize()
print(runtime["base_url"])

5. GLM and Z.ai route checks

For GLM tool-call validation, keep native OpenAI JSON tools as the default. Start with glm-flash through OpenRouter via Portkey when you want a broadly compatible route.
runtime = hb.LLM(model="glm-flash", provider="zai", gateway="portkey").spec.materialize()
print(runtime["model"])

6. Proxy and VPN notes

Local live checks can use provider keys, base URL overrides, and simple proxy exports from ~/.bashrc, including HTTP_PROXY, HTTPS_PROXY, and NO_PROXY. Use those exports when VPN/TUN mode changes GLM, Anthropic, or OpenRouter reachability. You can also pass proxy settings directly:
llm = hb.LLM(
    preset="chat",
    http_proxy="http://127.0.0.1:7890",
    https_proxy="http://127.0.0.1:7890",
    no_proxy="localhost,127.0.0.1",
)

Further Exploration

Related resources:
  • First LLM - Configure keys and run the first chat
  • LLM overview - Presets, models, and endpoint decision