hb.LLM is the HeavenBase Python client for chat, streaming, embeddings, image generation, mocks, and gateway routing. It resolves a preset, model, provider, and gateway from the shared heavenbase.llm config, then materializes the request format selected by the gateway. The defaults are preset="system", default_provider="openrouter", and gateway="openai".
export OPENROUTER_API_KEY="..."

import heavenbase as hb

llm = hb.LLM()
text = llm.chat("Reply with exactly: hb-ok")
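
Because the default provider is openrouter, the quick start only works when OPENROUTER_API_KEY is exported. A minimal sketch of a fail-fast check using only the standard library; `require_env` is a hypothetical helper for illustration and is not part of HeavenBase:

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value or raise a clear error.

    Hypothetical helper for illustration; not part of HeavenBase.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating hb.LLM()")
    return value
```

Calling `require_env("OPENROUTER_API_KEY")` before `hb.LLM()` turns a missing key into an immediate, readable error instead of a failed request.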

Resolution model

  • preset: a named shortcut such as system, chat, reason, coder, embed, imagen, mock, or custom.
  • model: a canonical model key or alias such as ds-flash, sonnet, gpt, or gpt-image-mini.
  • provider: where the model is served. Normal presets inherit heavenbase.llm.default_provider; explicit provider= and preset-level provider pins override it.
  • gateway: how the request is transported. The default openai gateway uses the OpenAI Python SDK against OpenAI-compatible endpoints; anthropic uses the official Anthropic SDK and native Messages payloads.
hb.LLM()                         # preset="system"
hb.LLM(model="ds-flash")
hb.LLM(model="ds-flash", provider="deepseek")
hb.LLM(preset="reason")
hb.LLM(preset="mock")
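
The resolution order described above can be pictured as a chain of lookups with fallbacks. The following is a minimal sketch, not HeavenBase's implementation: `CONFIG`, `Resolved`, `resolve`, and the `default_gateway` key are hypothetical names, and it assumes an explicit provider= argument takes precedence over a preset-level pin.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, trimmed-down stand-in for the shared heavenbase.llm config.
CONFIG = {
    "default_provider": "openrouter",
    "default_gateway": "openai",
    "presets": {
        "system": {"model": "ds-flash"},
        "reason": {"model": "ds-pro"},
        "mock": {"model": "mock", "provider": "mock"},  # preset-level provider pin
    },
}


@dataclass(frozen=True)
class Resolved:
    preset: str
    model: str
    provider: str
    gateway: str


def resolve(preset: str = "system", model: Optional[str] = None,
            provider: Optional[str] = None, gateway: Optional[str] = None) -> Resolved:
    """Resolve preset -> model -> provider -> gateway (illustrative only)."""
    entry = CONFIG["presets"][preset]
    return Resolved(
        preset=preset,
        model=model or entry["model"],
        # Assumed order: explicit provider=, then a preset pin, then the default.
        provider=provider or entry.get("provider") or CONFIG["default_provider"],
        gateway=gateway or CONFIG["default_gateway"],
    )
```

With this sketch, `resolve()` yields the system preset on openrouter through the openai gateway, while `resolve(preset="mock")` keeps the pinned mock provider regardless of the default.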
Unknown keyword arguments become provider request defaults:
llm = hb.LLM(model="ds-flash", max_tokens=256, temperature=0)
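
One way to picture this behavior is a split between arguments the constructor consumes and everything else, which is kept as request defaults. This is a sketch under stated assumptions: `KNOWN_OPTIONS` and `split_kwargs` are hypothetical, and the real set of recognized option names may differ.

```python
from typing import Any, Dict, Tuple

# Assumption: these are the names the constructor itself consumes.
KNOWN_OPTIONS = {"preset", "model", "provider", "gateway", "base_url"}


def split_kwargs(**kwargs: Any) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate constructor options from request defaults (illustrative only)."""
    options = {k: v for k, v in kwargs.items() if k in KNOWN_OPTIONS}
    request_defaults = {k: v for k, v in kwargs.items() if k not in KNOWN_OPTIONS}
    return options, request_defaults


# Mirrors the call above: model is an option, the rest become request defaults.
options, request_defaults = split_kwargs(model="ds-flash", max_tokens=256, temperature=0)
```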

Presets

Presets use persistable model aliases so user configs stay compact and readable. Most production presets do not pin a provider; changing heavenbase.llm.default_provider switches them together. Local and special presets such as embed-local, mock, and custom keep their own provider because they require a specific runtime.
| Preset | Model alias | Thinking default | Description |
| --- | --- | --- | --- |
| system | ds-flash | disabled | Default lightweight system LLM for short orchestration calls. |
| tiny | gemma | disabled | Tiny chat model for low-latency offline work. |
| chat | ds-flash | disabled | General chat preset for fast non-thinking answers. |
| chat-pro | ds-pro | disabled | Stronger chat preset that still avoids reasoning by default. |
| reason | ds-pro | enabled | Reasoning preset for harder tasks that should expose thinking when supported. |
| reason-pro | gpt | enabled | Higher-capability reasoning preset backed by the default GPT alias. |
| worker | ds-flash | disabled | Background worker preset for deterministic non-thinking utility calls. |
| coder | sonnet | enabled | Coding preset with thinking enabled for multi-step implementation work. |
| coder-pro | opus | enabled | Highest-end coding preset for deep implementation and review work. |
| embed | gpt-embedding-small | n/a | Default embedding preset. |
| embed-local | embeddinggemma | n/a | Local embedding preset. |
| imagen | gpt-image-mini | n/a | Default image-generation preset. |
| mock | mock | offline | Non-LLM deterministic mock preset for tests and demos. |
| custom | custom | runtime supplied | Runtime-supplied OpenAI-compatible provider preset. |
Preset thinking defaults use the canonical think option. HeavenBase applies gateway-level control through extra_body.reasoning for both think=True and think=False on the OpenAI-compatible gateways (openai, portkey, bifrost, and litellm). The anthropic gateway maps think=True to Claude Messages adaptive thinking with summarized display, maps reasoning_effort to Anthropic effort, and normalizes native thinking blocks back to the think include field. Provider-specific local-server options can still be passed explicitly with extra_body.
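
The gateway-level control for OpenAI-compatible gateways can be sketched as a small payload builder. The exact payload HeavenBase sends is not documented here: the `{"enabled": ...}` shape and the `reasoning_request_extras` function are assumptions for illustration, and the anthropic mapping is deliberately left out.

```python
from typing import Any, Dict

OPENAI_COMPATIBLE_GATEWAYS = {"openai", "portkey", "bifrost", "litellm"}


def reasoning_request_extras(gateway: str, think: bool) -> Dict[str, Any]:
    """Build gateway-level reasoning control (illustrative only).

    Assumption: the payload is {"reasoning": {"enabled": <bool>}}.
    """
    if gateway not in OPENAI_COMPATIBLE_GATEWAYS:
        # Other gateways (for example anthropic) use their own native mapping.
        return {}
    # Sent for both think=True and think=False, so "off" is explicit too.
    return {"extra_body": {"reasoning": {"enabled": think}}}
```

Sending the field for think=False as well as think=True means the provider's own default never silently re-enables reasoning for a non-thinking preset.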

Gateway and endpoint decision

Endpoint selection stays inside provider and gateway config. The resolution path is preset -> model -> provider -> gateway; the final URL comes from provider.base_url, gateway.base_url, or a runtime base_url=... override. Do not add a separate endpoint layer unless a real workload needs runtime endpoint policies.

Choose the provider and gateway pair by purpose:

  • provider="anthropic", gateway="openai": quick Claude compatibility checks.
  • provider="anthropic", gateway="anthropic": the native Claude Messages format.
  • gateway="portkey": routing, observability, or policy controls.

For GLM tool-call validation, keep native OpenAI JSON tools as the default. Start with glm-flash through OpenRouter via Portkey when you want a broadly compatible route.

Local live checks can use simple provider-key, base URL, and proxy exports from ~/.bashrc, including HTTP_PROXY, HTTPS_PROXY, and NO_PROXY. Use those exports when VPN/TUN mode changes GLM, Anthropic, or OpenRouter reachability.
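
The three possible sources of the final URL can be sketched as a simple fallback chain. The precedence shown (runtime override, then gateway, then provider) is an assumption, since the text lists the sources without ordering them; `pick_base_url` and the example URLs are hypothetical.

```python
from typing import Optional


def pick_base_url(provider_base_url: Optional[str] = None,
                  gateway_base_url: Optional[str] = None,
                  runtime_base_url: Optional[str] = None) -> str:
    """Pick the final endpoint URL (illustrative only).

    Assumed precedence: runtime override, then gateway, then provider.
    """
    for candidate in (runtime_base_url, gateway_base_url, provider_base_url):
        if candidate:
            # Normalize so later path joins do not produce a double slash.
            return candidate.rstrip("/")
    raise ValueError("no base_url configured for this provider/gateway pair")
```

Keeping this decision in one function is consistent with the advice above: there is no separate endpoint layer, only three optional values and one rule for choosing among them.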

Curated model catalog

Online bundled models include OpenRouter identifiers and direct-provider identifiers where available. The root default_provider chooses which identifier is used unless a call or preset pins another provider. Local-only entries, such as embeddinggemma, list only local providers.
| Canonical model | Aliases |
| --- | --- |
| deepseek-v4-flash | ds-flash, deepseek-flash, deepseek-chat |
| deepseek-reasoner | ds-flash-thinking, deepseek-flash-thinking |
| deepseek-v4-pro | ds, ds-pro, dsv4, dsv4-pro, deepseek, deepseek-v4, deepseek-pro |
| gpt-5.4-nano | gpt-nano, 5.4-nano |
| gpt-5.5 | gpt, 5.5 |
| gpt-5.5-pro | gpt-pro, 5.5-pro |
| claude-haiku-4-5 | haiku, haiku-4.5 |
| claude-sonnet-4-6 | sonnet, sonnet-4.6 |
| claude-opus-4-7 | opus, opus-4.7 |
| gemini-3.1-flash-lite | gemini-flash-lite, gemini-lite |
| gemini-3-flash-preview | gemini-flash |
| gemini-3.1-pro-preview | gemini-pro |
| kimi-k2.6 | kimi, k2.6 |
| glm-5.1 | glm |
| glm-4.7-flash | glm-flash, glm-4.7 |
| gemma-4-26b-a4b-it | gemma4, gemma4-26b, gemma4-26b-a4b, gemma4-26b-a4b-it, gemma, gemma-26b, gemma-26b-a4b, gemma-26b-a4b-it |
| qwen3.6-flash | qwen3.6, qwen3.6-flash, qwen3.6-35b, qwen3.6-35b-a3b, qwen, qwen-flash, qwen-35b, qwen-35b-a3b |
| embeddinggemma | embeddinggemma-300m |
| text-embedding-3-small | gpt-embedding, gpt-embedding-small, text-embedding-small |
| embed-v4.0 | cohere, cohere-embedding, cohere-embedding-v4, embed-v4 |
| voyage-4-lite | voyage, voyage-lite |
| gpt-5-image-mini | gpt-image-mini, image-mini |
| gpt-5.4-image-2 | gpt-image, gpt-image-2, image-2 |
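
Alias lookup against this catalog can be sketched as a reverse index from every accepted name to its canonical model. This is an illustration, not HeavenBase's code: `CATALOG` copies only a few rows from the table, and `build_alias_index` and `canonical_model` are hypothetical names.

```python
from typing import Dict, List

# A few rows copied from the catalog: canonical model -> aliases.
CATALOG: Dict[str, List[str]] = {
    "deepseek-v4-flash": ["ds-flash", "deepseek-flash", "deepseek-chat"],
    "claude-sonnet-4-6": ["sonnet", "sonnet-4.6"],
    "gpt-5.5": ["gpt", "5.5"],
    "glm-4.7-flash": ["glm-flash", "glm-4.7"],
}


def build_alias_index(catalog: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every canonical name and alias to its canonical model."""
    index: Dict[str, str] = {}
    for canonical, aliases in catalog.items():
        for name in [canonical, *aliases]:
            # Reject a name that would point at two different models.
            if index.get(name, canonical) != canonical:
                raise ValueError(f"alias {name!r} maps to two models")
            index[name] = canonical
    return index


ALIAS_INDEX = build_alias_index(CATALOG)


def canonical_model(name: str) -> str:
    """Resolve an alias or canonical key to the canonical model key."""
    try:
        return ALIAS_INDEX[name]
    except KeyError:
        raise KeyError(f"unknown model or alias: {name!r}") from None
```

Building the index once and rejecting duplicate aliases up front keeps each lookup a single dictionary access and surfaces catalog conflicts at load time rather than at call time.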
mock and custom are utility model entries for offline tests and runtime-supplied OpenAI-compatible providers. Built-in provider configs include OpenRouter, OpenAI, Anthropic, Gemini, Grok, DeepSeek, Moonshot, Z.ai, Minimax, Cohere, Voyage, DashScope, Ollama, LM Studio, vLLM, mock, and custom. embed-v4.0 and voyage-4-lite are embedding-only catalog entries served by their own providers (not OpenRouter).
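
How default_provider selects among per-provider identifiers can be sketched as a nested lookup. Everything here is hypothetical: `MODEL_IDS`, `provider_model_id`, the angle-bracket placeholder identifiers, the choice of ollama as the local provider for embeddinggemma, and the decision to raise when the chosen provider does not serve the model.

```python
from typing import Dict, Optional, Tuple

# Placeholder identifiers; the real per-provider strings are not shown here.
MODEL_IDS: Dict[str, Dict[str, str]] = {
    "deepseek-v4-flash": {"openrouter": "<openrouter-id>", "deepseek": "<deepseek-id>"},
    "embeddinggemma": {"ollama": "<ollama-id>"},  # local-only entry (assumed provider)
}


def provider_model_id(model: str, provider: Optional[str] = None,
                      default_provider: str = "openrouter") -> Tuple[str, str]:
    """Return (provider, identifier) for a canonical model (illustrative only)."""
    chosen = provider or default_provider
    identifiers = MODEL_IDS[model]
    if chosen not in identifiers:
        # Assumed behavior: fail loudly instead of silently switching providers.
        raise LookupError(f"{model} is not served by provider {chosen!r}")
    return chosen, identifiers[chosen]
```

In this sketch a local-only entry such as embeddinggemma must be requested with its local provider, which matches the reason the embed-local preset keeps its own provider pin.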