Defaults get you started. This page is for when you need to see the wiring.
hb.LLM(preset="chat") and the default OpenAI-compatible gateway. When you route through Portkey, cache deterministic calls, export SDK clients, or generate images, the same resolution model still applies — preset, model, provider, gateway.
1. Gateways in Depth
The default gateway is openai. It uses the OpenAI Python SDK against the provider’s OpenAI-compatible endpoint, so no Portkey, LiteLLM, Bifrost, or native Anthropic SDK process is required.
| Gateway | Use when |
|---|---|
| openai | You want the OpenAI Python SDK against an OpenAI-compatible endpoint. This is the default. |
| anthropic | You want the official Anthropic Python SDK and native Messages payloads. |
| portkey | You want gateway-level routing, observability, caching, or policy controls. |
| bifrost | You run a Bifrost-compatible gateway and want provider-prefixed model routing. |
| litellm | You already standardize provider routing through LiteLLM’s Python gateway. |
| mock | You need deterministic offline behavior for tests and demos. |
- litellm sends provider-prefixed model IDs such as deepseek/deepseek-v4-flash.
- bifrost sends provider-prefixed model IDs and uses BIFROST_BASE_URL, defaulting to http://localhost:8080/v1.
- mock stays offline and uses the built-in mock adapter.
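The routing notes above can be sketched in plain Python. This is a minimal illustration, not HeavenBase's implementation: the helper names `prefixed_model_id` and `bifrost_base_url` are hypothetical, and only the BIFROST_BASE_URL variable, its default value, and the provider/model ID shape come from the text.

```python
import os

# Default taken from the text above.
DEFAULT_BIFROST_BASE_URL = "http://localhost:8080/v1"


def prefixed_model_id(provider: str, model: str) -> str:
    """Build a provider-prefixed model ID such as 'deepseek/deepseek-v4-flash'."""
    # Avoid double-prefixing when the caller already passed a prefixed ID.
    if model.startswith(f"{provider}/"):
        return model
    return f"{provider}/{model}"


def bifrost_base_url(env=None) -> str:
    """Read BIFROST_BASE_URL, falling back to the documented local default."""
    env = os.environ if env is None else env
    return env.get("BIFROST_BASE_URL") or DEFAULT_BIFROST_BASE_URL
```

Passing a mapping as `env` keeps the lookup testable without touching the real process environment.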
base_url config unless a workload needs payload-family policy independent from those layers.
Temporary upstream limitations are raised explicitly:
- gateway="portkey" with provider="openrouter" for embeddings is blocked until Portkey Gateway support lands.
- Bifrost image generation is blocked while upstream image support is unresolved.
- The native anthropic gateway raises for embeddings and image generation because Claude Messages does not provide those operations.
hb.LLM falls back to the openai gateway.
For step-by-step gateway setup, see First LLM §5.2 and LLM providers.
2. Response Caching
HeavenBase caches normalized LLM responses in a dedicated llm-cache workspace backed by SQLite entities. Three namespaces exist: text for chat completions, embedding for vectors, and image for generated images.
Caching is enabled by default (heavenbase.llm.cache.enabled: true). Disable it when you need a fresh provider call:
cache=False, cache=True, or a custom cache config dict. Chat cache skips tool loops automatically: when a call includes executable tools, the text cache is disabled for that call.
The default policy is deterministic. Text and image cache writes require deterministic request args: temperature=0 (or unset with a fixed seed), default top_p/top_k, and for images a set seed. Stochastic calls without a seed bypass cache reads and writes.
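One way to read the deterministic policy is as a predicate over the request arguments. The sketch below is an interpretation, not the library's code: `is_deterministic` is a hypothetical name, "default top_p/top_k" is modeled as "not passed", and a nonzero temperature combined with a seed is conservatively treated as non-cacheable because the text does not say how that case is handled.

```python
from typing import Any, Mapping


def is_deterministic(namespace: str, args: Mapping[str, Any]) -> bool:
    """Return True when a text or image request looks safe to cache."""
    temperature = args.get("temperature")
    seed = args.get("seed")

    # temperature=0, or temperature unset together with a fixed seed.
    temperature_ok = temperature == 0 or (temperature is None and seed is not None)

    # top_p / top_k must stay at their defaults (modeled here as "not passed").
    sampling_ok = args.get("top_p") is None and args.get("top_k") is None

    # Image requests additionally need an explicit seed.
    seed_ok = seed is not None if namespace == "image" else True

    return temperature_ok and sampling_ok and seed_ok
```

A caller would skip both the cache read and the cache write whenever this returns False.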
Configure namespaces under heavenbase.llm.cache.namespaces:
3. Client Reuse and SDK Exports
Every resolved LLMSpec can produce deterministic hash keys:
hash_key() includes the resolved model, provider, gateway mode, request defaults, and materialized resolved values. client_key() includes only gateway client construction fields: gateway, API key, base URL, headers, timeout, and retries.
SDK adapters for openai, portkey, bifrost, and anthropic keep an in-memory cache keyed by client_key(), so duplicated LLM instances reuse the same SDK client.
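The split between the two keys, and the client reuse it enables, can be illustrated with the standard library. This is a simplified sketch under stated assumptions: specs are plain dicts, `hash_key` here hashes the whole dict rather than the exact field list above, the field names in `CLIENT_FIELDS` are guesses at spelling, and `get_client` and its `factory` argument are hypothetical.

```python
import hashlib
import json

# Construction-only fields, following the list in the text (names assumed).
CLIENT_FIELDS = ("gateway", "api_key", "base_url", "headers", "timeout", "retries")


def _digest(payload: dict) -> str:
    # Canonical JSON (sorted keys) so equal specs always hash identically.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def hash_key(spec: dict) -> str:
    """Key over the full resolved spec (simplified for this sketch)."""
    return _digest(spec)


def client_key(spec: dict) -> str:
    """Key over client construction fields only."""
    return _digest({name: spec.get(name) for name in CLIENT_FIELDS})


_clients: dict = {}


def get_client(spec: dict, factory):
    """Reuse one client per client_key; build it on first use."""
    key = client_key(spec)
    if key not in _clients:
        _clients[key] = factory(spec)
    return _clients[key]
```

Two specs that differ only in model or request defaults produce different `hash_key` values but the same `client_key`, so they share one client object.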
An OpenAI-compatible LLM instance can export raw SDK clients when an external library owns the call loop:
to_client(), to_aclient(), and to_args() work for OpenAI-compatible gateways: openai, portkey, or bifrost. They raise ValueError for litellm, anthropic, and mock.
For the native Anthropic gateway:
4. Image Generation
Use imagen for image generation responses:
gpt-5-image-mini and gpt-5.4-image-2:
LLMImage objects when possible; raw provider payloads remain available through include="raw".
imagen accepts the same images= input formats as chat for reference images:
5. LLMImage API
LLMImage is the shared image type for chat inputs, reference images, and generation output.
Factory methods normalize common sources:
LLMImage values fetch lazily only when converted to bytes, base64, a data URL, or saved. The fetch timeout is configured by heavenbase.llm.image_url_timeout.
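The lazy-fetch behavior can be pictured with a small stand-in class. This is not the LLMImage API: `LazyImage`, its method names, and the injected `fetch` callable are hypothetical, and keeping the bytes after the first fetch is an assumption the text does not confirm.

```python
import base64
from typing import Callable, Optional


class LazyImage:
    """Hypothetical stand-in for a URL-backed image that defers its download."""

    def __init__(self, url: str, fetch: Callable[[str, float], bytes], timeout: float = 10.0):
        self.url = url
        self._fetch = fetch        # injected so no network access is needed here
        self._timeout = timeout    # plays the role of the configured fetch timeout
        self._data: Optional[bytes] = None

    def to_bytes(self) -> bytes:
        # The download happens only on the first conversion.
        if self._data is None:
            self._data = self._fetch(self.url, self._timeout)
        return self._data

    def to_base64(self) -> str:
        return base64.b64encode(self.to_bytes()).decode("ascii")

    def to_data_url(self, mime: str = "image/png") -> str:
        return f"data:{mime};base64,{self.to_base64()}"
```

Constructing the object costs nothing; the fetch runs only when one of the conversion methods is called.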
6. Tool-Call Repair
LLMToolCallRepair fixes malformed OpenAI-style tool-call argument strings before execution. It strips markdown fences, balances JSON brackets, fills missing required schema fields, and re-serializes compact JSON.
Global config lives under heavenbase.llm.tool_call_repair:
tool_call_repair={...} to hb.LLM(...), or repair_tool_calls=True on a single chat call. When repair is enabled on the instance, repair_tool_calls=True is applied by default.
With strict: true, repair raises ValueError when arguments cannot be parsed instead of returning the original string.
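The repair steps described in this section can be approximated with the standard library. The sketch below is an illustration only, not the LLMToolCallRepair implementation: `repair_arguments` is a hypothetical function, and filling a missing required field with the schema's `default` (or null when none is given) is an assumption, since the text does not say which value is used.

```python
import json
import re
from typing import Optional

_FENCE = re.compile(r"^\s*```[a-zA-Z0-9_-]*\s*\n?(.*?)\n?\s*```\s*$", re.DOTALL)


def _strip_fences(text: str) -> str:
    match = _FENCE.match(text)
    return match.group(1) if match else text


def _balance(text: str) -> str:
    """Append closers for brackets left open outside of string literals."""
    stack = []
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack and stack[-1] == ch:
            stack.pop()
    return text + "".join(reversed(stack))


def repair_arguments(raw: str, schema: Optional[dict] = None, strict: bool = False) -> str:
    candidate = _balance(_strip_fences(raw).strip())
    try:
        parsed = json.loads(candidate)
    except json.JSONDecodeError as exc:
        if strict:
            raise ValueError(f"unrepairable tool-call arguments: {exc}") from exc
        return raw  # non-strict: hand back the original string untouched
    if isinstance(parsed, dict) and schema:
        props = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in parsed:
                # Assumption: use the schema default, else null.
                parsed[name] = props.get(name, {}).get("default")
    return json.dumps(parsed, separators=(",", ":"))  # compact JSON
```

This version only closes brackets that were left open; it does not try to fix unterminated strings, which fall through to the strict or non-strict failure path.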
7. Custom OpenAI-Compatible Providers
Use the custom preset for a provider that speaks the OpenAI API but is not in the bundled model catalog:
base_url and a concrete model.
8. Async APIs
Every sync method has an async counterpart, such as achat. Sync chat raises when a tool callable is async, so use achat for async tools.
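The reason a sync call cannot run an async tool can be shown without the library. This is a sketch under assumptions: `run_tool_sync` and `run_tool_async` are hypothetical helpers, the two sample tools are made up for illustration, and TypeError is a stand-in because the text does not name the exception that sync chat raises.

```python
import asyncio
import inspect


def run_tool_sync(tool, **kwargs):
    """Run a tool from sync code; refuse coroutine functions up front."""
    if inspect.iscoroutinefunction(tool):
        # Assumption: the real exception type is not stated in the docs.
        raise TypeError(f"{tool.__name__} is async; use the async API instead")
    return tool(**kwargs)


async def run_tool_async(tool, **kwargs):
    """Run a tool from async code; accepts both sync and async callables."""
    result = tool(**kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


# Sample tools, for illustration only.
def add(a: int, b: int) -> int:
    return a + b


async def lookup_weather(city: str) -> str:
    await asyncio.sleep(0)  # placeholder for real async I/O
    return f"weather in {city}"
```

From async code, `await run_tool_async(lookup_weather, city="Oslo")` works for either kind of tool, while `run_tool_sync(lookup_weather, city="Oslo")` fails fast instead of returning an un-awaited coroutine.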

