Documentation Index
Fetch the complete documentation index at: https://ahvn.top/llms.txt
Use this file to discover all available pages before exploring further.
chat is the main text entry point. LLM is callable, so llm("hello") and llm.chat("hello") are equivalent.
Pass a system prompt with the system= argument.
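As a minimal sketch of the callable-equals-chat behavior described above (the class internals and the system handling here are assumptions for illustration, not HeavenBase's real implementation):

```python
class LLM:
    """Sketch: calling the instance delegates straight to chat()."""

    def __init__(self, system=None):
        self.system = system  # default system prompt, overridable per call

    def chat(self, prompt, system=None):
        # A real client would send a request here; this stub just echoes
        # the effective system prompt and user text for illustration.
        sys = system if system is not None else self.system
        return f"[system={sys}] {prompt}"

    # llm("hello") and llm.chat("hello") are equivalent
    __call__ = chat


llm = LLM(system="be brief")
assert llm("hello") == llm.chat("hello")
```

Aliasing `__call__` to `chat` is one simple way to guarantee the two entry points can never drift apart.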
Message inputs
chat and stream accept a single string, one OpenAI-style message dictionary, or a list of messages. Message-like objects in a list are also accepted when they expose model_dump(), to_dict(), dict(), json(), or role and content attributes.
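A hedged sketch of how those input shapes could be normalized to OpenAI-style dictionaries (the function name and fallback order are assumptions, not HeavenBase's actual code):

```python
import json


def normalize_messages(inp):
    """Normalize the accepted input shapes to a list of message dicts."""
    if isinstance(inp, str):                # bare string -> one user message
        return [{"role": "user", "content": inp}]
    if isinstance(inp, dict):               # single OpenAI-style dict
        return [inp]
    out = []
    for m in inp:                           # list of messages / message-like objects
        if isinstance(m, dict):
            out.append(m)
            continue
        for attr in ("model_dump", "to_dict", "dict"):
            if hasattr(m, attr):            # pydantic-style serializers
                out.append(getattr(m, attr)())
                break
        else:
            if hasattr(m, "json"):          # JSON-string serializer
                out.append(json.loads(m.json()))
            else:                           # plain role/content attributes
                out.append({"role": m.role, "content": m.content})
    return out
```

The try-in-order fallback means a pydantic v2 model, a pydantic v1 model, and a plain object with `role`/`content` attributes all end up as the same dictionary shape.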
For multimodal models, pass image inputs with images=. HeavenBase normalizes each input to LLMImage and appends OpenAI-compatible image_url content parts to the last user message.
images= accepts LLMImage, URLs, data URLs, base64 strings, local paths, bytes-like objects, binary file objects, provider-style dictionaries, Pillow images, and numpy-compatible ndarrays.
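A sketch of that normalization for a few of the listed input kinds, assuming data-URL encoding and illustrative function names (HeavenBase's LLMImage type and full input matrix are not reproduced here):

```python
import base64
from pathlib import Path


def to_image_url(img):
    """Sketch: normalize one image input to an OpenAI image_url string."""
    if isinstance(img, str):
        if img.startswith(("http://", "https://", "data:")):
            return img                      # URL or data URL: pass through
        if Path(img).is_file():             # local path -> data URL
            raw = Path(img).read_bytes()
            return "data:image/png;base64," + base64.b64encode(raw).decode()
        return "data:image/png;base64," + img   # assume a bare base64 string
    if isinstance(img, (bytes, bytearray)):     # bytes-like -> data URL
        return "data:image/png;base64," + base64.b64encode(bytes(img)).decode()
    raise TypeError(f"unsupported image input: {type(img)!r}")


def append_images(messages, images):
    """Append OpenAI-compatible image_url parts to the last user message."""
    last = next(m for m in reversed(messages) if m["role"] == "user")
    if isinstance(last["content"], str):    # promote plain text to content parts
        last["content"] = [{"type": "text", "text": last["content"]}]
    for img in images:
        last["content"].append(
            {"type": "image_url", "image_url": {"url": to_image_url(img)}}
        )
    return messages
```

Targeting the last user message matters: providers expect image parts alongside the question they illustrate, not attached to earlier turns.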
Streaming
Use stream when you want deltas as they arrive.
chat uses the same streaming path internally and gathers the final response. This keeps regular chat, reasoning streams, usage accounting, structured outputs, and tool calls on one response pipeline.
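The gather-the-stream pattern can be sketched as follows; the stand-in generator below replaces a real provider stream, and the names are illustrative rather than HeavenBase's own:

```python
def fake_stream(prompt):
    """Stand-in for a provider stream: yields text deltas."""
    for piece in ("Hel", "lo, ", "world"):
        yield piece


def chat_via_stream(prompt):
    """Sketch of 'chat gathers the stream': consume deltas, return the whole."""
    parts = []
    for delta in fake_stream(prompt):
        parts.append(delta)         # a UI could render each delta as it arrives
    return "".join(parts)           # chat returns the gathered final text
```

Because both entry points consume the same generator, any accounting done per delta (usage, reasoning content, tool calls) is shared rather than duplicated.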
Include projection
The include argument selects response fields:
text: final assistant text.
think: reasoning or thinking content, when a provider streams it separately.
content: thinking plus the visible answer, with <think> tags around the reasoning.
message: the OpenAI-format assistant response dictionary with role, content, and optional tool_calls; it is not the full history.
delta: new OpenAI-format messages produced by this inference call. It starts with message and will include tool-result messages after it once tool execution is wired in.
messages: full conversation history: the normalized input messages plus delta.
tool_calls: normalized OpenAI tool_calls from the assistant response.
usage: provider usage counters for this call. Common keys are prompt_tokens, completion_tokens, and total_tokens; streamed usage chunks are merged by summing numeric counters and keeping the first non-numeric value per key.
raw: raw provider payloads.
elapsed: request elapsed seconds.
created_at: local response creation timestamp.
structured: parsed structured output.
When stream's include selection contains delta or messages, progressive content still arrives as normal, and HeavenBase emits one final metadata chunk with empty text/think and the completed message delta.
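The usage-merge rule stated above (sum numeric counters, keep the first non-numeric value per key) can be sketched in a few lines; the function name is an assumption:

```python
def merge_usage(chunks):
    """Merge streamed usage chunks: sum numeric counters, keep the
    first non-numeric value seen for each key."""
    merged = {}
    for chunk in chunks:
        for key, value in chunk.items():
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                merged[key] = merged.get(key, 0) + value    # numeric: accumulate
            elif key not in merged:
                merged[key] = value                         # non-numeric: first wins
    return merged
```

Summing makes token counters additive across chunks, while first-wins keeps stable metadata (such as a model name) from being overwritten mid-stream.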
Reasoning presets enable the canonical think option. You can override the default per call, and HeavenBase will convert it to gateway-level extra_body.reasoning for OpenAI-compatible gateways:
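As a hedged sketch of that conversion (the accepted think values and the shape of the reasoning payload here are assumptions, not HeavenBase's actual mapping table):

```python
def reasoning_kwargs(think):
    """Sketch: convert a canonical think option into gateway-level
    extra_body.reasoning for an OpenAI-compatible gateway."""
    if think is None or think is False:
        return {}                           # reasoning not requested
    if think is True:
        return {"extra_body": {"reasoning": {"enabled": True}}}
    # e.g. think="high" -> a gateway effort level (illustrative)
    return {"extra_body": {"reasoning": {"effort": think}}}
```

Keeping one canonical option on the HeavenBase side and translating at the gateway boundary means callers never need to know each provider's reasoning dialect.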

