The model proposes. HeavenBase can execute — or just hand you the JSON and step back.
Tool use is where chat stops being text-only. Pass tools to chat, stream, or LLMSession, and HeavenBase normalizes schemas, runs executable callables, and projects the full turn through include.
1. What You Can Pass
The tools list accepts:
- OpenAI-compatible function schema dictionaries.
- Plain Python callables with type annotations and docstrings.
- HeavenBase Tool objects.
- HeavenBase Toolkit objects, including MCP servers imported with Toolkit.from_fastmcp(...).
Schema-only tools are sent to the provider and returned as tool_calls. Executable tools run automatically, and their role="tool" result messages appear in delta and messages.
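To make the link between a plain callable and a function schema concrete, here is a minimal sketch of how type annotations and a docstring could be turned into an OpenAI-compatible schema dictionary. The helper function_to_schema and the _JSON_TYPES mapping are hypothetical names used for illustration; this is not HeavenBase's actual normalization code, and it handles only a few primitive types.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_schema(fn):
    """Build an OpenAI-compatible function schema from a plain callable."""
    hints = get_type_hints(fn)
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def add(left: int, right: int) -> int:
    """Add two numbers."""
    return left + right


schema = function_to_schema(add)
print(schema["function"]["name"], schema["function"]["parameters"]["required"])
```

The resulting dictionary has the same shape as a hand-written schema-only tool, which is why both forms can sit in the same tools list.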
Tool execution errors serialize as structured tool-result content:
{"ok": false, "error": {"type": "ValueError", "message": "bad input"}}
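As an illustration of that shape, the sketch below wraps a tool call so that an exception becomes structured tool-result content instead of propagating. The names run_tool and parse_age are hypothetical, and the success path (plain JSON of the return value) is an assumption; only the error shape is taken from the example above.

```python
import json


def run_tool(fn, arguments):
    """Call fn(**arguments) and return tool-result content as a string."""
    try:
        # Assumption: successful results are serialized as plain JSON.
        return json.dumps(fn(**arguments))
    except Exception as exc:
        # Errors are reported to the model in the structured shape shown above.
        return json.dumps(
            {"ok": False, "error": {"type": type(exc).__name__, "message": str(exc)}}
        )


def parse_age(value: str) -> int:
    """Hypothetical tool that rejects non-numeric input."""
    if not value.isdigit():
        raise ValueError("bad input")
    return int(value)


content = run_tool(parse_age, {"value": "abc"})
print(content)
```

Because the error is ordinary tool-result content, the model sees it on the next iteration and can retry with different arguments.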
Use schema-only tools when another process will execute the calls:
import heavenbase as hb
llm = hb.LLM(preset="chat")
tools = [
    {
        "type": "function",
        "function": {
            "name": "lookup_user",
            "description": "Look up a user by id",
            "parameters": {
                "type": "object",
                "properties": {"user_id": {"type": "string"}},
                "required": ["user_id"],
            },
        },
    }
]
calls = llm.chat("Find user 42", tools=tools, include="tool_calls")
Tool calls are collected from streaming deltas and returned in the standard OpenAI tool_calls shape.
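The following sketch shows what collecting tool calls from streaming deltas can look like, and how another process might then execute the result. It assumes the fragment layout of OpenAI-style streaming (an index, an optional id, and argument text arriving in pieces); merge_tool_call_deltas, HANDLERS, and the local lookup_user implementation are hypothetical and not part of HeavenBase.

```python
import json


def merge_tool_call_deltas(fragments):
    """Merge streamed tool-call fragments into complete tool_calls entries."""
    merged = {}
    for frag in fragments:
        call = merged.setdefault(
            frag["index"],
            {"id": "", "type": "function", "function": {"name": "", "arguments": ""}},
        )
        if frag.get("id"):
            call["id"] = frag["id"]
        fn = frag.get("function", {})
        if fn.get("name"):
            call["function"]["name"] = fn["name"]
        # Argument JSON arrives as string pieces that are concatenated in order.
        call["function"]["arguments"] += fn.get("arguments") or ""
    return [merged[i] for i in sorted(merged)]


def lookup_user(user_id: str) -> dict:
    """Hypothetical local implementation of the schema-only tool."""
    return {"user_id": user_id, "name": "Ada"}


# The executing process maps tool names to its own implementations.
HANDLERS = {"lookup_user": lookup_user}

fragments = [
    {"index": 0, "id": "call_1", "function": {"name": "lookup_user", "arguments": ""}},
    {"index": 0, "function": {"arguments": '{"user_id"'}},
    {"index": 0, "function": {"arguments": ': "42"}'}},
]

tool_calls = merge_tool_call_deltas(fragments)
results = [
    HANDLERS[c["function"]["name"]](**json.loads(c["function"]["arguments"]))
    for c in tool_calls
]
print(results)
```

Note that function.arguments is a JSON string rather than a dictionary, so the executing process has to parse it before calling the handler.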
Use functions or Toolkits when HeavenBase should run the tools for the model:
import heavenbase as hb
def add(left: int, right: int) -> int:
    """Add two numbers."""
    return left + right
llm = hb.LLM(preset="chat")
result = llm.chat(
    "Use the add tool to calculate 42 + 73.",
    tools=[add],
    include=["text", "delta"],
    reduce=False,
)
print(result["text"])
print([message["role"] for message in result["delta"]])
When the model calls add, HeavenBase appends the assistant tool-call message and the tool-result message, then asks the model for the final answer. A typical delta looks like:
[
    {"role": "assistant", "tool_calls": [...]},
    {"role": "tool", "tool_call_id": "...", "name": "add", "content": "115"},
    {"role": "assistant", "content": "42 + 73 = 115."},
]
Control how many assistant iterations run with max_tool_turns (default 8 in Python, mapped from --max-steps in the CLI):
llm.chat("Plan a three-step fix.", tools=[...], max_tool_turns=12)
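The loop behind this behavior can be sketched without any provider at all. In the example below, scripted_model is a stand-in that requests the add tool once and then answers, and run_tool_loop is a hypothetical driver showing how a max_tool_turns cap bounds the number of assistant iterations. It is a conceptual sketch, not HeavenBase's implementation; in particular, what HeavenBase does when the cap is reached is not specified here.

```python
import json


def add(left: int, right: int) -> int:
    """Add two numbers."""
    return left + right


def scripted_model(messages):
    """Stand-in for a provider call: request a tool once, then answer."""
    tool_messages = [m for m in messages if m["role"] == "tool"]
    if not tool_messages:
        return {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {"name": "add", "arguments": '{"left": 42, "right": 73}'},
                }
            ],
        }
    return {"role": "assistant", "content": f"42 + 73 = {tool_messages[-1]['content']}."}


def run_tool_loop(model, messages, tools, max_tool_turns=8):
    """Return the delta: every message appended during this turn."""
    registry = {fn.__name__: fn for fn in tools}
    delta = []
    for _ in range(max_tool_turns):
        reply = model(messages + delta)
        delta.append(reply)
        if not reply.get("tool_calls"):
            break  # plain assistant answer: the turn is complete
        for call in reply["tool_calls"]:
            fn = registry[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"])
            delta.append(
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "name": fn.__name__,
                    "content": json.dumps(fn(**args)),
                }
            )
    return delta


delta = run_tool_loop(scripted_model, [{"role": "user", "content": "42 + 73?"}], [add])
print([m["role"] for m in delta])
```

Each pass through the loop is one assistant iteration, so a model that chains several tool calls needs a cap at least as large as the number of steps plus the final answer.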
Async callables require achat instead of chat.
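A likely reason for this requirement is general Python behavior rather than anything specific to HeavenBase: calling a coroutine function returns a coroutine object that must be awaited inside an event loop. The sketch below shows an async dispatcher that handles both kinds of callable; call_tool and fetch_score are hypothetical names, and how achat schedules tools internally is not covered here.

```python
import asyncio
import inspect


async def fetch_score(user_id: str) -> int:
    """Hypothetical async tool; the sleep stands in for network I/O."""
    await asyncio.sleep(0)
    return len(user_id) * 10


def add(left: int, right: int) -> int:
    """Ordinary synchronous tool."""
    return left + right


async def call_tool(fn, **kwargs):
    """Await coroutine functions; call ordinary functions directly."""
    if inspect.iscoroutinefunction(fn):
        return await fn(**kwargs)
    return fn(**kwargs)


async def main():
    first = await call_tool(fetch_score, user_id="ab")
    second = await call_tool(add, left=1, right=2)
    return [first, second]


results = asyncio.run(main())
print(results)
```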
MCP servers become normal tools by importing them as a HeavenBase Toolkit:
import heavenbase as hb
workspace = hb.HeavenBase("demo", backends={"main": {"type": "inmem"}})
server_toolkit = workspace.to_mcp(name="demo-heavenbase")
mcp_tools = hb.Toolkit.from_fastmcp(server_toolkit.to_fastmcp())
answer = hb.LLM(preset="chat").chat(
    "List entities in the workspace.",
    tools=[mcp_tools],
)
The same pattern works with HTTP/SSE MCP URLs accepted by FastMCP clients.
The CLI exposes the same execution path:
hb llm chat --mcp quickstart.math-tools:-1 "Use the math tools for 42 * 73."
hb llm chat --mcp quickstart.math-tools:-1 --max-steps 20 "Use the math tools for 42 * 73."
hb llm session --mcp http://127.0.0.1:7001/mcp
Inside a session, add another MCP source with /mcp SOURCE. For hb llm chat, --max-steps caps the number of assistant iterations in a tool loop; the CLI default is 20.
5. Structured Output
Pass OpenAI-compatible response_format arguments directly:
import heavenbase as hb
llm = hb.LLM(preset="chat")
data = llm.chat(
    "Return JSON with keys name and score.",
    response_format={"type": "json_object"},
    include="structured",
)
Most providers stream structured output through the normal path. When a model is known to be unreliable for streaming structured JSON, its model defaults can set structured_stream: false.
Force a non-streaming structured call when you need a single reliable payload:
data = llm.chat(
    "Return JSON with key ok=true.",
    response_format={"type": "json_object"},
    include="structured",
    enforce_non_stream_structured=True,
)
For stream(..., enforce_non_stream_structured=True), HeavenBase performs one non-streaming request and yields a single projected chunk.
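The difference between the two paths can be pictured with a small generator. This is a conceptual sketch only: structured_chunks and its enforce_non_stream flag are hypothetical, and the list of text pieces stands in for provider output. It shows that a streamed JSON payload cannot be parsed until enough text has arrived, whereas the non-streaming path parses one complete payload and yields exactly one chunk.

```python
import json


def structured_chunks(fragments, enforce_non_stream=False):
    """Yield parsed structured payloads from raw JSON text fragments."""
    if enforce_non_stream:
        # One request, one chunk: parse the complete payload in a single step.
        yield json.loads("".join(fragments))
        return
    buffer = ""
    for piece in fragments:
        buffer += piece
        try:
            yield json.loads(buffer)
        except json.JSONDecodeError:
            continue  # payload is still incomplete; wait for more text


pieces = ['{"name": "Ada", ', '"score": 9', "}"]
streamed = list(structured_chunks(pieces))
single = list(structured_chunks(pieces, enforce_non_stream=True))
print(len(streamed), len(single))
```

If a stream is cut off before the closing brace, the streaming path in this sketch yields nothing at all, which is the kind of failure a single non-streaming payload avoids.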
Some providers return malformed JSON in tool_calls[].function.arguments. HeavenBase can repair common mistakes — fenced code blocks, unbalanced braces, missing required fields — before execution.
Repair is off by default. Enable it globally:
hb cfg set heavenbase.llm.tool_call_repair.enabled true
Or enable it per instance, or for a single call:
llm = hb.LLM(preset="chat", tool_call_repair={"enabled": True})
result = llm.chat("Use the tool.", tools=[add], repair_tool_calls=True)
Set strict: true in config to raise when repair fails instead of returning the original arguments. See Advanced LLM for repair behavior details.
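To give a feel for what such a repair involves, here is a minimal sketch that handles two of the mistakes named above: a surrounding code fence and unbalanced braces. The function repair_arguments and its strict parameter are hypothetical stand-ins, not HeavenBase's repair code, and filling in missing required fields is left out because it depends on the tool schema.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def repair_arguments(raw, strict=False):
    """Try to turn a malformed arguments string into valid JSON text."""
    text = raw.strip()
    # Drop a surrounding code fence, including an optional language tag.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]
        text = "\n".join(lines).strip()
    # Append closers for braces and brackets opened outside string literals.
    stack, in_string, escaped = [], False, False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    text += "".join(reversed(stack))
    try:
        json.loads(text)
        return text
    except json.JSONDecodeError:
        if strict:
            raise ValueError("tool-call arguments could not be repaired")
        return raw  # non-strict mode hands back the original arguments


fenced = FENCE + 'json\n{"left": 42, "right": 73\n' + FENCE
print(repair_arguments(fenced))
```

The strict flag mirrors the trade-off described above: returning the original arguments lets tool execution surface the error, while raising stops the call before a tool runs on input that could not be parsed.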
Further Exploration
Related resources:
- Sessions: multi-turn tool use with LLMSession and hb llm session.
- LLM Chat: CLI --mcp for single-turn agentic calls.
- First MCP: attach a math Toolkit from the quickstart.