The LLM remembers nothing. That’s a feature — until you want it to.
hb.LLM stays stateless on purpose. Conversation history is just a message list you pass to chat or stream. When you need multi-turn chat without rebuilding that list every time, LLMSession holds the history and optional tools for you.

1. Manual History Without a Session

You can manage history yourself when you only need one or two turns:
import heavenbase as hb

messages = [
    {"role": "user", "content": "Remember the code word is hb-ok."},
    {"role": "assistant", "content": "Remembered."},
    {"role": "user", "content": "What is the code word?"},
]

answer = hb.LLM(preset="chat").chat(messages)
This pattern is enough for pipelines that already carry state elsewhere.
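When a pipeline does carry the list itself, the bookkeeping per turn is small. The sketch below illustrates that bookkeeping without calling heavenbase: `run_turn` and `echo_chat` are hypothetical names, `echo_chat` stands in for a stateless chat call, and the reply is assumed to be a dict with a "content" key.

```python
from typing import Callable

Message = dict[str, str]


def run_turn(chat: Callable[[list[Message]], Message],
             history: list[Message],
             user_text: str) -> str:
    """Append a user turn, call a stateless chat function, record the reply."""
    history.append({"role": "user", "content": user_text})
    reply = chat(history)  # the callee receives the full list on every call
    history.append({"role": "assistant", "content": reply["content"]})
    return reply["content"]


def echo_chat(messages: list[Message]) -> Message:
    """Hypothetical stand-in for a model: reports how many messages it saw."""
    return {"role": "assistant", "content": f"seen {len(messages)} messages"}


history: list[Message] = []
first = run_turn(echo_chat, history, "Remember the code word is hb-ok.")
second = run_turn(echo_chat, history, "What is the code word?")
```

The second call sees three messages because the caller, not the model, kept the first exchange. That is the work a session takes over.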

2. Session Tools

LLMSession can hold session tools and use them on every turn. Tools follow the same tools=[...] contract as LLM.chat: Python callables, Tool, Toolkit, and schema dictionaries are accepted.
import heavenbase as hb
from heavenbase.utils import LLMSession


def add(left: int, right: int) -> int:
    """Add two numbers."""
    return left + right


session = LLMSession(hb.LLM(preset="chat"))
session.add_tool(add)

answer = session.send("Use the add tool for 2 + 3.")
print(answer["content"])
External MCP servers import as session Toolkits:
session = LLMSession(hb.LLM(preset="chat"))
session.add_mcp("http://127.0.0.1:7001/mcp")
session.send("What tools are available?")
The session stores the assistant tool call, the role="tool" result, and the final assistant response in messages. See Tool Use for the full executable-tool contract.
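To make the stored sequence concrete, here is roughly what the history for one tool turn could look like. The shapes are an assumption based on the common OpenAI-style chat format. Only the roles and their order come from the description above, and the "tool_calls", "tool_call_id", and "id" fields are illustrative rather than confirmed heavenbase fields.

```python
import json

# Assumed OpenAI-style message shapes; only the role order is taken from the text.
tool_turn = [
    {"role": "user", "content": "Use the add tool for 2 + 3."},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",  # illustrative identifier
            "type": "function",
            "function": {
                "name": "add",
                "arguments": json.dumps({"left": 2, "right": 3}),
            },
        }],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "5"},
    {"role": "assistant", "content": "2 + 3 = 5."},
]

roles = [message["role"] for message in tool_turn]
```

One user message therefore expands to three stored entries after it, which matters when you trim or replay history.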

3. Session Lifecycle

LLMSession exposes small helpers for interactive workflows:
session.append_assistant("Draft saved.")   # append without a provider call
session.back()                             # drop the latest user turn and everything after it
session.clear()                            # reset message history
session.list_tools()                       # active runtime tool names
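The comment on back() describes a trim over the message list. A minimal sketch of that rule on plain message dicts follows. `drop_last_user_turn` is a hypothetical helper written for illustration, not the library's implementation.

```python
def drop_last_user_turn(messages: list[dict]) -> list[dict]:
    """Return a copy without the latest user message and anything after it."""
    for index in range(len(messages) - 1, -1, -1):
        if messages[index]["role"] == "user":
            return messages[:index]
    return list(messages)  # no user turn found: nothing to drop


history = [
    {"role": "user", "content": "Draft an intro."},
    {"role": "assistant", "content": "Here is a draft."},
    {"role": "user", "content": "Make it shorter."},
    {"role": "assistant", "content": "Shorter draft."},
]
trimmed = drop_last_user_turn(history)
```

Trimming from the user message onward also removes any tool calls and tool results that the turn produced, so the history stays consistent.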
Persist only messages — runtime MCP Toolkits are not safe to serialize:
session.save("./session.json")
restored = LLMSession.load("./session.json", llm=hb.LLM(preset="chat"))
to_dict() and from_dict() return JSON-safe payloads with the same message-only contract.
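One practical reason a message-only contract is convenient: message dicts round-trip through JSON, while live tool objects do not. The sketch below shows this with the standard json module only. The `state` layout is hypothetical and is not the file format that save() writes.

```python
import json


def add(left: int, right: int) -> int:
    """Add two numbers."""
    return left + right


# Hypothetical in-memory state: plain message dicts plus a live callable.
state = {
    "messages": [
        {"role": "user", "content": "Use the add tool for 2 + 3."},
        {"role": "assistant", "content": "2 + 3 = 5."},
    ],
    "tools": [add],
}

# Message-only payload: dicts and strings survive a JSON round trip unchanged.
payload = json.dumps({"messages": state["messages"]})
restored_messages = json.loads(payload)["messages"]

# The live callable does not: json.dumps raises TypeError on a function object.
try:
    json.dumps(state)
    tools_round_trip = True
except TypeError:
    tools_round_trip = False
```

Because tools are not part of the saved payload, a restored session presumably needs them attached again with add_tool or add_mcp.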

4. CLI Interactive Sessions

Start a multi-turn CLI session with the chat preset:
hb llm session
Type messages at the >>> prompt. Slash commands:
Command        Action
/help          Show available commands
/save <path>   Save message history to JSON
/load <path>   Load message history from JSON
/clear         Clear the session
/regen <seed>  Regenerate the last response with an optional seed
/back          Remove the latest user turn
/tools         List attached tools
/mcp SOURCE    Attach an MCP Toolkit mid-session
/bye, /exit    Quit
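Attach MCP tools at startup with --mcp, which can be repeated. Each command below shows one source form (a URL, a full Toolkit ref, and a bare Toolkit name):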
hb llm session --mcp http://127.0.0.1:7001/mcp
hb llm session --mcp quickstart.math-tools:-1
hb llm session --mcp math-tools
Inside a running session, add more tools with /mcp:
>>> /mcp quickstart.math-tools:-1
>>> What's 42 * 73?
The canonical Toolkit ref form is namespace.toolkit:version. Negative versions count back from the latest: -1 is the latest version and -2 is the one before it.
When a provider emits separate thinking content, the CLI session prints it inside <think> and </think> before the visible assistant text.
Each tool iteration prints a step counter such as STEPS: 001 / 020, followed by its tool calls and tool results.
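The ref form and the negative-version rule can be sketched in a few lines. This is an illustration only: `ToolkitRef`, `parse_ref`, and `resolve_version` are hypothetical names, and the `published` list stands in for whatever registry lookup the CLI actually performs. Bare names such as math-tools are outside the canonical form and are rejected here.

```python
from typing import NamedTuple


class ToolkitRef(NamedTuple):
    namespace: str
    toolkit: str
    version: int


def parse_ref(ref: str) -> ToolkitRef:
    """Split 'namespace.toolkit:version' into its three parts."""
    name, _, version = ref.rpartition(":")
    namespace, _, toolkit = name.partition(".")
    if not (namespace and toolkit and version):
        raise ValueError(f"expected namespace.toolkit:version, got {ref!r}")
    return ToolkitRef(namespace, toolkit, int(version))


def resolve_version(version: int, published: list[int]) -> int:
    """Map a negative offset onto an ascending list of published versions."""
    if version < 0:
        return published[version]  # -1 is the last item, -2 the one before it
    return version  # assumption: non-negative values name a version directly


ref = parse_ref("quickstart.math-tools:-1")
latest = resolve_version(ref.version, published=[1, 2, 3])
previous = resolve_version(-2, published=[1, 2, 3])
```

Python's negative indexing matches the documented rule directly, which is why the resolver needs no arithmetic.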

5. Inspect Resolved State

Use spec when you need to see how a client resolved:
import heavenbase as hb

llm = hb.LLM(model="ds-flash", provider="deepseek")

print(llm.spec.to_dict())
print(llm.spec.materialize())
spec.to_dict() omits secrets. spec.to_dict(secrets=True) returns the fully materialized dictionary, secrets included, so use it only in trusted debugging contexts.
Resolved specs also produce stable hash keys for deduplication and cache lookup:
spec_key = llm.spec.hash_key()
litellm_key = llm.spec.hash_key("litellm")
client_key = llm.spec.client_key()
client_key() includes only gateway client construction fields, so duplicate LLM instances can reuse the same in-memory OpenAI-compatible SDK client.
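The idea behind a narrower client key can be shown with a generic stable hash. The sketch below is not heavenbase's algorithm: `stable_key`, the SHA-256 choice, the field names, and the `CLIENT_FIELDS` set are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Optional


def stable_key(fields: dict, include: Optional[set] = None) -> str:
    """Hash a dict deterministically, optionally restricted to some keys."""
    selected = {k: v for k, v in fields.items() if include is None or k in include}
    # Sorted keys and fixed separators make the encoding independent of dict order.
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical resolved fields for two clients that differ only in sampling.
spec_a = {"model": "ds-flash", "provider": "deepseek",
          "base_url": "https://example.invalid/v1", "temperature": 0.2}
spec_b = {**spec_a, "temperature": 0.9}

CLIENT_FIELDS = {"provider", "base_url"}  # assumed client-construction fields

clients: dict = {}
client_a = clients.setdefault(stable_key(spec_a, CLIENT_FIELDS), object())
client_b = clients.setdefault(stable_key(spec_b, CLIENT_FIELDS), object())
```

The full keys differ because the temperature differs, but both specs map to one cached client object because the narrower key ignores fields that do not affect client construction.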

Further Exploration

Related resources:
  • Tool Use — schema-only and executable tools, MCP Toolkits, and structured output.
  • LLM Chat — message inputs, streaming, and include projection.
  • First LLM — CLI session tour and MCP attachment from the quickstart.