Documentation Index
Fetch the complete documentation index at: https://ahvn.top/llms.txt
Use this file to discover all available pages before exploring further.
以 sub-linear 时间构建 sub-Linear 应用。
在本工作坊中,你将从零构建 Sublinear:一个项目与议题(issue)跟踪器,单个 Agent 可通过 HeavenBase MCP 工具创建记录、更新工作、回答状态问题,并对议题做语义搜索。
整个应用只有一个 Python 文件:定义数据模型、删除并重建演示工作区以保证每次运行可重复、种子化允许的标签,然后让 Agent 通过 MCP 操作该工作区。
运行工作坊前请设置 OPENROUTER_API_KEY。默认情况下,Sublinear 通过 OpenRouter 同时使用聊天模型(deepseek/deepseek-v4-flash)与嵌入模型(openai/text-embedding-3-small),可能产生少量费用(< $0.01)。
1. 初始化
从 import、常量与后端配置开始。为简化起见,我们仅使用 SQLite + 内存后端,你也可以按需换成其他后端或继续扩展。
main 后端存储实体并支持 SQL 查询;vec 后端存储议题嵌入并支持向量搜索。
import heavenbase as hb
from heavenbase.utils import Any, LLMSession
WORKSPACE_ID = "sublinear"
TOOLKIT_NAME = "sublinear-mcp"
PRIORITY_RANKS = {
"urgent": 1,
"high": 2,
"medium": 3,
"low": 4,
"none": 5,
}
DATA_DIR = "./.data/sublinear/"
hb.utils.touch_dir(DATA_DIR)
BACKENDS = {
"main": {
"type": "sqlite",
"name": "main",
"database": f"file:{hb.utils.pj(DATA_DIR, 'sublinear-main.db')}",
},
"vec": {"type": "inmem", "name": "vec"},
}
接下来,我们先定义一些辅助函数,包括标签规范化、优先级排序、文本嵌入,以及议题文本构造。
工作坊使用 embed preset,默认解析为 OpenRouter 上 OpenAI 兼容的 openai/text-embedding-3-small 路由。你可以改用 embed-local,在 Ollama / LM Studio / vLLM 上本地运行小型嵌入模型。
def label_id(name: str) -> str:
"""Return a readable label identity from the label name."""
return (name or "").strip().lower().replace(" ", "-")
def rank_priority(priority: str = "medium") -> int:
"""Return sortable priority rank, where urgent is first."""
return PRIORITY_RANKS.get((priority or "medium").lower(), PRIORITY_RANKS["medium"])
def embed_text(text: str) -> list[float]:
"""Embed task text with the configured embedding preset."""
embedder = hb.LLM(preset="embed")
vector = embedder.embed(text or "")
return [float(item) for item in vector]
def issue_text(title: str, description: str, tags: list[str] | None = None) -> str:
"""Build the issue text used for vector indexing."""
return " | ".join([title or "", description or "", ", ".join(tags or [])])
2. 实体定义
从零构建应用时,先用类似 Pydantic 的声明式方式把数据模型想清楚。
在 Sublinear 的设计中,我们有项目(Project)、里程碑(Milestone)和议题(Issue)。项目用于组织工作并设定目标;里程碑是项目检查点,可用于进度跟踪;议题是具体工作项,可分配、打标签、附加 tags,并嵌入以支持语义搜索。我们还有 Label 实体存储允许的标签词汇,以及 View 实体存储已保存的过滤与展示偏好。
Label 实体:
class Label(hb.Entity):
"""Allowed project and issue label."""
object_id = hb.field(hb.Identifier).compute(label_id, inputs=["name"])
name = hb.field(hb.ShortText).desc("Label name")
color = hb.field(hb.ShortText).default("gray").desc("Label color for UI display")
description = hb.field(hb.LongText).default("").desc("Label description for agent reasoning")
使用 .compute 标注由任意 Python 函数计算得到的字段。该函数以初始化输入作为关键字参数,返回将写入字段的转换结果。
Milestone 实体:
class Milestone(hb.Entity):
"""Sublinear project milestone used for progress summaries."""
project_id = hb.field(hb.Identifier).desc("Owning project object_id")
name = hb.field(hb.ShortText).desc("Milestone name")
description = hb.field(hb.LongText).default("").desc("Milestone scope")
status = hb.field(hb.ShortText).default("planned").desc("planned, active, done, or skipped")
target_date = hb.field(hb.Date).optional().desc("Milestone target date")
sort_order = hb.field(hb.Integer).default(100).desc("Display order inside the project")
Project 实体:
class Project(hb.Entity):
"""Sublinear project containing goals and issue work."""
name = hb.field(hb.ShortText).desc("Project display name")
summary = hb.field(hb.LongText).default("").desc("Project overview")
owner = hb.field(hb.ShortText).default("unassigned").desc("Primary owner")
status = (
hb.field(hb.ShortText).default("active").desc("planned, active, paused, done, or archived")
)
priority = hb.field(hb.ShortText).default("medium").desc("urgent, high, medium, low, or none")
target_date = hb.field(hb.Date).optional().desc("Project target date")
labels = (
hb.field(hb.Array[hb.ShortText])
.default([])
.store(to="main", strategy=hb.SideTable)
.desc("Project label object_id values from Label rows")
)
goals = hb.field(hb.LongText).default("").desc("Plain-language project goals")
这里出现新概念 strategy,用于决定字段在后端上的存储方式。即使同一类型存储在同一后端,不同 strategy 也会形成不同的物理布局。例如 hb.SideTable 会把数组存成独立表,用外键关联主表,每个元素一行。
View 实体:
class View(hb.Entity):
"""Saved Sublinear filter and display configuration."""
name = hb.field(hb.ShortText).desc("View display name")
target_entity = hb.field(hb.ShortText).default("issue").desc("Entity this view queries")
owner = hb.field(hb.ShortText).default("team").desc("View owner")
filter_json = hb.field(hb.Json).default({}).desc("HeavenBase JSON query filter")
group_by = hb.field(hb.ShortText).default("status").desc("Preferred grouping field")
order_by = hb.field(hb.ShortText).default("priority_rank").desc("Preferred order field")
display = (
hb.field(hb.Array[hb.ShortText])
.default(["key", "title", "status", "priority"])
.desc("Shown fields")
)
shared = (
hb.field(hb.Boolean).default(True).desc("Whether the whole workspace should use the view")
)
作为对比,display 也是数组字段,但未指定 strategy,因此使用数组默认的 hb.InlineColumn strategy,即在主表中以内联列存储(具体为 TEXT、JSON/JSONB 或 ARRAY,取决于后端)。
Issue 实体(最复杂的一个):
class Issue(hb.Entity):
"""Sublinear issue with Linear-inspired properties and vector search."""
key = hb.field(hb.ShortText).desc("Human issue key such as S1")
project_id = hb.field(hb.Identifier).desc("Owning project object_id")
milestone_id = hb.field(hb.Identifier).optional().desc("Milestone object_id")
title = hb.field(hb.ShortText).desc("Issue title")
description = hb.field(hb.LongText).default("").desc("Issue details")
status = (
hb.field(hb.ShortText)
.default("todo")
.desc("backlog, todo, in-progress, blocked, done, or canceled")
)
priority = hb.field(hb.ShortText).default("medium").desc("urgent, high, medium, low, or none")
priority_rank = (
hb.field(hb.Integer)
.compute(rank_priority, inputs=["priority"])
.desc("Sortable priority rank")
)
assignee = hb.field(hb.ShortText).default("unassigned").desc("Current assignee")
estimate = hb.field(hb.Integer).default(0).desc("Small integer effort estimate")
labels = (
hb.field(hb.Array[hb.ShortText])
.default([])
.store(to="main", strategy=hb.SideTable)
.desc("Issue label object_id values from Label rows")
)
tags = (
hb.field(hb.Array[hb.ShortText])
.default([])
.store(to="main", strategy=hb.SideTable)
.desc("Free-form issue tags")
)
blocked_by = hb.field(hb.Array[hb.ShortText]).default([]).desc("Issue keys blocking this work")
due_date = hb.field(hb.Date).optional().desc("Optional due date")
created_at = hb.field(hb.Timestamp["s"]).optional().desc("UTC+0 epoch seconds")
updated_at = hb.field(hb.Timestamp["s"]).optional().desc("UTC+0 epoch seconds")
search_text = (
hb.field(hb.LongText)
.compute(issue_text, inputs=["title", "description", "tags"])
.desc("Text used to compute issue embedding")
)
emb = (
hb.field(hb.Vector[hb.LLM(preset="embed").dim])
.compute(embed_text, inputs=["search_text"])
.query_compute(embed_text)
.store(to="vec", strategy=hb.VectorIndex)
.desc("Issue embedding stored on the vector backend; semantic near accepts text queries")
)
这里还有新概念 query_compute,用于转换查询时的参数。例如写查询 emb.near("Hello") 时,参数 "Hello" 会先经 embed_text 处理成向量,再与最近的嵌入匹配。
3. 工作区构建
创建工作区、注册实体,并种子化标签词汇。Agent 查询这些行,并在项目与议题上存储标签 object_id。标签与 tag 数组路由到 SQLite 侧表,因此 labels.array_contains("debugging") 等分析过滤无需单独搜索后端即可工作。
def sublinear_workspace(*, reset: bool = False) -> hb.HeavenBase:
"""Open the Sublinear workspace and register entity classes."""
if reset:
hb.HeavenBase(WORKSPACE_ID, backends=BACKENDS).drop()
ws = hb.HeavenBase(WORKSPACE_ID, backends=BACKENDS)
for entity in (Label, Project, Milestone, Issue, View):
ws.register(entity)
return ws
ws = sublinear_workspace(reset=True)
搭建步骤会调用 sublinear_workspace(reset=True),每次运行前会删除演示工作区。跟做工作坊时请保留该重置;需要 Sublinear 数据持久化时再移除。
接下来,为工作区种子化一些标签词汇。
ws.upsert_many(
Label,
[
{"name": "research", "color": "blue", "description": "Research and design such as user survey and prototyping"},
{"name": "design", "color": "purple", "description": "Design-related tasks"},
{"name": "coding", "color": "yellow", "description": "Implementation and coding tasks"},
{"name": "debugging", "color": "red", "description": "Bug fixing and debugging tasks"},
{"name": "testing", "color": "orange", "description": "Testing and quality assurance tasks"},
{"name": "documentation", "color": "lightblue", "description": "Documentation and writing tasks"},
{"name": "low-effort", "color": "gray", "description": "Low-effort tasks under 4 units"},
{"name": "medium-effort", "color": "lightgray", "description": "Medium-effort tasks from 4 to 15 units"},
{"name": "high-effort", "color": "white", "description": "High-effort tasks over 15 units"},
],
)
4. Sublinear Agent
工作区构建完成后,可以直接通过 MCP 暴露给 Agent。
任何 Harness(Claude Code、Codex、Copilot 等)都可用来创建 Agent;这里我们用 HeavenBase 简单的 LLMSession。向会话添加 MCP 用 session.add_mcp(...);同时可用简洁的系统提示词引导 Agent。
def sublinear_system_prompt() -> str:
return """\
You are Sublinear Agent. Use HeavenBase MCP for all reads and writes.
- Inspect entities before writing. Store dates as YYYY-MM-DD.
- Create rows with upsert: omit object_id, provide name. Patch existing rows with set after querying the row.
- Patches should change only fields the user requested unless the user asks to reclassify labels or tags.
- For issue creates: project_id=Project.object_id, name=key, key=key. Query all Label rows, infer 1-3 labels from the full label set, store Label.object_id values, and add readable keywords to tags.
- If an issue mentions bugs, bug fixing, or debugging, include the debugging label.
- Never write priority_rank, search_text, or emb.
- Semantic search must query Issue with {"near":{"field":"emb","query":"text","top_k":5}}; never send vectors.
- If a tool call errors, retry with corrected arguments before answering. Answer only from successful tool results.
- Keep replies concise and plain ASCII. Avoid emoji and long tables.
""".strip()
def sublinear(question: str, llm: Any = None) -> str:
"""Perform a natural language question or command over the Sublinear workspace."""
session = LLMSession(
llm or hb.LLM(preset="chat", temperature=0.0, cache=False, max_tokens=4096)
)
session.add_mcp(
ws.to_mcp(name=TOOLKIT_NAME, profile="agent").to_fastmcp(), name="sublinear-mcp-client"
)
final = session.send(question, system=sublinear_system_prompt(), max_tool_turns=20)
return final.get("content") or ""
此时 sublinear 已经是一个 Agent 函数,可以回答问题并对 Sublinear 工作区执行命令。
每个会话中,Agent 只面对一个 MCP 表面:HeavenBase 工作区 toolkit。它可写入记录、查询结构化行,并运行语义 near 搜索,直到会话返回内容或达到 max_tool_turns。以下是 Agent 可用接口列表(与 HeavenBase MCP 页面相同):
| Tool | What it offers |
|---|
define_entity | Creates an entity definition from a JSON-compatible schema. |
list_entities | Lists the workspace entities the agent can inspect. |
describe_entity | Returns one entity’s fields, logical types, and routing plan. |
upsert | Inserts or replaces one row for one entity. |
get | Fetches one row by object ID. |
set | Patches one row and returns the updated row. |
count | Counts rows for one entity. |
query | Runs a JSON query with filters, projections, sorting, and limits. |
explain | Shows the route and handler plan for a query. |
其中 upsert 最适合创建与整行替换。对 Agent 编辑,set 更方便:Agent 查询现有行后只补丁修改过的字段。
语义搜索的关键是:模型发送的是文本,不是向量。Issue.emb 上的 query_compute(embed_text) 在 HeavenBase 将 near 操作路由到 vec 之前,会把查询字符串转为向量;随后 HeavenBase 从 SQLite 水合匹配的议题行。
语义搜索查询示例:
{
"near": {
"field": "emb",
"query": "launch readiness, debugging, and final polish",
"top_k": 5
},
"select": ["key", "title", "status", "labels", "score"]
}
5. 动手试试
在 docs 仓库根目录运行 python workshops/sublinear/sublinear_app.py 试用 Sublinear Agent。
以下脚本向 Sublinear 发送五条用户请求;每条请求使用新会话,但所有会话共享同一持久 HeavenBase 工作区。请求包括:初始化项目、添加议题、认领议题、统计 debugging 议题,以及对 “launch readiness, debugging, and final polish” 做语义搜索。
if __name__ == "__main__":
print(sublinear("""\
[USER: Mira]
Add a new "HB-GUI" project with high priority and ddl June 1st, 2027.
Project Goal: Create a user-friendly modern GUI for a specific app HB.
""".strip()))
print(sublinear("""\
[USER: Mira]
Add the following issues in order to the "HB-GUI" project:
1. S1: Survey GUI techstacks 2026, design at least 3 different plans (due 2026-06-17)
2. S2: Finalize plan and start implementing a prototype (due 2026-08-03)
3. S3: Test prototype with 5 users and iterate based on feedback, fix bugs (due 2026-10-03)
4. S4: GUI UX optimization and polish (due 2026-11-03)
5. S5: Research about parallelism and optimize implementation (due 2026-11-03)
6. S6: Final testing and debugging (due 2026-12-03)
7. S7: Prepare launch materials (docs, tutorial, etc.) and launch (due June 1st, 2027)
""".strip()))
print(sublinear("""\
[USER: Bob]
Self-assign S1, with estimating effort 3 units.
""".strip()))
print(sublinear("""\
[USER: Alice]
How many debugging tasks are currently planned, and which ones are they?
""".strip()))
print(sublinear("""\
[USER: Carol]
Use semantic search over issue embeddings for "launch readiness, debugging, and final polish".
Which planned issues are the closest matches, and why? Report only the relevant ones.
""".strip()))
5.1. 预期行为
五次调用刻意使用独立的 LLMSession 实例,表明 Agent 不依赖聊天记忆;每轮通过 MCP 查询同一 HeavenBase 工作区来恢复上下文。
运行过程中,Agent 应:
- 用
upsert 创建 HB-GUI 项目,省略 object_id,由 HeavenBase 对项目 name 做哈希。
- 查询项目与标签词汇,再创建
S1 至 S7 议题,含截止日期、标签、tags 与计算得到的嵌入。
- 用
set 补丁 S1,仅改 assignee 与 estimate,计算字段保持一致。
- 根据已存储行回答 debugging 数量问题,通常通过过滤
Issue.labels 中的 debugging 标签 ID。JSON 规范可用 array_contains;HeavenBase 也会将数组字段上的 contains 规范为同一操作。
- 对
Issue.emb 做文本 near 查询回答语义搜索问题,返回最接近的相关议题及分数。
成功运行会创建 HB-GUI 项目、添加七条带推断标签 ID 的议题、用 set 将 S1 分配给 Bob、回答 debugging 工作包含 S3 与 S6,并对向量字段使用 near.query 文本,对 launch/debugging/polish 相关议题(如 S6、S7、S2)排序。
5.2. 示例回复
模型措辞可能不同,但成功运行应达到以下具体结果:
| 提示 | 示例回复 |
|---|
| 添加 HB-GUI 项目 | Created HB-GUI as a high-priority active project targeting 2027-06-01. |
| 添加议题 S1-S7 | Added S1-S7 to HB-GUI with due dates, inferred label IDs such as research, design, coding, debugging, testing, and documentation, plus readable tags. |
| 认领 S1 | S1 is now assigned to Bob with estimate 3. |
| 统计 debugging 工作 | 2 debugging tasks are planned: S3 and S6. |
| 语义搜索 | Closest matches: S6 final testing/debugging, S4 UX polish, and S3 user-test bug fixes. |
重点不在于回复的逐字措辞。该应用展示单个 Agent 可操作结构化记录、计算字段、将向量路由到向量后端,并查询同一工作区以得到分析答案。
进一步探索