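Entities - HeavenBase

An entity states what the data is before anything is said about where it is stored.

1. Define the logical shape

An entity is a Python class that inherits from hb.Entity. Use hb.field(...) when a field needs metadata, a default value, storage placement, or a compute hook; for the shortest plain-field form, use type annotations.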

import heavenbase as hb


class Document(hb.Entity):
    title = hb.field(hb.ShortText).desc("Display title")
    body = hb.field(hb.LongText)
    tags = hb.field(hb.Array[hb.ShortText]).default([])
    embedding = hb.field(hb.Vector[2])


print(Document.schema().entity_id)

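The class above creates a logical entity named document. Unless you define a custom object_id, HeavenBase also injects object_id and name.

2. Choose logical types

Logical types describe meaning; the backend decides how that meaning is physically stored.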

Type	Use it for
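`Identifier`	Stable IDs and slugs.
`ShortText`, `MediumText`, `LongText`	Labels, descriptions, and full documents.
`Integer`, `Float`, `Boolean`	Scalar values.
`Categorical([...])`	String or integer values restricted to a known set of options.
`Timestamp(unit="ms")`, `Date()`	Instants in time and calendar dates.
`Array[...]`	Lists with an optional element type.
`Vector[dim]`	Embeddings and other numeric vectors.
`Json`	JSON-compatible objects.
`HyperG`	Repeatable relational records.
`Artifact`	Binary payloads.

HeavenBase does not yet provide a separate Datetime or Interval logical type. Use Timestamp for instants and Date for calendar dates; a duration can be stored in a numeric field whose description states the unit.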
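As a rough illustration of that convention, the sketch below uses only the Python standard library to turn a timezone-aware datetime into epoch milliseconds (the kind of value a Timestamp(unit="ms") field is meant to carry) and a duration into a plain number of seconds. The helper names to_epoch_ms and duration_seconds, and the variable name timeout_s, are hypothetical and not part of HeavenBase.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Dividing one timedelta by another with // yields an exact integer.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def duration_seconds(duration: timedelta) -> float:
    """Express a duration as seconds, suitable for a numeric field with a stated unit."""
    return duration.total_seconds()


created_at_ms = to_epoch_ms(datetime(2024, 1, 1, tzinfo=timezone.utc))
timeout_s = duration_seconds(timedelta(minutes=1, seconds=30))
```

Putting the unit in the field name or its description (for example a suffix such as _s or _ms) keeps a bare number from being misread later.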

3. Understand identity

Every entity row has exactly one user-facing object_id. If you omit it, HeavenBase derives a deterministic ID from the row's name.

ws = hb.HeavenBase("core-entities", preset="debug")
ws.register(Document)

first_id = ws.upsert(
    Document,
    {
        "name": "Agent guide",
        "title": "Agent guide",
        "body": "Use Catalog for objects and MetaSchema for structure.",
        "tags": ["docs"],
        "embedding": [1.0, 0.0],
    },
)

second_id = ws.upsert(
    Document,
    {
        "name": "Agent guide",
        "title": "Agent guide",
        "body": "Updated body.",
        "tags": ["docs"],
        "embedding": [1.0, 0.0],
    },
)

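# Both rows share the same name, so both upserts resolve to the same derived object_id.
print(first_id == second_id)

Define a custom object_id only when a natural key should drive identity.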
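The behaviour of name-derived IDs can be pictured with a small in-memory model. This is a sketch under stated assumptions, not HeavenBase code: derive_id and ToyStore are hypothetical names, uuid.uuid5 is an assumed stand-in for whatever derivation HeavenBase actually uses, and replacing the whole row on a repeated upsert is a simplification.

```python
import uuid


def derive_id(entity: str, name: str) -> str:
    """Hypothetical stand-in: map (entity, name) to a stable, repeatable ID."""
    # uuid5 is deterministic: the same namespace and name always give the same UUID.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{entity}/{name}"))


class ToyStore:
    """In-memory model of upsert keyed by object_id (not HeavenBase code)."""

    def __init__(self) -> None:
        self.rows = {}

    def upsert(self, entity: str, row: dict) -> str:
        # Use an explicit object_id when one is given, else derive it from the name.
        object_id = row.get("object_id") or derive_id(entity, row["name"])
        # An existing row with the same ID is overwritten instead of duplicated.
        self.rows[object_id] = {**row, "object_id": object_id}
        return object_id


store = ToyStore()
first = store.upsert("document", {"name": "Agent guide", "body": "v1"})
second = store.upsert("document", {"name": "Agent guide", "body": "Updated body."})
```

In this model the store ends up with a single row holding the updated body, which is the property a deterministic ID is meant to provide.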

from heavenbase.utils import hash_id


def stock_id(sku: str, warehouse: str) -> str:
    return hash_id("StockItem", sku, warehouse)


class StockItem(hb.Entity):
    object_id = hb.field(hb.Identifier).compute(stock_id, inputs=["sku", "warehouse"])
    sku = hb.field(hb.ShortText)
    warehouse = hb.field(hb.ShortText)
    quantity = hb.field(hb.Integer).default(0)


ws.register(StockItem)
stock_id_value = ws.upsert(StockItem, {"sku": "SKU-001", "warehouse": "east"})
print(ws.get(stock_id_value, entity=StockItem)["quantity"])
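The example above calls heavenbase.utils.hash_id without showing its internals. To show why hashing a natural key gives a stable identity, here is a standard-library stand-in. toy_hash_id is a hypothetical name, and the length-prefixed encoding, SHA-256, and the 16-character truncation are all assumptions made for illustration, not a description of the real function.

```python
import hashlib


def toy_hash_id(namespace: str, *parts: str) -> str:
    """Hypothetical stand-in for a natural-key hash; not heavenbase.utils.hash_id."""
    hasher = hashlib.sha256()
    for part in (namespace, *parts):
        data = part.encode("utf-8")
        # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        hasher.update(len(data).to_bytes(4, "big"))
        hasher.update(data)
    return hasher.hexdigest()[:16]


east = toy_hash_id("StockItem", "SKU-001", "east")
east_again = toy_hash_id("StockItem", "SKU-001", "east")
west = toy_hash_id("StockItem", "SKU-001", "west")
```

The same (sku, warehouse) pair always maps to the same ID, while changing either part produces a different one, so the natural key alone decides which row an upsert touches.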

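4. Add compute hooks

A computed field runs when a row is written. A query-compute hook runs when a query value needs to be normalized before it is pushed down to the backend. A common case is accepting a text query against a vector field.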

def embed_text(text: str) -> list[float]:
    return [1.0, 0.0] if "agent" in text.lower() else [0.0, 1.0]


class Issue(hb.Entity):
    title = hb.field(hb.ShortText)
    summary = hb.field(hb.LongText)
    emb = (
        hb.field(hb.Vector[2])
        .compute(embed_text, inputs=["summary"])
        .query_compute(embed_text)
    )


ws.register(Issue)
ws.upsert(Issue, {"object_id": "i1", "title": "Agent memory", "summary": "Design agent memory around search."})

frame = ws.query(Issue).near(Issue.emb, "agent search", top_k=1).select("title", "score").execute()
print(frame.rows()[0]["title"])
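To make the two hook timings concrete, the following standard-library sketch separates them: one function derives the stored vector at write time, and another converts a text query into a vector at query time before ranking. The names write_row, near, and cosine are hypothetical, cosine similarity is an assumed scoring choice, and the second row is made up for the demonstration; none of this reflects how HeavenBase executes the query.

```python
import math
from typing import Union


def embed_text(text: str) -> list[float]:
    # Same toy embedding as the Issue example above.
    return [1.0, 0.0] if "agent" in text.lower() else [0.0, 1.0]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


rows = []


def write_row(title: str, summary: str) -> None:
    # Write-time compute: derive the stored vector from the summary field.
    rows.append({"title": title, "summary": summary, "emb": embed_text(summary)})


def near(query: Union[str, list[float]], top_k: int = 1) -> list[dict]:
    # Query-time compute: turn a text query into a vector before searching.
    vector = embed_text(query) if isinstance(query, str) else query
    scored = [{"title": r["title"], "score": cosine(r["emb"], vector)} for r in rows]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:top_k]


write_row("Agent memory", "Design agent memory around search.")
write_row("Billing export", "Fix CSV rounding.")  # illustrative extra row
top = near("agent search", top_k=1)
```

Because the same function serves both hooks, stored vectors and query vectors live in the same space, which is what makes the similarity score meaningful.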

5. Compile from JSON

Agents and external clients can define entities with a JSON-compatible schema dictionary.

Product = hb.Entity.from_schema(
    {
        "entity_id": "product",
        "name": "Product",
        "fields": {
            "name": "ShortText",
            "price": "Float",
            "status": {
                "type": {
                    "type": "categorical",
                    "values": ["draft", "active"],
                },
            },
            "created_at": {"type": {"type": "timestamp", "unit": "ms"}},
            "available_on": "Date",
        },
    }
)

print(Product.schema().entity_id)

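The generated class behaves like an ordinary hb.Entity subclass and can be passed to ws.register(...), ws.upsert(...), and ws.query(...).

Explore further

Related resources:

Workspace - register entities at the application boundary
Routing - store fields in selected backends
Query - filter, search, and inspect entity rows
Catalog - discover concrete objects after writes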
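The schema dictionary above mixes two field forms: a bare type name such as "ShortText" and a nested mapping such as {"type": {"type": "categorical", ...}}. One way a client might bring both into a single shape before sending or inspecting a schema is sketched below. normalize_field is a hypothetical helper, and treating type names case-insensitively is an assumption, not documented HeavenBase behaviour.

```python
def normalize_field(spec):
    """Return a canonical {"type": ..., **options} dict for one field spec."""
    if isinstance(spec, str):
        # Shorthand such as "ShortText": a bare type name with no options.
        return {"type": spec.lower()}
    type_spec = spec["type"]
    if isinstance(type_spec, str):
        # Mapping whose "type" is still a bare name.
        return {"type": type_spec.lower()}
    # Nested form: {"type": {"type": "categorical", "values": [...]}}
    options = {key: value for key, value in type_spec.items() if key != "type"}
    return {"type": type_spec["type"].lower(), **options}


fields = {
    "name": "ShortText",
    "status": {"type": {"type": "categorical", "values": ["draft", "active"]}},
    "created_at": {"type": {"type": "timestamp", "unit": "ms"}},
}
normalized = {name: normalize_field(spec) for name, spec in fields.items()}
```

A single canonical shape makes it easier to diff or validate schemas produced by different clients.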
