Database Integration - HeavenBase

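HeavenBase's SQL-family backends now use the native heavenbase.utils.db.Database facade to manage the connection lifecycle, schema inspection, table changes, comments, statistics, sampling, and execution results. The old tool stack is kept only as reference material and is no longer a dependency of the main package.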

Supported providers

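Database reads provider configuration from src/heavenbase/resources/configs/default.yaml; the bootstrap configuration is used only before the configuration repository is initialized. Current presets include SQLite, DuckDB, PostgreSQL, MySQL-compatible engines, SQL Server, Oracle, Trino, and StarRocks over the MySQL wire protocol. sqlalchemy>=2.0 is a core HeavenBase dependency. Non-SQLite providers still require the matching driver to be installed, for example psycopg2, pymysql, pymssql, oracledb, or trino.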

Execution behavior

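Configuration bootstrap first reads the file-based configuration, then opens the configuration repository through Database.
Engines are shared per resolved database spec, and their disposed state is tracked explicitly. Two Database instances with the same provider, database, and pool parameters use the same SQLAlchemy engine and pool.
The first engine access performs a best-effort autocreate. If superuser or database-creation credentials are unavailable, HeavenBase still attempts a direct connection so that an existing database remains usable.
Explicit deletion uses db.drop_database(force=True). SQLite and DuckDB delete the file; PostgreSQL, MySQL, MSSQL, and StarRocks drop the database; Oracle drops the target user/schema but not the PDB service. Trino catalogs are managed by the connector, so catalogs are never created or dropped automatically.
readonly has explicit semantics. The default is readonly=False; passing readonly=True rejects any statement that cannot be confirmed to be read-only. Conservative automatic inference, which decides between commit and rollback, is enabled only when readonly=None is passed.
Raw SQL has its placeholders normalized before execution. Supported styles are :name, ?, %s, %(name)s, $name, and $1; parameters can be a dict, a positional tuple/list, a list-of-dicts batch, or, where the driver supports it, a list-of-tuples batch. A bare ? is rewritten only when positional parameters are passed, so the PostgreSQL JSON operators payload ? 'key', ?|, and ?& remain valid; PostgreSQL casts such as ?::int are also preserved after placeholder rewriting.
safe=True converts execution errors into SQLResponse(ok=False). Inside an active transaction, a safe failure marks the transaction as failed, and the transaction is rolled back when the context exits unless the caller explicitly calls rollback() and continues. execute_many(..., safe=True) stops at the first failing statement and rolls back by default; autocommit=True explicitly opts in to best-effort, per-statement commits.
SQL healing remains future work and is tracked separately. The current API returns or raises the original database error and does not invoke an LLM repair path.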
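
The rule that a bare ? is rewritten only when positional parameters are passed can be illustrated with a small standalone sketch. This is not HeavenBase's implementation: the helper name rewrite_qmarks and the :p0, :p1 target naming are assumptions made for illustration, and only the ? style is handled.

```python
import re

# One pass over the SQL text, matching in priority order:
#   1. a single-quoted string literal ('' is the escaped quote),
#   2. the PostgreSQL JSON operators ?| and ?&,
#   3. a bare ? placeholder.
_TOKEN = re.compile(r"'(?:[^']|'')*'|\?[|&]|\?")


def rewrite_qmarks(sql, params=None):
    """Rewrite bare ? placeholders to :p0, :p1, ... (hypothetical helper).

    The rewrite only happens when positional parameters (tuple or list)
    are supplied; otherwise the SQL is returned untouched, so JSON
    operators such as payload ? 'key' stay valid.
    """
    if not isinstance(params, (tuple, list)):
        return sql, params

    counter = 0

    def replace(match):
        nonlocal counter
        token = match.group(0)
        if token != "?":
            return token  # string literal or ?| / ?& operator: keep as-is
        name = f":p{counter}"
        counter += 1
        return name

    new_sql = _TOKEN.sub(replace, sql)
    if counter != len(params):
        raise ValueError(
            f"expected {counter} positional parameters, got {len(params)}"
        )
    return new_sql, {f"p{i}": value for i, value in enumerate(params)}


sql, bound = rewrite_qmarks(
    "select * from t where a = ? and b = ?::int", (1, "2")
)
# sql   -> "select * from t where a = :p0 and b = :p1::int"
# bound -> {"p0": 1, "p1": "2"}
```

Note that this sketch cannot distinguish a JSON ? operator from a placeholder once positional parameters are passed; it only guarantees that statements without positional parameters are left alone and that ?|, ?&, string literals, and trailing ::type casts survive the rewrite.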

Tool interface

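Database provides schema listing; table, view, and column inspection; table creation and alteration; comments where the dialect supports them; exact quantiles; string-length statistics; deterministic sampling that does not assume an id column; and SQLResponse export helpers: dict/list, pandas, NumPy, PyArrow, and a compact table display.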

import heavenbase as hb

db = hb.Database(provider="sqlite", database="workspace.db")
db.execute("create table if not exists items (id text primary key, name text)")
db.execute("insert into items (id, name) values (:id, :name)", {"id": "a", "name": "alpha"})
db.execute("insert into items (id, name) values (?, ?)", ("b", "beta"))

db.tables()
db.columns("items")
db.n_rows("items")

rows = db.execute("select id, name from items", readonly=True)
rows.to_dicts(columns=["id"])
rows.table_display()
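
One way to get deterministic sampling without relying on an id column is to order rows by a stable hash of their full content. The sketch below uses only the standard library and sqlite3; it is an illustration of the idea, not the technique HeavenBase uses, and deterministic_sample and its seed parameter are hypothetical names.

```python
import hashlib
import sqlite3


def deterministic_sample(conn, table, n, seed=0):
    """Return n rows chosen by a stable hash of each row's content.

    Hypothetical helper: the table name is assumed to be trusted, and
    the whole table is loaded into memory, so this only suits small tables.
    """
    rows = conn.execute(f'select * from "{table}"').fetchall()

    def sort_key(row):
        digest = hashlib.sha256(repr((seed, row)).encode("utf-8")).hexdigest()
        return (digest, repr(row))  # repr(row) breaks hash ties stably

    return sorted(rows, key=sort_key)[:n]


conn = sqlite3.connect(":memory:")
conn.execute("create table items (code text, name text)")
conn.executemany(
    "insert into items values (?, ?)",
    [("a", "alpha"), ("b", "beta"), ("c", "gamma"), ("d", "delta")],
)
sample = deterministic_sample(conn, "items", 2)
```

Because the ordering depends only on row content and the seed, the same rows are returned regardless of insertion order or physical storage order.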

Use readonly=None only when you actually want HeavenBase to infer whether a statement is read-only. Unknown statements are treated as writable.

db.execute("with input(id) as (select 'c') select id from input", readonly=True)
db.execute("select count(*) from items", readonly=None)
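
To make "conservative inference" concrete, here is a minimal sketch of a classifier that treats a statement as read-only only when it is confident, and as writable otherwise. The function name looks_readonly and the keyword lists are assumptions for illustration; HeavenBase's actual inference rules may differ.

```python
import re

# String literals and comments are blanked out first so that keywords
# inside them cannot influence the decision.
_NOISE = re.compile(r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.S)

# Any of these words anywhere in the statement makes it "not provably read-only".
_WRITE_WORDS = re.compile(
    r"\b(insert|update|delete|merge|create|alter|drop|truncate|replace|"
    r"grant|revoke|call|copy|into)\b",
    re.I,
)

_READ_STARTS = {"select", "with", "values", "explain", "show"}


def looks_readonly(sql):
    """Return True only if the statement is confidently read-only."""
    cleaned = _NOISE.sub(" ", sql)
    words = cleaned.split()
    if not words or words[0].lower() not in _READ_STARTS:
        return False  # unknown statements are treated as writable
    return _WRITE_WORDS.search(cleaned) is None


# A caller could use the result to choose between commit and rollback.
action = "rollback" if looks_readonly("select count(*) from items") else "commit"
```

The sketch errs on the side of "writable": a data-modifying CTE, SELECT ... INTO, or SELECT ... FOR UPDATE is classified as not read-only, at the cost of occasionally committing after a harmless query.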

Database lifecycle helpers are explicit:

db = hb.Database(provider="postgres", database="workspace_demo")
db.execute("select 1")  # first engine access may autocreate the database
db.drop_database(force=True)

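The only supported Database API is the canonical naming model. Use tables(), columns(), pks(), fks(), n_rows(), sample(), create_table(), add_table_col(), clear_table(), and drop_table(); legacy compatibility aliases such as db_tabs, tab_cols, row_sample, and create_tab are not retained. Where the wrapper is deliberately kept smaller than SQLAlchemy, you can use the public engine or the ORM execution path directly: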

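from sqlalchemy import select

engine = db.engine  # the public SQLAlchemy engine behind this Database
# my_table is assumed to be a sqlalchemy.Table defined elsewhere
db.orm_execute(select(my_table.c.id), readonly=True)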

SQLResponse validates projected columns by default. Pass check=False only in lenient export or display code where missing columns should render as None.

rows.to_dicts(columns=["missing"], check=False)
hb.table_display(rows, columns=["name"])
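
The difference between strict and lenient projection can be sketched in plain Python. The helper below is a stand-in, not SQLResponse itself: project_rows is a hypothetical name, and raising KeyError for a missing column is an assumption about what strict validation might look like.

```python
def project_rows(header, rows, columns=None, check=True):
    """Project tuple rows onto the requested columns (hypothetical helper).

    With check=True, an unknown column raises KeyError.
    With check=False, an unknown column is rendered as None.
    """
    wanted = list(header) if columns is None else list(columns)
    index = {name: position for position, name in enumerate(header)}
    missing = [name for name in wanted if name not in index]
    if missing and check:
        raise KeyError(f"unknown columns: {missing}")
    return [
        {name: (row[index[name]] if name in index else None) for name in wanted}
        for row in rows
    ]


header = ["id", "name"]
data = [("a", "alpha"), ("b", "beta")]

strict = project_rows(header, data, columns=["id"])
# strict  -> [{"id": "a"}, {"id": "b"}]
lenient = project_rows(header, data, columns=["id", "missing"], check=False)
# lenient -> [{"id": "a", "missing": None}, {"id": "b", "missing": None}]
```

Keeping validation on by default surfaces typos in column names early; the lenient mode suits display code that renders a fixed set of columns over heterogeneous results.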
