43-EFT.WP.Data.DatasetCards v1.0
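I. Purpose and Scope
This chapter provides the validation API and its normative implementation bindings. It defines interface prototypes, request/response schemas, error codes, authentication and idempotency, rate limiting, and version negotiation. It ensures consistency with data contracts, metrology checks, reference anchors, and export manifests, and it supports pre-release blocking checks and portal automation. Key names use snake_case; cross-volume references take the form "volume name + version + anchor".
II. Terms and Dependencies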
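- Term source: follows EFT.WP.Core.Terms v1.0; this chapter only adds definitions for API-related terms and fields.
- Dependency volumes:
  - Data contracts, export, and anchor conventions: EFT.WP.Core.DataSpec v1.0.
  - Metrology, dimensions, and uncertainty: EFT.WP.Core.Metrology v1.0.
  - Path dependence and equivalent expressions for arrival time: EFT.WP.Core.Equations v1.1.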
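III. Interface Overview (Normative)
services:
  datasetcards.v1:
    - POST /api/v1/validate_card        # structural and consistency validation (blocking)
    - POST /api/v1/lint_card            # lint rules and wording guardrails
    - POST /api/v1/check_units          # dimensional/unit consistency (Metrology)
    - POST /api/v1/verify_references    # reference anchor format and reachability
    - POST /api/v1/split_freeze_check   # split freeze and ratio consistency
    - POST /api/v1/leakage_audit        # cross-split leakage check
    - POST /api/v1/coverage_report      # coverage and interval statistics
    - POST /api/v1/uncertainty_merge    # uncertainty combination (GUM/Bayes/MC)
    - POST /api/v1/quality_eval         # baseline evaluation and significance
    - POST /api/v1/hash_artifact        # artifact hash computation
    - POST /api/v1/sign_artifact        # artifact signing / signature verification
    - POST /api/v1/publish_release      # release workflow orchestration
    - POST /api/v1/revoke_release       # revocation/correction workflow
(The interfaces above are consistent with the Chapter 15 Schema/Lint rules and the Chapter 11–14 workflows.)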
IV. Common Request/Response and Authentication
request_envelope:
  headers:
    x-eift-key: "<token>"           # HMAC or OIDC Bearer
    x-eift-idempotency: "<uuid>"    # idempotency key (retained for at least 24 h)
    content-type: "application/json"
  body:
    card: { ... }                   # dataset_card object (YAML/JSON)
    artifacts?: [{path, bytes_b64?, sha256?}]
response_envelope:
  status: "ok" | "error" | "warn"
  errors: [{code, message, path?, see?}]
  warnings: [{code, message, path?, see?}]
  metrics: { ... }                  # statistics / usage / coverage
  version: "datasetcards.v1"
security:
  auth: "OIDC bearer | HMAC"
  tls: "TLS1.2+"
  scope: ["validate","publish","admin"]
rate_limits:
  per_key_per_minute: 120
  burst: 60
(Authentication and the idempotency key support replay protection and security auditing; the publish endpoints require the publish scope.)
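The rate_limits block above can be read as a token bucket. The following is a minimal sketch under that assumption: burst is taken as the bucket capacity and per_key_per_minute as the refill rate, neither of which the spec states explicitly. TokenBucket and allow_request are hypothetical names, not part of the normative API.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional

PER_KEY_PER_MINUTE = 120  # rate_limits.per_key_per_minute
BURST = 60                # rate_limits.burst (assumed here to be the bucket capacity)


@dataclass
class TokenBucket:
    """Hypothetical per-key limiter; illustrative only."""
    capacity: float = BURST
    refill_per_sec: float = PER_KEY_PER_MINUTE / 60.0
    tokens: float = BURST
    last: float = 0.0

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


_buckets: Dict[str, TokenBucket] = {}


def allow_request(key: str, now: Optional[float] = None) -> bool:
    """Return True if the caller identified by `key` may proceed."""
    now = time.monotonic() if now is None else now
    bucket = _buckets.setdefault(key, TokenBucket(last=now))
    return bucket.allow(now)
```

Under these assumptions a fresh key can issue 60 requests back to back and then continues at two requests per second.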
V. Normative OpenAPI Excerpt
openapi: 3.0.3
info: {title: "EFT DatasetCards API", version: "v1"}
paths:
  /api/v1/validate_card:
    post:
      summary: Validate dataset card against schema & cross-volume constraints
      requestBody: {required: true, content: {"application/json": {schema: {$ref: "#/components/schemas/CardEnvelope"}}}}
      responses:
        "200": {description: "Result", content: {"application/json": {schema: {$ref: "#/components/schemas/Result"}}}}
  /api/v1/check_units:
    post:
      summary: Dimensional/unit consistency check (Core.Metrology v1.0)
      responses: {"200": {description: "Result", content: {"application/json": {schema: {$ref: "#/components/schemas/Result"}}}}}
components:
  schemas:
    CardEnvelope:
      type: object
      required: [card]
      properties:
        card: {}
    Result:
      type: object
      properties:
        status: {type: string, enum: [ok, warn, error]}
        errors: {type: array, items: {$ref: "#/components/schemas/Issue"}}
        warnings: {type: array, items: {$ref: "#/components/schemas/Issue"}}
        metrics: {type: object}
    Issue:
      type: object
      properties:
        code: {type: string}
        message: {type: string}
        path: {type: string}
        see: {type: array, items: {type: string}}
(Metrology and reference entries in see[] use the form "volume name vX.Y:anchor".)
VI. Interface Semantics and Rules (Excerpt)
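- /validate_card (blocking)
  - Checks: required structure, types/regex, the reference format of export_manifest.references[], and the minimum metrology set.
  - On failure, returns status="error" together with the list of primary blocking issues.
- /lint_card
  - Runs the Chapter 15 rules: ratio sums, leakage policy, symbol conflicts, no Chinese characters in mathematical expressions, and so on.
- /check_units
  - Based on Core.Metrology v1.0: units="SI", check_dim=true, and unit consistency for vectors/tensors.
- /verify_references
  - Validates against "^[^:]+ v\\d+\\.\\d+:[A-Z].+$"; optionally checks anchor reachability (offline index / directory service).
- /split_freeze_check
  - Checks that splits.*.ratio sums to 1±1e-6, that indices are frozen, and that the splits are consistent with sampling.rates.
- /leakage_audit
  - Deduplicates at the object/time-window level; cross-split leakage is blocking.
- /coverage_report
  - Generates a stratified coverage table with 95% intervals (Bootstrap-BCa by default).
- /uncertainty_merge
  - Combines via GUM / linearization / Monte Carlo; units are normalized before combining.
- /quality_eval
  - Baseline evaluation, significance tests, and robustness items (on the fixed frozen splits).
- /hash_artifact and /sign_artifact
  - Compute sha256 and generate/verify signatures; cross-checked against export_manifest.artifacts[].
- /publish_release and /revoke_release
  - Per Chapter 14: semantic versioning, stable line, revocation tombstones, and announcements.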
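Three of the rules above are simple enough to sketch directly: the reference format pattern, the split ratio sum, and cross-split leakage at the object level. This is an illustrative sketch, not the normative implementation. The function names and the split_ids layout (split name mapped to a collection of object IDs) are assumptions; the regex and the 1e-6 tolerance are taken from the rules as stated.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Set

# Pattern from /verify_references, with the JSON-style double backslashes unescaped.
REFERENCE_RE = re.compile(r"^[^:]+ v\d+\.\d+:[A-Z].+$")


def reference_format_ok(ref: str) -> bool:
    """True if `ref` has the form 'volume name vX.Y:anchor' per the stated pattern."""
    return REFERENCE_RE.match(ref) is not None


def ratios_sum_ok(splits: Dict[str, dict], tol: float = 1e-6) -> bool:
    """True if splits.*.ratio sums to 1 within the stated tolerance."""
    total = sum(split["ratio"] for split in splits.values())
    return abs(total - 1.0) <= tol


def leaked_ids(split_ids: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the object IDs that appear in more than one split (assumed layout)."""
    counts: Counter = Counter()
    for ids in split_ids.values():
        counts.update(set(ids))  # set() so duplicates inside one split do not count
    return {obj_id for obj_id, n in counts.items() if n > 1}
```

Under the pattern as written, an anchor must begin with an uppercase letter, so this sketch accepts "EFT.WP.Core.DataSpec v1.0:EXPORT" but rejects "EFT.WP.Core.Metrology v1.0:check_dim". Whether the pattern or the anchor spelling is the intended one is left open here.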
VII. Error Codes and Diagnostics (Normative)
errors:
  - {code: "ESCHEMA001", message: "missing required field", path: "$.title", see: ["EFT.WP.Core.DataSpec v1.0:EXPORT"]}
  - {code: "EREF001", message: "invalid reference format", path: "$.export_manifest.references[2]", see: ["Citation v0.1:P/S/M/I-*"]}
  - {code: "EMETR001", message: "units must be SI and check_dim=true", path: "$.metrology", see: ["Core.Metrology v1.0:check_dim"]}
  - {code: "ESPLIT001", message: "ratios must sum to 1±1e-6", path: "$.splits", see: ["Ch.11 Splits"]}
  - {code: "ELEAK000", message: "cross-split leakage detected", path: "$.splits.policy", see: ["Ch.11 Splits"]}
  - {code: "EPATH020", message: "path-dependent T_arr requires delta_form/path/measure", path: "$.path_dependence", see: ["Core.Equations v1.1:S20-1"]}
  - {code: "ESIGN001", message: "artifact signature mismatch", path: "$.export_manifest.artifacts[0]"}
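To show how these entries fit the response envelope, here is a small sketch that builds Issue objects and derives the status field. The helper names make_issue and make_result are hypothetical, and the precedence (any error gives "error", otherwise any warning gives "warn", otherwise "ok") is an assumption consistent with the blocking semantics, not a rule quoted from the spec.

```python
from typing import Iterable, Optional


def make_issue(code: str, message: str,
               path: Optional[str] = None,
               see: Optional[Iterable[str]] = None) -> dict:
    """Build an Issue; optional fields are omitted when not provided."""
    issue = {"code": code, "message": message}
    if path is not None:
        issue["path"] = path
    if see:
        issue["see"] = list(see)
    return issue


def make_result(errors: Iterable[dict] = (),
                warnings: Iterable[dict] = (),
                metrics: Optional[dict] = None) -> dict:
    """Assemble a response envelope; status precedence is an assumption."""
    errors, warnings = list(errors), list(warnings)
    status = "error" if errors else ("warn" if warnings else "ok")
    return {
        "status": status,
        "errors": errors,
        "warnings": warnings,
        "metrics": metrics or {},
        "version": "datasetcards.v1",
    }
```

For example, make_result(errors=[make_issue("ESCHEMA001", "missing required field", "$.title", ["EFT.WP.Core.DataSpec v1.0:EXPORT"])]) yields an envelope whose status is "error".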
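VIII. Idempotency, Version Negotiation, and Compatibility
idempotency:
  header: "x-eift-idempotency"
  window_hours: 24
versioning:
  api: "datasetcards.v1"    # breaking changes bump MAJOR
  minor: "backward-compatible additions"
compatibility:
  request_backward: "minor+patch"
  response_fields: "new fields are append-only; none are removed"
(Consistent with "semantic versioning and the stable line" in Chapter 14.)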
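One way to honor the 24-hour idempotency window is to cache the first response per key and replay it for repeats inside the window. The sketch below assumes an in-memory store and a handler callable; IdempotencyStore and its run method are hypothetical names, and a production service would need persistent, concurrency-safe storage.

```python
import time
from typing import Callable, Dict, Optional, Tuple

WINDOW_SECONDS = 24 * 3600  # idempotency.window_hours = 24


class IdempotencyStore:
    """Hypothetical in-memory replay cache keyed by x-eift-idempotency."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def run(self, key: str, handler: Callable[[], dict],
            now: Optional[float] = None) -> dict:
        now = time.time() if now is None else now
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < WINDOW_SECONDS:
            return hit[1]  # replay the stored response; the handler is not re-run
        response = handler()
        self._entries[key] = (now, response)
        return response
```

A repeated publish_release call with the same key inside the window would then return the original response instead of triggering a second release.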
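IX. Security, Audit, and Compliance
- Authentication: OIDC/HMAC; transport: TLS1.2+; least privilege: granted at scope granularity.
- Audit: record request_id, idempotency_key, caller, timestamp, and digest; logs are included in the compliance module and the export artifacts.
- Compliance: regional restrictions and data-subject rights tie into Chapter 13; revocation/correction ties into Chapter 14.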
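The audit fields listed above could be captured in a record like the following. This is a sketch only: the spec does not say what the digest covers, so it is assumed here to be the SHA-256 of the raw request body, and the function name audit_record and the "sha256:" prefix are illustrative.

```python
import hashlib
from datetime import datetime, timezone
from typing import Optional


def audit_record(request_id: str, idempotency_key: str, caller: str,
                 body: bytes, now: Optional[datetime] = None) -> dict:
    """Build one audit log entry; the digest is assumed to be SHA-256 of the body."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "request_id": request_id,
        "idempotency_key": idempotency_key,
        "caller": caller,
        "timestamp": timestamp.isoformat(),
        "digest": "sha256:" + hashlib.sha256(body).hexdigest(),
    }
```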
X. Machine-Readable Implementation Snippets (ref. Ixx-?)
# I16-1
from __future__ import annotations  # lets the `str|bytes` annotations load on Python < 3.10

def validate_card(card: dict) -> dict: ...
def lint_card(card: dict, rules: dict) -> dict: ...
def check_units(card: dict) -> dict: ...  # uses Core.Metrology v1.0:check_dim
def verify_references(card: dict) -> dict: ...  # regex + anchor reachability
def split_freeze_check(card: dict) -> dict: ...
def leakage_audit(card: dict) -> dict: ...
def coverage_report(card: dict) -> dict: ...
def uncertainty_merge(card: dict, mode: str="GUM") -> dict: ...
def quality_eval(card: dict, seeds=(0,1,2,3,4)) -> dict: ...
def hash_artifact(path: str|bytes) -> dict: ...
def sign_artifact(path: str|bytes, key_id: str) -> dict: ...
def publish_release(card: dict, channel="stable") -> dict: ...
def revoke_release(tag: str, reason: str) -> dict: ...
(All functions return the unified structure {"ok": bool, "errors": [...], "warnings": [...], "metrics": {...}}.)
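As an illustration of how one of these stubs might be filled in, the sketch below implements the GUM mode of uncertainty combination for uncorrelated components, u_c = sqrt(sum((c_i * u_i)^2)), and refuses to combine values whose units differ. The input layout (a list of dicts with u, unit, and an optional sensitivity coefficient c) and the error code EXAMPLE_UNIT_MISMATCH are hypothetical; the real stub takes a card, whose uncertainty fields are not specified in this chapter.

```python
import math
from typing import List


def merge_uncertainty_gum(components: List[dict]) -> dict:
    """Combine uncorrelated standard uncertainties in quadrature (GUM).

    Each component is assumed to look like {"u": float, "unit": str, "c": float}
    where "c" (sensitivity coefficient) defaults to 1.0.
    """
    units = {comp["unit"] for comp in components}
    if len(units) > 1:
        # Units must be normalized before combining; the code name is hypothetical.
        return {
            "ok": False,
            "errors": [{"code": "EXAMPLE_UNIT_MISMATCH",
                        "message": "normalize units before combining: %s" % sorted(units)}],
            "warnings": [],
            "metrics": {},
        }
    combined = math.sqrt(sum((comp.get("c", 1.0) * comp["u"]) ** 2 for comp in components))
    unit = next(iter(units)) if units else None
    return {"ok": True, "errors": [], "warnings": [],
            "metrics": {"u_combined": combined, "unit": unit}}
```

Correlated inputs would need the covariance terms of the full GUM propagation formula, which this sketch omits.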
XI. Example Call (Ready to Use)
curl -X POST https://api.eift.org/api/v1/validate_card \
-H "Authorization: Bearer <token>" \
-H "x-eift-idempotency: 7b7a0b1e-0a21-4f3f-9d0b-3b1e9b1f3c22" \
-H "Content-Type: application/json" \
-d '{"card": { "dataset_id":"eift.obs.demo", "version":"v1.0", "metrology":{"units":"SI","c_ref":299792458,"check_dim":true}, "export_manifest":{"version":"v1.0","artifacts":[],"references":["EFT.WP.Core.DataSpec v1.0:EXPORT","EFT.WP.Core.Metrology v1.0:check_dim"]}}}'
(The reference anchor format matches the export manifest and carries volume name + version + anchor.)
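For clients that prefer Python, the same request can be assembled with the standard library. This sketch only builds the request object and does not send it; build_validate_request is a hypothetical helper name, and the headers mirror the curl call above.

```python
import json
import urllib.request
import uuid


def build_validate_request(card: dict, token: str,
                           base_url: str = "https://api.eift.org") -> urllib.request.Request:
    """Build (but do not send) a POST to /api/v1/validate_card."""
    body = json.dumps({"card": card}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/api/v1/validate_card",
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "x-eift-idempotency": str(uuid.uuid4()),  # fresh key per logical request
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it; a retry of the same logical request should reuse the idempotency key rather than generate a new one.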
XII. Coupling with the Export Manifest (Normative)
export_manifest:
  artifacts:
    - {path: "api/openapi.yaml", sha256: "..."}
    - {path: "api/clients/python.tar.gz", sha256: "..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"
(The API description and client artifacts must be verifiable and carry reference anchors.)
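"Verifiable" can be checked mechanically by recomputing each artifact's SHA-256 and comparing it with the manifest entry. The sketch below assumes the artifacts are files under a local root directory; sha256_file and verify_manifest are hypothetical helper names.

```python
import hashlib
from pathlib import Path
from typing import List, Union


def sha256_file(path: Union[str, Path], chunk_size: int = 65536) -> str:
    """Hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(export_manifest: dict, root: Union[str, Path] = ".") -> List[str]:
    """Return the artifact paths that are missing or whose sha256 does not match."""
    mismatches = []
    for artifact in export_manifest.get("artifacts", []):
        file_path = Path(root) / artifact["path"]
        if not file_path.is_file() or sha256_file(file_path) != artifact["sha256"]:
            mismatches.append(artifact["path"])
    return mismatches
```

An empty return value means every listed artifact was found and matched its recorded hash.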
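XIII. Chapter Compliance Self-Check
- All blocking interfaces (validate_card, check_units, verify_references, split_freeze_check, leakage_audit) are implemented and return the unified structure; authentication, idempotency, and rate limiting are enabled.
- Reference anchors use "volume name vX.Y:anchor" and appear in the export manifest's references[]; short codes and unversioned references are forbidden.
- Metrology checks follow Core.Metrology v1.0; units="SI" and check_dim=true are mandatory; where T_arr is involved, delta_form/path/measure are registered and pass the consistency check.
- Publish/revoke is consistent with the Chapter 14 versioning policy; API artifacts (OpenAPI/SDK) are listed in the export manifest and are verifiable.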
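Copyright and License (CC BY 4.0)
Copyright notice: Unless otherwise stated, copyright in Energy Filament Theory (《能量丝理论》), including text, charts, illustrations, symbols, and formulas, is held by the author (Mr. "屠广林").
License: This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). With attribution to the author and source, copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes.
Attribution format (suggested): Author: "屠广林"; Work: Energy Filament Theory; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/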