Document Index - Technical White Paper 06 - EFT.WP.Core.DataSpec v1.0

Appendix A. Schema of the Schema Registry


I. Goals and Scope


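II. Object Model Overview

  1. SchemaRegistryRecord (SRef for short)
    • Semantics: one publishable registration record for a data schema.
    • Relations: an SRef references a number of FieldSpec, ConstraintSpec, IndexSpec, GovernanceSpec, and PrivacySpec entries.
  2. Unified reference: SRef.id serves as the cross-volume reference key; register_schema(...) -> SRef.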

III. SRef Top-Level Fields (Required Items and Constraints)


IV. FieldSpec (Field Dictionary)

  1. name : str (required)
  2. type : str (required)
    Allowed set: {"int32","int64","float32","float64","decimal(p,s)","bool","string","bytes","timestamp(UTC)","date","struct","list<T>","map<K,V>","categorical","geometry"}.
  3. unit : str|None (optional)
    SI or derived unit text, such as "m", "s", "m/s".
  4. dim : str|None (optional)
    Dimension mark, such as "L", "T", "L T^-1", "1".
  5. nullable : bool (required)
  6. default : any|None (optional)
  7. pii_level : str (required)
    Allowed set: {"none","low","moderate","high"}.
  8. desc : str (required)
  9. aliases : list[str]|None (optional)
  10. enum : list[any]|None (optional)
  11. tags : list[str]|None (optional)
  12. quality_weight : float|None (optional, in [0,1])
  13. Constraint rules:
    • If unit is present, dim must also be given and must agree with check_dim(expr).
    • timestamp(UTC) fields must declare their time zone as UTC.
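
A minimal sketch of how the FieldSpec rules above might be checked on a single field entry. The function name `validate_field` and the regular expressions used for the parameterized types (decimal(p,s), list<T>, map<K,V>) are assumptions made for illustration; the agreement between dim and check_dim(expr) and the UTC declaration rule are not covered here.

```python
import re

SIMPLE_TYPES = {"int32", "int64", "float32", "float64", "bool", "string",
                "bytes", "timestamp(UTC)", "date", "struct", "categorical",
                "geometry"}
# ASSUMPTION: concrete spellings accepted for the parameterized types.
PARAM_TYPES = [re.compile(p) for p in (
    r"decimal\(\d+,\s*\d+\)", r"list<.+>", r"map<.+,.+>")]
PII_LEVELS = {"none", "low", "moderate", "high"}
REQUIRED = ("name", "type", "nullable", "pii_level", "desc")


def validate_field(spec: dict) -> list[str]:
    """Return a list of rule violations for one FieldSpec entry."""
    errors = []
    for key in REQUIRED:
        if spec.get(key) is None:
            errors.append(f"missing required key: {key}")

    ftype = spec.get("type")
    if ftype is not None:
        known = isinstance(ftype, str) and (
            ftype in SIMPLE_TYPES
            or any(p.fullmatch(ftype) for p in PARAM_TYPES))
        if not known:
            errors.append(f"type not in allowed set: {ftype!r}")

    pii = spec.get("pii_level")
    if pii is not None and pii not in PII_LEVELS:
        errors.append(f"pii_level not in allowed set: {pii!r}")

    weight = spec.get("quality_weight")
    if weight is not None and not (0.0 <= weight <= 1.0):
        errors.append("quality_weight must lie in [0,1]")

    # Rule 13: a unit without a dimension mark is rejected.
    if spec.get("unit") is not None and spec.get("dim") is None:
        errors.append("unit given without dim")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation of an entry at once.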

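V. IndexSpec (Secondary Indexes)


VI. ConstraintSpec (Contract Templates)


VII. GovernanceSpec (Governance and Release)


VIII. PrivacySpec (Privacy Classification and Policy)


IX. ProvenanceSpec (Provenance and Fingerprints)


X. QualityGateSpec (Quality Gates)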


XI. Unit and Dimension Mapping Rules

  1. If equations involve T_arr, both forms must be declared and validated:
    • T_arr_const = ( 1 / c_ref_value ) * ( ∫_gamma n_eff d ell ).
    • T_arr_integrand = ( ∫_gamma ( n_eff / c_ref_value ) d ell ).
  2. dim(n_eff) = 1, dim(c_ref) = L T^-1, and dim( ( ∫_gamma · d ell ) ) = L; therefore dim(T_arr_*) = T.
  3. delta_form = | T_arr_const - T_arr_integrand |, with unit "s".
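
The dimensional argument in item 2 can be reproduced mechanically. The sketch below assumes that dimension marks follow the FieldSpec notation ("L", "T", "L T^-1", "1") and represents each mark as a mapping from base symbol to integer exponent; the helper names `parse_dim`, `mul`, and `inv` are hypothetical and are not the document's check_dim.

```python
def parse_dim(text: str) -> dict[str, int]:
    """Parse a mark such as "L T^-1" or "1" into base-symbol exponents."""
    exps: dict[str, int] = {}
    for token in text.split():
        if token == "1":  # dimensionless marker contributes nothing
            continue
        base, _, power = token.partition("^")
        exps[base] = exps.get(base, 0) + (int(power) if power else 1)
    return {base: exp for base, exp in exps.items() if exp != 0}


def mul(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Dimension of a product: exponents add."""
    out = dict(a)
    for base, exp in b.items():
        out[base] = out.get(base, 0) + exp
    return {base: exp for base, exp in out.items() if exp != 0}


def inv(a: dict[str, int]) -> dict[str, int]:
    """Dimension of a reciprocal: exponents change sign."""
    return {base: -exp for base, exp in a.items()}


# dim(T_arr) = dim(n_eff) * dim(d ell) / dim(c_ref)
dim_n_eff = parse_dim("1")
dim_ell = parse_dim("L")
dim_c_ref = parse_dim("L T^-1")
dim_T_arr = mul(mul(dim_n_eff, dim_ell), inv(dim_c_ref))
```

Because c_ref_value enters both forms as a single division, the same exponent bookkeeping applies to T_arr_const and T_arr_integrand alike.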

XII. Registration Example (YAML, condensed but usable)

name: DS.TARR.PathIntegral
version: "1.0"
title: Arrival-time along path integrals
description: Arrival time T_arr computed along gamma(ell) with dual-form check.
fields:
  - { name: pid, type: string, unit: null, dim: null, nullable: false, pii_level: "none", desc: "path id" }
  - { name: seg_id, type: int32, unit: null, dim: null, nullable: false, pii_level: "none", desc: "segment id" }
  - { name: ts, type: timestamp(UTC), unit: "s", dim: "T", nullable: false, pii_level: "none", desc: "UTC time" }
  - { name: CRS, type: string, unit: null, dim: null, nullable: false, pii_level: "none", desc: "coord ref sys" }
  - { name: ell_start, type: float64, unit: "m", dim: "L", nullable: false, pii_level: "none", desc: "path coord start" }
  - { name: ell_end, type: float64, unit: "m", dim: "L", nullable: false, pii_level: "none", desc: "path coord end" }
  - { name: n_eff_mean, type: float64, unit: "1", dim: "1", nullable: false, pii_level: "none", desc: "mean effective index" }
  - { name: c_ref_ref, type: string, unit: null, dim: null, nullable: false, pii_level: "none", desc: "parameter ref" }
  - { name: c_ref_value, type: float64, unit: "m/s", dim: "L T^-1", nullable: false, pii_level: "none", desc: "resolved c_ref" }
  - { name: T_arr_const, type: float64, unit: "s", dim: "T", nullable: false, pii_level: "none", desc: "const-pulled form" }
  - { name: T_arr_integrand, type: float64, unit: "s", dim: "T", nullable: false, pii_level: "none", desc: "general integrand form" }
  - { name: delta_form, type: float64, unit: "s", dim: "T", nullable: false, pii_level: "none", desc: "dual-form gap" }
  - { name: q_score, type: float64, unit: "1", dim: "1", nullable: false, pii_level: "none", desc: "quality score" }
  - { name: hash_sha256, type: string, unit: null, dim: null, nullable: false, pii_level: "none", desc: "checksum" }
  - { name: signature, type: string, unit: null, dim: null, nullable: true, pii_level: "none", desc: "signature" }
pk: ["pid","seg_id"]
idx:
  - { keys: ["ts"], kind: "btree", unique: false, desc: "time scan" }
  - { keys: ["pid","seg_id"], kind: "btree", unique: true, desc: "segment lookup" }
constraints:
  - { kind: "unique", expr: "unique(pid,seg_id)", severity: "ERROR", message: "pk must be unique" }
  - { kind: "monotonic", expr: "ell_end >= ell_start", severity: "ERROR", message: "ell non-decreasing" }
  - { kind: "dim_check", expr: "check_dim(T_arr_const)=='T'", severity: "ERROR", message: "dim(T_arr_const)=T" }
  - { kind: "dim_check", expr: "check_dim(T_arr_integrand)=='T'", severity: "ERROR", message: "dim(T_arr_integrand)=T" }
  - { kind: "arrivaltime_dualform", expr: "delta_form <= tol_Tarr", params: { tol_Tarr: "1e-9 s" }, severity: "WARN", message: "dual form mismatch" }
equations: ["S610-1","S610-2"]
parameters: ["c_ref_ref","n_eff_model_ref"]
governance:
  owner: "team.eft-data"
  steward: "user:alice"
  retention_days: 3650
  sla: { freshness_max: "P1D", availability_target: "99.9%" }
  release: { freeze_policy: "immutable", signing_key: "key://k1" }
privacy:
  classification: { pid: "none", seg_id: "none", ts: "none", CRS: "none" }
  anonymization: { }
  masking: { }
  exceptions: [ ]
provenance:
  trace: ["sensor.S1","method.integrate_path","artifact.T_arr_v1.parquet"]
  checksum: { algo: "sha256", field: "hash_sha256" }
  signature: { keyref: "key://k1", field: "signature" }
quality_gates:
  q_score_min: 0.80
  delta_form_max: "1e-9 s"
  completeness_min: 0.98
  drift_method: "KL"
  drift_max: 0.02
see: ["Core.Equations §S610","Core.Parameters §P3x","Core.Metrology §Mx-?","Core.Errors §I50"]
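
To make the first two constraints of the example concrete, the sketch below evaluates unique(pid,seg_id) and ell_end >= ell_start over a batch of rows held as plain dictionaries. The function name `check_rows` and the row representation are assumptions for illustration; how the registry actually evaluates expr strings is not specified in this appendix.

```python
def check_rows(rows: list[dict]) -> list[tuple]:
    """Return (severity, message, pk) for each violated constraint."""
    violations = []
    seen = set()
    for row in rows:
        pk = (row["pid"], row["seg_id"])
        # kind: "unique", expr: "unique(pid,seg_id)"
        if pk in seen:
            violations.append(("ERROR", "pk must be unique", pk))
        seen.add(pk)
        # kind: "monotonic", expr: "ell_end >= ell_start"
        if not row["ell_end"] >= row["ell_start"]:
            violations.append(("ERROR", "ell non-decreasing", pk))
    return violations


sample = [
    {"pid": "p1", "seg_id": 0, "ell_start": 0.0, "ell_end": 10.0},
    {"pid": "p1", "seg_id": 1, "ell_start": 10.0, "ell_end": 5.0},   # reversed segment
    {"pid": "p1", "seg_id": 0, "ell_start": 0.0, "ell_end": 10.0},   # duplicate key
]
report = check_rows(sample)
```

Severity and message are copied from the constraint entries so that a report line can be traced back to the contract that produced it.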


XIII. Registration and Export (I60 Interface)


XIV. Validation Essentials (Pre-release Checklist)


XV. Common Errors and Remedies (linked with Core.Errors)


XVI. Dedicated Constraints for the Two Arrival-Time Forms

  1. Definitions:
    • T_arr_const = ( 1 / c_ref_value ) * ( ∫_gamma n_eff d ell ).
    • T_arr_integrand = ( ∫_gamma ( n_eff / c_ref_value ) d ell ).
    • delta_form = | T_arr_const - T_arr_integrand |.
  2. Contract:
    • kind="arrivaltime_dualform", expr="delta_form <= tol_Tarr", params={"tol_Tarr":"<time>"}.
    • Release gate: delta_form_max is written into quality_gates and enforced in assert_contract.
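
The definitions and the contract above can be exercised numerically. The sketch below assumes n_eff is sampled at discrete path coordinates ell and integrates with the trapezoidal rule; the helper names (`parse_time`, `trapezoid`, `dual_form`, `check_dualform`) and the sample values are hypothetical, and `check_dualform` is a stand-in rather than the document's assert_contract, whose signature is not given here.

```python
def parse_time(text: str) -> float:
    """Parse a tolerance such as "1e-9 s" into seconds."""
    value, unit = text.split()
    if unit != "s":
        raise ValueError(f"only seconds are handled in this sketch: {unit!r}")
    return float(value)


def trapezoid(y: list[float], x: list[float]) -> float:
    """Trapezoidal estimate of the integral of y over x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def dual_form(ell, n_eff, c_ref_value):
    """Return (T_arr_const, T_arr_integrand, delta_form) in seconds."""
    t_const = (1.0 / c_ref_value) * trapezoid(n_eff, ell)
    t_integrand = trapezoid([n / c_ref_value for n in n_eff], ell)
    return t_const, t_integrand, abs(t_const - t_integrand)


def check_dualform(delta_form: float, params: dict) -> bool:
    """Evaluate expr "delta_form <= tol_Tarr" with the given params."""
    return delta_form <= parse_time(params["tol_Tarr"])


# Illustrative samples: path coordinate in m, dimensionless index, c_ref in m/s.
ell = [0.0, 1000.0, 2000.0, 3000.0]
n_eff = [1.0, 1.0003, 1.0002, 1.0001]
t_const, t_integrand, delta = dual_form(ell, n_eff, 299792458.0)
passed = check_dualform(delta, {"tol_Tarr": "1e-9 s"})
```

With a constant c_ref_value the two forms are algebraically identical, so delta_form here reflects only floating-point rounding; a larger gap would point to an implementation or data problem rather than a physical difference.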

XVII. Compatibility and Change-Registration Snippet (for citation in release records)


XVIII. Minimal Viable Template (YAML, placeholders)

name: DS.<DOMAIN>.<Subject>
version: "X.Y"
title: <human-readable title>
description: <what this dataset is for>
fields: [ { name: <f>, type: <t>, unit: <u|null>, dim: <d|null>, nullable: <bool>, pii_level: <level>, desc: <text> }, ... ]
pk: [ <field1>, <field2> ]
idx: [ { keys: [<f1>,<f2>], kind: "btree", unique: false } ]
constraints: [ { kind: "unique", expr: "unique(<k1>,<k2>)", severity: "ERROR", message: "pk unique" } ]
equations: [ ]
parameters: [ ]
units: { }
dims: { }
governance: { owner: "<team>", steward: "<user>", retention_days: <int>, sla: { freshness_max: "P?D", availability_target: "99.9%" }, release: { freeze_policy: "immutable" } }
privacy: { classification: { }, anonymization: { }, masking: { }, exceptions: [ ] }
provenance: { trace: [ ], checksum: { algo: "sha256", field: "hash_sha256" }, signature: { keyref: "key://...", field: "signature" } }
quality_gates: { q_score_min: 0.8, delta_form_max: "1e-9 s", completeness_min: 0.98, drift_method: "KL", drift_max: 0.02 }
see: [ ]


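XIX. Summary


Copyright and License (CC BY 4.0)

Copyright notice: Unless otherwise stated, the copyright in "Energy Filament Theory" (including text, charts, illustrations, symbols, and formulas) is held by the author (Mr. 屠广林).
License: This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes, provided the author and source are credited.
Attribution format (suggested): Author: 屠广林; Work: "Energy Filament Theory"; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/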