Contents / Documents - Technical White Paper 51 - Pipeline Card Template v1.0

Chapter 4 Inbound Data and Contracts (Source/Schema/Version)


Volume: Pipeline Card Template v1.0


I. Purpose & Scope


II. Source Definition


III. Inbound Contract / Schema

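  1. Alignment with TARR: field names, units, and dimensions match TARR (the data specification). Key path fields:
    • path.gamma_ell: array<m>; path.d_ell: array<m>; medium.n_eff_profile: array<1>;
    • ref.c_ref: m/s; (phase-type) lambda_ref: m; obs.T_arr: s, obs.Phi: rad (if present).
  2. Field specification (minimum set)
    • Identification: record_id (ULID/UUIDv4), acq.ts_start/ts_end (ISO-8601), instrument.id/mode.
    • Quality: quality.flags[], quality.score_Q (0..1), uncertainty.* (if available).
    • References and version: see[], references[], version (SemVer), checksum.
  3. Units and dimensions: every numeric field either carries a unit or has its unit stated explicitly in the contract; any expression containing a division sign, an integral, or a compound operator must be parenthesized.
  4. Missing-value policy: represent a missing value with null or by omitting the field; the text tokens NaN/Inf are forbidden; record the reason for the missing value in quality.flags.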
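The missing-value policy in item 4 can be checked mechanically. The following is a minimal sketch, not the pipeline's actual validator: the function name find_missing_policy_violations and the sample record are hypothetical, and treating non-finite floats as violations is an added assumption (the policy itself names only the text tokens).

```python
import math

# Text tokens the policy forbids, compared case-insensitively.
# The signed and "infinity" spellings are an assumption of this sketch.
FORBIDDEN_TEXT = {"nan", "inf", "+inf", "-inf", "infinity", "-infinity"}

def find_missing_policy_violations(value, path=""):
    """Return the dotted paths of values that break the missing-value policy."""
    violations = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            violations += find_missing_policy_violations(child, child_path)
    elif isinstance(value, list):
        for i, child in enumerate(value):
            violations += find_missing_policy_violations(child, f"{path}[{i}]")
    elif isinstance(value, str):
        if value.strip().lower() in FORBIDDEN_TEXT:
            violations.append(path)
    elif isinstance(value, float):
        # Assumption: a non-finite float is treated like the forbidden text.
        if not math.isfinite(value):
            violations.append(path)
    # None (null) is the permitted way to mark a missing value.
    return violations

# Hypothetical record: null is allowed, "NaN" text and inf are not.
record = {
    "obs": {"T_arr": None, "Phi": "NaN"},
    "path": {"d_ell": [1.0, float("inf")]},
    "quality": {"flags": ["T_arr_missing:sensor_dropout"]},
}
print(find_missing_policy_violations(record))
```

A check like this would run before schema validation, so that a forbidden token is reported with its path rather than as a generic type error.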

IV. Schema Evolution

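  1. Compatibility matrix
    • Backward-compatible addition (MINOR): fields are only added; existing consumers are not broken;
    • Incompatible change (MAJOR): a field's semantics, unit, or dimension changes, or a field is removed;
    • Patch (PATCH): corrections to default values or descriptions, with no change in semantics.
  2. Version labeling: both the inbound data package and the top level of schema.json carry version (SemVer); manifest.yaml is kept in sync.
  3. Migration policy: a MAJOR change must ship an explicit mapping/conversion (example: rad = deg * π/180) and a rollback plan.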
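The compatibility matrix and the migration example can be sketched in a few lines. This is an illustration under stated assumptions: versions are plain MAJOR.MINOR.PATCH strings without pre-release or build tags, and the helper names (parse_semver, classify_change, deg_to_rad, rad_to_deg) are hypothetical.

```python
import math

def parse_semver(version):
    """Split 'MAJOR.MINOR.PATCH' into three ints (no pre-release/build tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def classify_change(old, new):
    """Name the most significant SemVer component that differs."""
    old_v, new_v = parse_semver(old), parse_semver(new)
    for label, a, b in zip(("MAJOR", "MINOR", "PATCH"), old_v, new_v):
        if a != b:
            return label
    return "NONE"

def deg_to_rad(values):
    """Forward mapping for a unit change: rad = deg * pi / 180."""
    return [v * math.pi / 180 for v in values]

def rad_to_deg(values):
    """Inverse mapping, kept alongside the forward one as the rollback path."""
    return [v * 180 / math.pi for v in values]

print(classify_change("1.0.0", "1.1.0"))  # only the MINOR component differs
print(deg_to_rad([0.0, 90.0, 180.0]))
```

Shipping the inverse mapping together with the forward one is one simple way to make the rollback plan concrete for a unit change.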

V. Validation Pipeline (Mx-?)


VI. Machine-Readable Contracts
A. schema.json (excerpt)

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Inbound v1.0.0",
  "type": "object",
  "required": ["record_id", "acq", "path", "medium", "ref", "version", "see"],
  "properties": {
    "record_id": {"type": "string"},
    "acq": {
      "type": "object",
      "required": ["ts_start", "ts_end"],
      "properties": {
        "ts_start": {"type": "string", "format": "date-time"},
        "ts_end": {"type": "string", "format": "date-time"}
      }
    },
    "path": {
      "type": "object",
      "required": ["gamma_ell", "d_ell"],
      "properties": {
        "gamma_ell": {"type": "array", "items": {"type": "number"}, "minItems": 2},
        "d_ell": {"type": "array", "items": {"type": "number"}, "minItems": 2}
      }
    },
    "medium": {
      "type": "object",
      "required": ["n_eff_profile"],
      "properties": {
        "n_eff_profile": {"type": "array", "items": {"type": "number"}, "minItems": 2}
      }
    },
    "ref": {"type": "object", "required": ["c_ref"], "properties": {"c_ref": {"type": "number"}}},
    "see": {"type": "array", "items": {"type": "string"}, "minItems": 1},
    "version": {"type": "string"}
  }
}
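In practice a full JSON Schema validator would normally enforce this excerpt. As a dependency-free illustration, the sketch below hand-checks only the constraints visible in the excerpt (required fields, numeric arrays with at least 2 items, a non-empty see list). The function name check_record and the sample record are hypothetical, and the sketch does not cover format "date-time" or the string types.

```python
REQUIRED_TOP = ["record_id", "acq", "path", "medium", "ref", "version", "see"]
REQUIRED_NESTED = {
    "acq": ["ts_start", "ts_end"],
    "path": ["gamma_ell", "d_ell"],
    "medium": ["n_eff_profile"],
    "ref": ["c_ref"],
}
NUMERIC_ARRAYS = [("path", "gamma_ell"), ("path", "d_ell"), ("medium", "n_eff_profile")]

def is_number(x):
    # JSON Schema "number" excludes booleans, which Python treats as ints.
    return isinstance(x, (int, float)) and not isinstance(x, bool)

def check_record(record):
    """Return a list of human-readable violations; an empty list means pass."""
    errors = []
    for key in REQUIRED_TOP:
        if key not in record:
            errors.append(f"missing required field: {key}")
    for parent, children in REQUIRED_NESTED.items():
        if parent not in record:
            continue  # already reported above
        obj = record[parent]
        if not isinstance(obj, dict):
            errors.append(f"{parent}: expected object")
            continue
        for child in children:
            if child not in obj:
                errors.append(f"missing required field: {parent}.{child}")
    for parent, child in NUMERIC_ARRAYS:
        obj = record.get(parent)
        if not isinstance(obj, dict) or child not in obj:
            continue
        arr = obj[child]
        if not isinstance(arr, list) or not all(is_number(x) for x in arr):
            errors.append(f"{parent}.{child}: expected array of numbers")
        elif len(arr) < 2:
            errors.append(f"{parent}.{child}: fewer than 2 items")
    if "see" in record:
        see = record["see"]
        if not isinstance(see, list) or len(see) < 1:
            errors.append("see: expected non-empty array")
    return errors

# Hypothetical record with two defects: d_ell is too short and see is absent.
bad = {
    "record_id": "r-001",
    "acq": {"ts_start": "2025-09-24T16:00:00Z", "ts_end": "2025-09-24T16:00:01Z"},
    "path": {"gamma_ell": [0.0, 1.0], "d_ell": [1.0]},
    "medium": {"n_eff_profile": [1.0, 1.0]},
    "ref": {"c_ref": 299792458.0},
    "version": "1.0.0",
}
print(check_record(bad))
```

Returning a list of violations instead of raising on the first one lets a validation report show every defect in a record at once.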

B. contract.yaml (inbound contract description)

version: "1.0.0"
source:
  id: "SRC-telemetry-rt"
  mode: "streaming"  # file|streaming|near-rt|api
schema: "schemas/inbound/schema.json"
units:
  c_ref: "m/s"
  T_arr: "s"
  Phi: "rad"
path:
  required: true
  delta_form: "general"
quality_gates: ["G1","G2","G3","G4","G5","G6","G7","G8"]
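The units declared in contract.yaml can be cross-checked against the units listed for the key fields in Section III. This is a minimal sketch: the contract's units block is assumed to be already loaded into a dict, and the names EXPECTED_UNITS and unit_mismatches are hypothetical.

```python
# Units stated for the key fields in Section III of this chapter.
EXPECTED_UNITS = {"c_ref": "m/s", "lambda_ref": "m", "T_arr": "s", "Phi": "rad"}

def unit_mismatches(declared):
    """Map each field whose declared unit differs to (declared, expected)."""
    mismatches = {}
    for field, unit in declared.items():
        expected = EXPECTED_UNITS.get(field)
        if expected is not None and unit != expected:
            mismatches[field] = (unit, expected)
    return mismatches

# The units block from contract.yaml above, as a plain dict.
contract_units = {"c_ref": "m/s", "T_arr": "s", "Phi": "rad"}
print(unit_mismatches(contract_units))
```

Fields that the table does not know about are ignored here; a stricter variant could flag them instead.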


C. manifest.yaml (inbound artifact manifest)

dataset_id: "ptn-ingest-202509"
version: "1.0.0"
created_at: "2025-09-24T16:00:00Z"
producer: "pipeline.ingest"
see:
  - "EFT.WP.Core.Equations v1.1:S20-1"
  - "EFT.WP.Core.Metrology v1.0:check_dim"
checksum: { algo: "sha256", value: "<64-hex>" }
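The checksum entry can be verified with the standard library. The manifest does not say which bytes the digest covers, so this sketch assumes it is computed over the raw bytes of an artifact; the function names sha256_hex and checksum_matches are hypothetical.

```python
import hashlib
import hmac
import re

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 lowercase hex characters."""
    return hashlib.sha256(data).hexdigest()

def checksum_matches(data: bytes, checksum: dict) -> bool:
    """Compare data against a manifest-style {algo, value} checksum entry."""
    if checksum.get("algo") != "sha256":
        raise ValueError("this sketch only handles sha256")
    value = str(checksum.get("value", "")).lower()
    # An unfilled placeholder such as "<64-hex>" fails this shape check.
    if not re.fullmatch(r"[0-9a-f]{64}", value):
        return False
    return hmac.compare_digest(sha256_hex(data), value)

payload = b"example artifact bytes"  # stand-in for the real artifact content
entry = {"algo": "sha256", "value": sha256_hex(payload)}
print(checksum_matches(payload, entry))
```

For large artifacts, the same digest can be built incrementally by calling update() on a hashlib.sha256() object for each chunk read from disk.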


VII. Anti-Patterns & Fixes


VIII. Validation & Alerts


IX. Release & Layout

PTN_EXPORT/
  inbound/
    contract.yaml
    schema.json
  data/
    *.parquet
  reports/
    check_dim_report.json
    validate_report.json
    audit.jsonl
  manifest.yaml
  SIGNATURE.asc


X. Cross-References


XI. Checklist


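Copyright and License (CC BY 4.0)

Copyright notice: Unless otherwise stated, the copyright in "Energy Filament Theory" (《能量丝理论》), including its text, charts, illustrations, symbols, and formulas, belongs to the author (Mr. "屠广林").
License: This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Provided the author and source are credited, it may be copied, reposted, excerpted, adapted, and redistributed for commercial or non-commercial purposes.
Attribution format (suggested): Author: "屠广林"; Work: "Energy Filament Theory" (《能量丝理论》); Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/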