Appendix C. Dataset / Model / Pipeline Cards


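I. Summary and Scope
This appendix defines a unified schema for three card types (DatasetCard, ModelCard, and PipelineCard), covering required fields, the rules for carrying units and dimensions (Unit/Dim), reference anchors (see:), hashing and versioning, and the validation and release workflow. Any field that involves time of arrival (ToA) must record both forms side by side and explicitly declare the path gamma(ell) and the measure d ell: T_arr^A = ( 1 / c_ref ) * ( ∫ n_eff d ell ) and T_arr^B = ( ∫ ( n_eff / c_ref ) d ell ). The card must also record delta_form.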
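The two ToA forms can be evaluated numerically as in the following minimal sketch. It assumes the path gamma(ell) is sampled at discrete arc-length points and that c_ref is a constant scalar. The function names, the trapezoidal rule, and the sample values are illustrative choices and are not part of the schema.

```python
def trapezoid(ys, xs):
    """Composite trapezoidal rule for samples ys taken at points xs."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def toa_forms(ell, n_eff, c_ref):
    """Return (T_arr^A, T_arr^B) for samples of n_eff along gamma(ell)."""
    # Form A: integrate n_eff over d ell, then divide by c_ref.
    t_a = trapezoid(n_eff, ell) / c_ref
    # Form B: divide by c_ref inside the integrand, then integrate.
    t_b = trapezoid([n / c_ref for n in n_eff], ell)
    return t_a, t_b


# Hypothetical path: three samples over 2 m, c_ref in m/s.
ell = [0.0, 1.0, 2.0]
n_eff = [1.0, 1.2, 1.0]
t_a, t_b = toa_forms(ell, n_eff, c_ref=299_792_458.0)
print(t_a, t_b, abs(t_a - t_b))
```

With a constant c_ref the two forms coincide up to floating-point rounding, so the printed difference is a quick sanity check on the integration code itself.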

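II. Dependencies and References
Unified symbols and dimensions: Appendix A and Chapter 2 (P12-); data and pipeline release: Chapter 14 (I75-/M75-); simulation and benchmarks: Chapter 12 (M70-); inference and falsification: Chapter 13 (M72-); API bindings: Chapter 15 (I80-); channels/spectra/transport: Chapters 3–8 (S20-/S30-/S40-/S50-/S52-); GRB/FRB: Chapters 10–11 (M62-/M64-).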

III. Unified Schema and Naming (I75- baseline)


IV. DatasetCard: Schema and Example


Table C-1. DatasetCard fields (★ = required)

| Section | Field | Description | Example / Unit |
|---|---|---|---|
| ★meta | dataset_id, version, instrument, band, time_span | Basic metadata | "CR_ARRAY_V3", "3.2.0", "CTA", "0.1–10 TeV", "2021-2024" |
| ★spec | columns[{name, unit, dim, description, see}] | Column definitions (with Unit/Dim) | Phi(E): m^-2·s^-1·sr^-1·eV^-1 |
| | sampling, calibration | Sampling and calibration conventions | "live_time_weighted" |
| ★quality | systematics, covariance, masks | Systematics, covariance, and masks | masks: dominant_band |
| integrals | path, measure | ToA path and measure | "gamma(ell)", "d ell" |
| ToA | T_arr^A, T_arr^B, delta_form | Dual-form ToA | s, s, "A" |
| ★hash | data_hash, card_hash | Integrity | sha256:* |
| ★see | anchors | Physics/method anchors | S50-*, S52-* |
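The card_hash entry has to be computed over a card that itself contains that entry. One possible way to handle this is sketched below. It assumes (this is not specified in the appendix) that card_hash is the SHA-256 of a canonical JSON serialization of the card with the card_hash field left out. The function name and the canonicalization settings are illustrative.

```python
import hashlib
import json


def card_hash(card):
    """Hash a card dict, excluding its own hash.card_hash field (assumed rule)."""
    body = {k: v for k, v in card.items() if k != "hash"}
    # Keep the other hash entries (e.g. data_hash) so that they are covered too.
    other = {k: v for k, v in card.get("hash", {}).items() if k != "card_hash"}
    if other:
        body["hash"] = other
    # Sorted keys and fixed separators make the bytes independent of key order.
    canonical = json.dumps(
        body, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the serialization is canonical, two cards that differ only in key order or in a stale card_hash value yield the same digest, while any change to a substantive field changes it.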


JSON example (excerpt)

{
  "meta": {
    "dataset_id": "CR_ARRAY_V3",
    "version": "3.2.0",
    "instrument": "CTA",
    "band": "0.1–10 TeV",
    "time_span": "2021-2024"
  },
  "spec": {
    "columns": [
      { "name": "E", "unit": "eV", "dim": "E", "description": "energy", "see": "S50-*" },
      {
        "name": "Phi",
        "unit": "m^-2·s^-1·sr^-1·eV^-1",
        "dim": "L^-2 T^-1 Ω^-1 E^-1",
        "description": "diff. flux",
        "see": "S50-9"
      }
    ],
    "sampling": "live_time_weighted",
    "calibration": "v2_gain_map"
  },
  "quality": { "systematics": "5% abs", "covariance": "provided", "masks": [ "dominant_band" ] },
  "integrals": { "path": "gamma(ell)", "measure": "d ell" },
  "toa": { "T_arr^A": "1.234e-3 s", "T_arr^B": "1.236e-3 s", "delta_form": "A" },
  "hash": { "data_hash": "sha256:...", "card_hash": "sha256:..." },
  "see": [ "S50-*", "S52-*" ]
}
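A structural check for a DatasetCard like the one above could look like the following sketch. The required sections follow the ★ marks in Table C-1, and the ToA rule follows Section I. Treating all five column keys as mandatory, the JSON key names "toa" and "integrals", and the function name are assumptions taken from the example rather than from a normative schema.

```python
REQUIRED_SECTIONS = ("meta", "spec", "quality", "hash", "see")
# Assumption: every column must carry all five keys listed in Table C-1.
REQUIRED_COLUMN_KEYS = ("name", "unit", "dim", "description", "see")


def validate_dataset_card(card):
    """Return a list of problems; an empty list means the card passed."""
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in card]

    for i, col in enumerate(card.get("spec", {}).get("columns", [])):
        for key in REQUIRED_COLUMN_KEYS:
            if key not in col:
                problems.append(f"spec.columns[{i}] missing {key}")

    # If ToA is present, both forms, delta_form, path, and measure are needed.
    toa = card.get("toa")
    if toa is not None:
        for key in ("T_arr^A", "T_arr^B", "delta_form"):
            if key not in toa:
                problems.append(f"toa missing {key}")
        integrals = card.get("integrals", {})
        for key in ("path", "measure"):
            if key not in integrals:
                problems.append(f"integrals missing {key}")

    return problems
```

Returning a list instead of raising on the first problem lets a release step report every defect in a card in one pass.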

V. ModelCard: Schema and Example


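Table C-2. ModelCard fields (★ = required)

| Section | Field | Description | Example / Unit |
|---|---|---|---|
| ★meta | model_id, version, family | Model family and version | "S50_spectrum_v1" |
| ★params | {name, transform, prior, bounds, unit, dim} | See Appendix B | alpha_inj, E_max, D0... |
| hyper | hierarchy | Hierarchical hyperparameters | μ_α, σ_α |
| channels | {A_rec, A_shear, A_dsa, A_turb} | Channel switches / default weights | {"A_rec":1,"A_shear":1} |
| diagnostics | evidence, IC | Evidence and information criteria | logZ, WAIC, LOO |
| ★hash | code_hash, card_hash | Traceability | sha256:* |
| ★see | anchors | Method anchors | S30-*/S40-*/S50-*/S52-* |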


JSON example (excerpt)

{
  "meta": { "model_id": "S50_spectrum_v1", "version": "1.1.0", "family": "S50" },
  "params": [
    {
      "name": "alpha_inj",
      "transform": "identity",
      "prior": "N(2.2,0.3)",
      "bounds": [ 1.0, 3.5 ],
      "unit": "1",
      "dim": "1",
      "see": "S50-*"
    },
    {
      "name": "E_max",
      "transform": "log",
      "prior": "LogU[1e9,1e21]",
      "bounds": [ 1e6, 1e22 ],
      "unit": "eV",
      "dim": "E",
      "see": "S50-7"
    }
  ],
  "channels": { "A_rec": 1, "A_shear": 1, "A_dsa": 0, "A_turb": 0 },
  "diagnostics": { "logZ": 7.11, "WAIC": 512.3, "LOO": 510.8 },
  "hash": { "code_hash": "sha256:...", "card_hash": "sha256:..." },
  "see": [ "S30-*", "S40-*", "S50-*", "S52-*" ]
}
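The params entries in a ModelCard such as the one above lend themselves to a few mechanical sanity checks, sketched here. The checks are illustrative and not taken from the appendix: ordered bounds, a positive lower bound whenever the transform is "log", and the convention (assumed from the example) that a dimensionless parameter has both unit and dim equal to "1".

```python
def check_params(params):
    """Return a list of problems found in a ModelCard 'params' list."""
    problems = []
    for p in params:
        name = p.get("name", "<unnamed>")
        lo, hi = p["bounds"]
        if not lo < hi:
            problems.append(f"{name}: bounds must satisfy lower < upper")
        # log(x) is only defined for x > 0, so the lower bound must be positive.
        if p.get("transform") == "log" and lo <= 0:
            problems.append(f"{name}: log transform needs a positive lower bound")
        # Assumed convention: dimensionless means unit == "1" and dim == "1".
        if (p.get("unit") == "1") != (p.get("dim") == "1"):
            problems.append(f"{name}: unit and dim disagree on dimensionlessness")
    return problems
```

For the two parameters in the example (alpha_inj and E_max) this sketch reports no problems.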

VI. PipelineCard: Schema and Example


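Table C-3. PipelineCard fields (★ = required)

| Section | Field | Description | Example |
|---|---|---|---|
| ★meta | pipeline_id, version | Identifier | "ASTROACC_FIT_V2" |
| ★graph | nodes[], edges[] | DAG (type ∈ {ingest, calibrate, simulate, fit, validate, export}) | See below |
| node[i] | {type, inputs, outputs, image/env, seed, resources} | Node configuration | "env: docker://..." |
| ★acceptance | thresholds | Acceptance thresholds, mapped to Chapter 12 | SpecMAE, ToAΔ ... |
| ★exports | {products/, metrics.json, masks/, delta_form.log, repro/} | Export layout | Fixed directory structure |
| provenance | {who, when, where} | Provenance | |
| ★hash | {code_hash, data_hash} | Integrity | |
| ★see | anchors | Corresponding method/data anchors | M70-*/M72-*/I75-* |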


JSON example (excerpt)

{
  "meta": { "pipeline_id": "ASTROACC_FIT_V2", "version": "2.0.0" },
  "graph": {
    "nodes": [
      {
        "id": "n1",
        "type": "ingest",
        "inputs": [ "cards/dataset_cr.json" ],
        "outputs": [ "staged/" ],
        "env": "docker://acc:1.3",
        "seed": 1729
      },
      {
        "id": "n2",
        "type": "fit",
        "inputs": [ "staged/", "cards/model_s50.json" ],
        "outputs": [ "products/posterior.zarr" ],
        "env": "docker://acc:1.3",
        "seed": 1729
      },
      {
        "id": "n3",
        "type": "validate",
        "inputs": [ "products/posterior.zarr" ],
        "outputs": [ "metrics.json" ],
        "env": "docker://acc:1.3"
      }
    ],
    "edges": [ [ "n1", "n2" ], [ "n2", "n3" ] ]
  },
  "acceptance": { "SpecMAE": "<=0.03", "IndexErr": "<=0.05", "ToAΔ": "<=1e-4 s" },
  "exports": {
    "products/": true,
    "metrics.json": true,
    "masks/": true,
    "delta_form.log": true,
    "repro/": true
  },
  "hash": { "code_hash": "sha256:...", "data_hash": "sha256:..." },
  "see": [ "M70-*", "M72-*", "I75-*" ]
}
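Since graph is required to be a DAG, a card like the one above can be checked for unknown node references and cycles before anything runs. The sketch below uses Python's standard graphlib module (Python 3.9+). The function name and the choice to report every failure as ValueError are illustrative.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(graph):
    """Return node ids in an order that respects edges, or raise ValueError."""
    ids = [node["id"] for node in graph["nodes"]]
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate node id")

    # Map each node to the set of nodes that must run before it.
    predecessors = {node_id: set() for node_id in ids}
    for src, dst in graph["edges"]:
        if src not in predecessors or dst not in predecessors:
            raise ValueError(f"edge references unknown node: {src} -> {dst}")
        predecessors[dst].add(src)

    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"graph has a cycle: {exc.args[1]}") from exc
```

For the example graph (n1 → n2 → n3) the only valid order is n1, n2, n3, which is what this function returns.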

VII. Validation and Consistency (M75-1)

VIII. Release and Audit (M75-2/M75-4/M75-5)

IX. Quality Gates and Acceptance (linked to Chapter 12)

X. Template List


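Table C-4. DatasetCard column template (excerpt)

| name | unit | dim | description | see |
|---|---|---|---|---|
| E | eV | E | energy | S50-* |
| Phi | m^-2·s^-1·sr^-1·eV^-1 | L^-2 T^-1 Ω^-1 E^-1 | differential flux | S50-9 |
| T_arr^A | s | T | ToA form A | S50-9 |
| T_arr^B | s | T | ToA form B | S52-8 |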
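The unit and dim columns of the template above can be cross-checked mechanically. The sketch below derives a dim string from a unit string and is meant to be compared with the declared dim. The base-unit mapping (m→L, s→T, sr→Ω, eV→E) is inferred from the rows of this table only, and the parsing rules (factors separated by "·", exponents written as ^n) are assumptions based on the same rows. Appendix A remains the authority on symbols and dimensions.

```python
# Mapping inferred from the template rows; extend as needed (assumption).
BASE_DIMS = {"m": "L", "s": "T", "sr": "Ω", "eV": "E"}


def unit_to_dim(unit):
    """Translate e.g. 'm^-2·s^-1' into 'L^-2 T^-1'; '1' stays '1'."""
    if unit == "1":
        return "1"
    parts = []
    for factor in unit.split("·"):
        base, _, exponent = factor.partition("^")
        if base not in BASE_DIMS:
            raise ValueError(f"unknown base unit: {base!r}")
        symbol = BASE_DIMS[base]
        parts.append(f"{symbol}^{exponent}" if exponent else symbol)
    return " ".join(parts)


def unit_matches_dim(column):
    """True if the declared dim equals the one derived from the unit."""
    return unit_to_dim(column["unit"]) == column["dim"]
```

Applied to the Phi row, the derived string is L^-2 T^-1 Ω^-1 E^-1, which matches the declared dim.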


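Table C-5. ModelCard parameter template (excerpt)

| name | transform | prior | bounds | unit | dim | see |
|---|---|---|---|---|---|---|
| alpha_inj | identity | N(2.2,0.3) | [1,3.5] | 1 | 1 | S50-* |
| E_max | log | LogU[1e9,1e21] | [1e6,1e22] | eV | E | S50-7 |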


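Table C-6. Pipeline node template (excerpt)

| type | inputs | outputs | Required | Description |
|---|---|---|---|---|
| ingest | DatasetCard | staged/ | | Validation and normalization |
| fit | staged/, ModelCard | posterior, evidence | | Inference and falsification |
| validate | products/ | metrics.json | | Acceptance thresholds |
| export | targets | bundle | | Registration and release |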
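The validate node compares metrics against the acceptance thresholds of the PipelineCard, which are written as strings such as "<=0.03" or "<=1e-4 s". A minimal sketch of that comparison follows. It assumes that metrics.json maps metric names to plain numbers in the same unit as the threshold, that a trailing unit in the threshold can be ignored, and that a missing metric counts as a failure. None of this is specified in the appendix, and the function names are illustrative.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "<": operator.lt, ">": operator.gt}


def parse_threshold(text):
    """Split e.g. '<=1e-4 s' into (comparison function, limit as float)."""
    text = text.strip()
    for symbol in ("<=", ">=", "<", ">"):  # try two-character operators first
        if text.startswith(symbol):
            number = text[len(symbol):].split()[0]  # drop a trailing unit
            return OPS[symbol], float(number)
    raise ValueError(f"unsupported threshold: {text!r}")


def check_acceptance(acceptance, metrics):
    """Return {metric name: passed}; a missing metric counts as failed."""
    results = {}
    for name, rule in acceptance.items():
        compare, limit = parse_threshold(rule)
        results[name] = name in metrics and compare(metrics[name], limit)
    return results


# Hypothetical metrics; the thresholds are those of the PipelineCard example.
acceptance = {"SpecMAE": "<=0.03", "IndexErr": "<=0.05", "ToAΔ": "<=1e-4 s"}
metrics = {"SpecMAE": 0.02, "IndexErr": 0.07}
print(check_acceptance(acceptance, metrics))
```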

XI. Interface Binding (I80-*)

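XII. Summary
Taking I75-/M75- as its baseline, this appendix gives the unified schema, fields, and examples for DatasetCard, ModelCard, and PipelineCard. It specifies Unit/Dim, the dual-form ToA, hashing and audit, acceptance thresholds, and the release workflow, so that data, models, and pipelines are verifiable, reproducible, and auditable across the whole chain.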