Technical White Paper 43: EFT.WP.Data.DatasetCards v1.0

Chapter 12: Quality Assessment and Baselines


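I. Purpose and Scope

This chapter fixes a unified definition of the quality gates (pass criteria), the coverage metrics, and the baseline tasks and metrics; it defines the evaluation protocol, the statistical significance test, and the requirements for reproducible experiments; and it stays consistent with the chapters on splits, labels, ontology, metrology, and uncertainty. All key names use snake_case; cross-volume references take the form "volume name + version + anchor"; mathematical expressions are written with backticks and parentheses, and Chinese characters are prohibited.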

II. Terms and Dependencies


III. Fields and Structure (Normative)

quality:
  gates: # quality gates (all must pass before release)
    - {name: "label_consistency", threshold: 0.98, metric: "kappa"}
    - {name: "leakage", threshold: 0.0, metric: "leakage_rate"}
    - {name: "coverage_min", threshold: 0.99, metric: "split_coverage"}
    - {name: "checksum_integrity", threshold: 1.0, metric: "sha256_ok_ratio"}
  coverage: # coverage and distribution monitoring
    samples: 0 # replace with the actual sample count at release
    per_class: {} # {"FRB": 520, "RFI": 2100, ...}
    per_region: {} # dimensions such as spatial region, station, or channel
    ci_method: "bootstrap-bca" # confidence interval method
    target_ci: 0.95
  baseline:
    tasks: # baseline task list (classification / retrieval / regression / detection ...)
      - {name: "cls_frb_vs_rfi", type: "classification", split: "test"}
    metrics: # metrics and definitions
      - {name: "accuracy"}
      - {name: "f1_macro"}
      - {name: "roc_auc"}
      - {name: "pr_auc"}
      - {name: "ece"} # Expected Calibration Error
      - {name: "brier"}
      - {name: "rmse"} # regression / time-series tasks
      - {name: "map"} # detection / retrieval tasks
  eval_protocol: # evaluation protocol
    splits: "frozen" # frozen splits are mandatory
    seeds: [0, 1, 2, 3, 4]
    repeats: 5
    ci: {method: "bootstrap-bca", level: 0.95}
    significance: {test: "permutation", alpha: 0.05}
    fairness: {by: ["class", "region"], gap_metric: "abs_diff"}
    robustness: {shift_tests: ["snr_drop", "time_jitter", "spec_notch"]}
  reports: # outputs and traceability
    tables: ["quality/summary.csv", "quality/per_class.csv"]
    plots: ["quality/roc.png", "quality/pr.png", "quality/calibration.png"]
  see:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"

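(Consistent with the frozen splits of Chapter 11, the labels and ontology of Chapter 8, and the metrology and uncertainty of Chapters 9 and 10.)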
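
The gates listed above can be checked mechanically before release. What follows is a minimal sketch, not a reference implementation. It assumes that `leakage_rate` is a lower-is-better metric (pass when the measured value is at most the threshold) and that the other three metrics are higher-is-better (pass when the measured value is at least the threshold); the field definitions fix only the thresholds, not the comparison direction. The function name `evaluate_gates` and the `measured` dictionary are hypothetical.

```python
# Gate definitions copied from the quality.gates block above.
GATES = [
    {"name": "label_consistency", "threshold": 0.98, "metric": "kappa"},
    {"name": "leakage", "threshold": 0.0, "metric": "leakage_rate"},
    {"name": "coverage_min", "threshold": 0.99, "metric": "split_coverage"},
    {"name": "checksum_integrity", "threshold": 1.0, "metric": "sha256_ok_ratio"},
]

# Assumption: metrics for which a smaller value is better.
LOWER_IS_BETTER = {"leakage_rate"}


def evaluate_gates(gates, measured):
    """Return (all_passed, failures) for the measured metric values."""
    failures = []
    for gate in gates:
        metric = gate["metric"]
        if metric not in measured:
            # A gate whose metric was never measured cannot pass.
            failures.append((gate["name"], "metric missing"))
            continue
        value = measured[metric]
        if metric in LOWER_IS_BETTER:
            ok = value <= gate["threshold"]
        else:
            ok = value >= gate["threshold"]
        if not ok:
            failures.append((gate["name"], f"{metric}={value}"))
    return (not failures, failures)
```

Because release requires every gate to pass, a missing measurement is treated as a failure rather than skipped.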


IV. Quality Gate Definitions (Gates)


V. Coverage and Distribution Monitoring


VI. Baseline Tasks and Metric Definitions
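
Of the metrics listed in Section III, `ece` (Expected Calibration Error) is the one most often computed inconsistently between teams. The sketch below shows one common formulation, assuming equal-width confidence bins and a default of 10 bins; neither choice is stated in this chapter, and the function name `expected_calibration_error` is hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE.

    confidences: predicted-class probabilities in [0, 1]
    correct:     1 if the prediction was right, else 0
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have equal length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")
    # Per-bin accumulators: count, sum of confidence, sum of correctness.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0.0] * n_bins
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += hit
    ece = 0.0
    for count, conf_sum, hit_sum in zip(counts, conf_sums, hit_sums):
        if count:
            # |accuracy - mean confidence|, weighted by the bin's share of samples.
            ece += (count / n) * abs(hit_sum / count - conf_sum / count)
    return ece
```

Whatever binning is chosen, it should be recorded alongside the reported value so that results from different runs remain comparable.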


VII. Evaluation Protocol (Eval Protocol)
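
The `significance` entry in Section III names a permutation test at `alpha: 0.05`. The sketch below illustrates one way to run such a test, assuming a two-sided, two-sample test on the difference in mean score between two systems, estimated by Monte Carlo shuffling. The chapter does not say which unit is permuted or which statistic is used, so both are assumptions, and the function name `permutation_test` is hypothetical.

```python
import random


def permutation_test(a, b, n_permutations=10000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the Monte Carlo p-value with the usual +1 correction,
    so the result is never exactly zero.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        # Small tolerance so floating-point ties count as "at least as extreme".
        if diff >= observed - 1e-12:
            extreme += 1
    return (1 + extreme) / (1 + n_permutations)


ALPHA = 0.05  # from eval_protocol.significance.alpha
```

A difference would be reported as significant when the returned p-value is below `ALPHA`. Fixing the shuffle seed keeps the p-value itself reproducible.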


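VIII. Coupling with Uncertainty and Metrology

When reporting the statistical uncertainty of model outputs (resampling/bootstrap) together with the metrological uncertainty (from Chapter 10), first normalize both in units and dimensions, then combine them in the report. For metrics on path-dependent quantities such as T_arr, register delta_form, path="gamma(ell)", and measure="d ell", and pass check_dim.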
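
The "normalize first, then combine" order can be sketched as follows. This is an illustration under stated assumptions: the quantity is time-valued, the conversion table `TO_SECONDS` is hypothetical, and the two contributions are combined in quadrature (root-sum-square), which presumes they are independent. The chapter does not name a combination rule, and this sketch does not stand in for `check_dim`.

```python
import math

# Hypothetical conversion table to a common base unit (seconds).
TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}


def to_seconds(value, unit):
    """Convert a time-valued quantity to seconds; reject unknown units."""
    if unit not in TO_SECONDS:
        raise ValueError(f"unsupported unit: {unit!r}")
    return value * TO_SECONDS[unit]


def combine_uncertainties(u_stat, stat_unit, u_metro, metro_unit):
    """Normalize both terms to seconds, then combine in quadrature.

    Quadrature assumes the two contributions are independent.
    """
    u_stat_s = to_seconds(u_stat, stat_unit)
    u_metro_s = to_seconds(u_metro, metro_unit)
    return math.hypot(u_stat_s, u_metro_s)
```

For example, a statistical term of 3 ms and a metrological term of 4000 us are both converted to seconds before they are combined, rather than being added as raw numbers.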

IX. Reports and Traceability


X. Machine-Readable Snippet (Can Be Embedded Directly in a Card)

quality:
  gates:
    - {name: "label_consistency", metric: "kappa", threshold: 0.98}
    - {name: "leakage", metric: "leakage_rate", threshold: 0.0}
    - {name: "coverage_min", metric: "split_coverage", threshold: 0.99}
  coverage:
    samples: 15000
    per_class: {"FRB": 520, "RFI": 2100, "Noise": 12380}
    ci_method: "bootstrap-bca"
    target_ci: 0.95
  baseline:
    tasks:
      - {name: "cls_frb_vs_rfi", type: "classification", split: "test"}
    metrics: [{name: "f1_macro"}, {name: "roc_auc"}, {name: "ece"}, {name: "brier"}]
  eval_protocol:
    splits: "frozen"
    seeds: [0, 1, 2, 3, 4]
    repeats: 5
    ci: {method: "bootstrap-bca", level: 0.95}
    significance: {test: "permutation", alpha: 0.05}
    robustness: {shift_tests: ["snr_drop", "time_jitter", "spec_notch"]}
  reports:
    tables: ["quality/summary.csv", "quality/per_class.csv"]
    plots: ["quality/roc.png", "quality/pr.png", "quality/calibration.png"]
  see:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"

(Cross-check against export_manifest.artifacts[] and references[] in the export manifest.)
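
A card author can sanity-check the `coverage` block before publishing. The sketch below uses the sample values from the snippet above and assumes the classes are mutually exclusive and exhaustive, so that the `per_class` counts must add up to `samples`; the chapter does not state this constraint explicitly. The function name `check_coverage` is hypothetical.

```python
# Values copied from the coverage block of the snippet above.
coverage = {
    "samples": 15000,
    "per_class": {"FRB": 520, "RFI": 2100, "Noise": 12380},
}


def check_coverage(cov):
    """Return per-class shares, raising if counts do not add up to `samples`."""
    total = sum(cov["per_class"].values())
    if total != cov["samples"]:
        raise ValueError(f"per_class sums to {total}, expected {cov['samples']}")
    return {name: count / total for name, count in cov["per_class"].items()}


shares = check_coverage(coverage)
```

The resulting shares make the class imbalance explicit, which matters when reading the macro-averaged metrics in the baseline.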


XI. Coupling with the Export Manifest (Normative)

export_manifest:
  artifacts:
    - {path: "quality/summary.csv", sha256: "..."}
    - {path: "quality/per_class.csv", sha256: "..."}
    - {path: "quality/roc.png", sha256: "..."}
    - {path: "quality/calibration.png", sha256: "..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"

(Artifacts must be verifiable and must carry reference anchors; short codes and aliases are prohibited.)
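
The requirement that artifacts be verifiable can be illustrated with a small cross-check between the report paths and the manifest. This is a sketch under assumptions: the helper names `sha256_of` and `check_manifest` are hypothetical, the `sha256` values are assumed to be lowercase hex digests, and the artifact paths are assumed to be relative to a release root directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(report_paths, artifacts, root="."):
    """Return (missing, mismatched) lists of artifact paths.

    missing:    report paths with no entry in `artifacts`
    mismatched: listed artifacts whose file is absent or whose hash differs
    """
    listed = {entry["path"]: entry["sha256"] for entry in artifacts}
    missing = [p for p in report_paths if p not in listed]
    mismatched = []
    for path, expected in listed.items():
        file_path = Path(root) / path
        if not file_path.is_file() or sha256_of(file_path) != expected:
            mismatched.append(path)
    return missing, mismatched
```

Here `report_paths` would be the concatenation of `reports.tables` and `reports.plots`; a release passes this check only when both returned lists are empty.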


XII. Chapter Compliance Self-Check


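Copyright and License (CC BY 4.0)

Copyright notice: Unless otherwise stated, the copyright in "Energy Filament Theory" (including text, charts, illustrations, symbols, and formulas) belongs to the author (Mr. "屠广林").
License: This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes, provided that the author and source are credited.
Suggested attribution format: Author: "屠广林"; Work: "Energy Filament Theory"; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/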