44-EFT.WP.Data.ModelCards v1.0
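I. Chapter Purpose and Scope
This chapter fixes the normative definitions and reporting conventions for the fairness, ethics, and usage sections of the model card. It covers fairness axes and gap thresholds, ethical disclosures, intended and prohibited uses, regional and compliance restrictions, and online monitoring and remediation workflows. These must stay consistent with "Evaluation Protocol and Metrics", "Training Data and Sampling Binding", "Preprocessing and Feature Engineering", and the privacy/compliance modules of the dataset card.
II. Terms and Dependencies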
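- Dependent volumes: data contract/export, EFT.WP.Core.DataSpec v1.0; metrology and dimensional checks, EFT.WP.Core.Metrology v1.0; dataset facts and coverage/splits, EFT.WP.Data.DatasetCards v1.0 (privacy, ethics, and compliance).
- Math and symbols: the fairness gap is measured with gap_metric (such as abs_diff or ratio); inline symbols are written in backticks (e.g., `Δ_gap`); expressions containing division, integrals, or compound operators must be parenthesized; formulas, symbols, and definitions must not contain Chinese text.
III. Fields and Structure (Normative)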
fairness:
  axes: ["class","region","device","language"]   # evaluation axes
  gap_metric: "abs_diff|ratio"                    # gap metric
  threshold: 0.05                                 # fairness threshold
  stratification: ["val","test"]                  # stratified splits
  mitigation:
    enabled: true
    methods: ["reweight","resample","calibration","post-hoc-threshold"]
    reeval_required: true
  reporting:
    include_ci: true                              # metrics reported with 95% CI
    table_axes: ["axis","bucket","metric"]
    significance: {test: "bootstrap", alpha: 0.05}
ethics:
  intended_use: ["academic","benchmark","safety-research"]   # permitted uses
  prohibited_use: ["surveillance","biometric_identification","unlawful_discrimination"]
  disclosures:
    sensitive_attributes: ["N/A"]                 # if involved, list how they are handled
    human_in_the_loop: true
    risk_notes: "Model outputs must be reviewed in high-stakes contexts."
  governance:
    review_process: ["internal-ethics-board"]
    update_policy: "on-drift|on-incident|quarterly"
usage:
  regional_compliance: ["EU-GDPR","US-CCPA"]      # regional/regulatory restrictions
  access_control:
    roles: ["owner","maintainer","reader"]
    enforcement: ["signed-url","token","ip-allowlist"]
  rate_limits:
    qps_max: 1000
    burst: 200
  monitoring:
    online_checks: ["fairness_gap","drift_kl","error_rate"]
    alert_rules:
      - {name: "fairness_gap_breach", rule: "Δ_gap>0.05 for 60m", severity: "high"}
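IV. Fairness Evaluation and Thresholds
- Axes and buckets: axes declares the evaluation dimensions (class/region/device/language, etc.) and must match the stratification mapping of the dataset card's coverage; every bucket reports its sample count and 95% confidence interval.
- Gap metrics:
  - Absolute difference: Δ_gap = abs( metric_ref - metric_grp );
  - Ratio: Δ_gap = ( metric_grp / metric_ref );
  A gap exceeding threshold is treated as a blocking item, or else requires documented mitigation and re-evaluation results.
- Mitigation and re-evaluation: record mitigation.methods (reweighting/resampling/calibration/post-hoc thresholding) and reeval_required=true; re-evaluation follows the Chapter 11 protocol (seeds/repeats/ci/significance).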
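The gap check described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the normative implementation: the function names (fairness_gap, breaches, check_axis) and the sample accuracies are hypothetical, and the text does not say how threshold applies to the ratio metric, so the sketch assumes a ratio is judged by its distance from 1.0.

```python
from typing import Dict


def fairness_gap(metric_ref: float, metric_grp: float,
                 gap_metric: str = "abs_diff") -> float:
    """Return Δ_gap for one bucket against the reference bucket."""
    if gap_metric == "abs_diff":
        return abs(metric_ref - metric_grp)
    if gap_metric == "ratio":
        if metric_ref == 0:
            raise ValueError("metric_ref must be non-zero for the ratio metric")
        return metric_grp / metric_ref
    raise ValueError(f"unknown gap_metric: {gap_metric!r}")


def breaches(gap: float, threshold: float, gap_metric: str = "abs_diff") -> bool:
    """Assumption (not in the spec): a ratio is compared via its distance from 1.0."""
    deviation = abs(gap - 1.0) if gap_metric == "ratio" else gap
    return deviation > threshold


def check_axis(bucket_metrics: Dict[str, float], ref_bucket: str,
               threshold: float = 0.05,
               gap_metric: str = "abs_diff") -> Dict[str, bool]:
    """Map each non-reference bucket to True when it breaches the threshold."""
    ref = bucket_metrics[ref_bucket]
    return {
        bucket: breaches(fairness_gap(ref, value, gap_metric), threshold, gap_metric)
        for bucket, value in bucket_metrics.items()
        if bucket != ref_bucket
    }


# Illustrative numbers only; they are not results from any evaluation.
region_accuracy = {"eu": 0.91, "us": 0.90, "apac": 0.83}
flags = check_axis(region_accuracy, ref_bucket="eu")
```

A bucket flagged True here would be the point at which the blocking or mitigation path applies.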
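V. Ethical Disclosure and Human-in-the-Loop
- Use statement: intended_use/prohibited_use define the boundary of use; high-risk scenarios must set human_in_the_loop=true and record the human review step.
- Sensitive attributes: where sensitive attributes are involved, declare how they are collected, processed, and de-identified, along with any restrictions; sensitive attributes for which consent has not been obtained must not be inferred or used in decision-making.
- Governance: governance.review_process records the internal ethics review; the update policy update_policy and its triggering events must be stated explicitly.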
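One way to make the use boundary enforceable is a small gate evaluated per request. The sketch below is hypothetical: EthicsPolicy, authorize, and the reviewer argument are illustrative names, and treating intended_use as an allow-list is an assumption rather than something the field definitions state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class EthicsPolicy:
    intended_use: List[str]
    prohibited_use: List[str]
    human_in_the_loop: bool = True


def authorize(policy: EthicsPolicy, declared_use: str,
              high_stakes: bool, reviewer: Optional[str] = None) -> str:
    """Return "deny", "needs_review", or "allow" for one request."""
    if declared_use in policy.prohibited_use:
        return "deny"
    if declared_use not in policy.intended_use:
        return "deny"  # assumption: intended_use acts as an allow-list
    if high_stakes:
        if not policy.human_in_the_loop:
            return "deny"  # high-stakes use requires human_in_the_loop=true
        if reviewer is None:
            return "needs_review"  # no human review recorded yet
    return "allow"


policy = EthicsPolicy(
    intended_use=["academic", "benchmark", "safety-research"],
    prohibited_use=["surveillance", "biometric_identification",
                    "unlawful_discrimination"],
)
decisions = [
    authorize(policy, "surveillance", high_stakes=False),
    authorize(policy, "benchmark", high_stakes=False),
    authorize(policy, "safety-research", high_stakes=True),
    authorize(policy, "safety-research", high_stakes=True, reviewer="reviewer-01"),
]
```

Recording the reviewer identifier alongside the decision would give the audit trail a concrete entry for the human review step.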
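VI. Usage Restrictions and Online Monitoring
- Regional compliance: regional_compliance must match the dataset card; where cross-border transfer is involved, the export manifest must cite the applicable transfer mechanism or template number.
- Access and rate: access_control and rate_limits must be stated explicitly; production monitoring uses monitoring.online_checks and alert_rules, and a threshold breach must trigger degradation or blocking.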
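An alert rule such as "Δ_gap>0.05 for 60m" implies a sustained-breach check rather than a single-sample check. The following is a minimal sketch of that reading, assuming samples arrive as (minute, gap) pairs in time order; sustained_breach is a hypothetical helper, and real alerting systems may define the window differently.

```python
from typing import Iterable, Optional, Tuple


def sustained_breach(samples: Iterable[Tuple[float, float]],
                     threshold: float = 0.05,
                     window_minutes: float = 60.0) -> bool:
    """Return True once the gap has stayed above threshold for the full window.

    samples: (timestamp_in_minutes, gap) pairs, sorted by timestamp.
    """
    breach_start: Optional[float] = None
    for timestamp, gap in samples:
        if gap > threshold:
            if breach_start is None:
                breach_start = timestamp  # a new breach streak begins
            if timestamp - breach_start >= window_minutes:
                return True
        else:
            breach_start = None  # any recovery resets the streak
    return False


# Illustrative series sampled every 10 minutes.
steady = [(float(t), 0.07) for t in range(0, 70, 10)]
interrupted = [(0.0, 0.07), (10.0, 0.07), (20.0, 0.07), (30.0, 0.04),
               (40.0, 0.07), (50.0, 0.07), (60.0, 0.07)]
steady_fires = sustained_breach(steady)
interrupted_fires = sustained_breach(interrupted)
```

When the check fires, the caller would apply the degradation or blocking action required by the rule's severity.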
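VII. Metrology and Units (where performance/latency/energy are involved)
Where online fairness monitoring involves performance or energy metrics, units must be declared and must pass check_dim; cross-bucket comparisons must use consistent, comparable measures.
VIII. Machine-Readable Fragment (directly embeddable)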
fairness:
  axes: ["class","region","device"]
  gap_metric: "abs_diff"
  threshold: 0.05
  stratification: ["test"]
  mitigation: {enabled: true, methods: ["reweight","calibration"], reeval_required: true}
  reporting: {include_ci: true, table_axes: ["axis","bucket","metric"], significance: {test: "bootstrap", alpha: 0.05}}
ethics:
  intended_use: ["academic","benchmark"]
  prohibited_use: ["surveillance","biometric_identification"]
  disclosures: {sensitive_attributes: ["N/A"], human_in_the_loop: true, risk_notes: "Human review required in high-stakes."}
  governance: {review_process: ["internal-ethics-board"], update_policy: "on-drift"}
usage:
  regional_compliance: ["EU-GDPR"]
  access_control: {roles: ["owner","maintainer","reader"], enforcement: ["signed-url","token"]}
  rate_limits: {qps_max: 1000, burst: 200}
  monitoring:
    online_checks: ["fairness_gap","drift_kl"]
    alert_rules: [{name: "fairness_gap_breach", rule: "Δ_gap>0.05 for 60m", severity: "high"}]
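Before embedding a fragment like the one above, a few cross-field checks can catch inconsistencies early. The sketch below works on an already-parsed dictionary and is illustrative only: validate_card and RULE_PATTERN are hypothetical, and the rule grammar "Δ_gap>X for Nm" is inferred from the single example in this chapter rather than from a published syntax.

```python
import re
from typing import Any, Dict, List

# Assumed rule grammar, inferred from the one example in the fragment.
RULE_PATTERN = re.compile(r"Δ_gap\s*>\s*([0-9]*\.?[0-9]+)\s+for\s+(\d+)m")


def validate_card(card: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no findings."""
    problems: List[str] = []
    fairness = card["fairness"]
    ethics = card["ethics"]
    monitoring = card["usage"]["monitoring"]

    if fairness["gap_metric"] not in ("abs_diff", "ratio"):
        problems.append("gap_metric must be abs_diff or ratio")

    overlap = set(ethics["intended_use"]) & set(ethics["prohibited_use"])
    if overlap:
        problems.append(f"uses both intended and prohibited: {sorted(overlap)}")

    for rule in monitoring["alert_rules"]:
        match = RULE_PATTERN.fullmatch(rule["rule"])
        if match is None:
            problems.append(f"rule not understood: {rule['name']}")
            continue
        if float(match.group(1)) != fairness["threshold"]:
            problems.append(f"{rule['name']}: threshold differs from fairness.threshold")
        if "fairness_gap" not in monitoring["online_checks"]:
            problems.append(f"{rule['name']}: fairness_gap is not in online_checks")
    return problems


card = {
    "fairness": {"gap_metric": "abs_diff", "threshold": 0.05},
    "ethics": {
        "intended_use": ["academic", "benchmark"],
        "prohibited_use": ["surveillance", "biometric_identification"],
    },
    "usage": {
        "monitoring": {
            "online_checks": ["fairness_gap", "drift_kl"],
            "alert_rules": [{"name": "fairness_gap_breach",
                             "rule": "Δ_gap>0.05 for 60m",
                             "severity": "high"}],
        }
    },
}
findings = validate_card(card)
```

Keeping the alert threshold tied to fairness.threshold avoids the online rule drifting away from the offline acceptance criterion.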
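IX. Consistency with Evaluation Protocol, Training Data, and Deployment
- Fairness evaluation uses frozen splits and the stratification mapping (mapped to the dataset card's coverage);
- If mitigation strategies (reweighting/post-hoc thresholding) are adopted, they must be recorded in both optimization/hyperparams and evaluation, and significance tests must be run;
- Online monitoring and deployment endpoints are defined in the Chapter 16 API binding, which provides the OpenAPI fragment.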
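The significance testing mentioned in the list above could take the form of a percentile bootstrap on the difference between two buckets. This is a hedged sketch using only the standard library: bootstrap_diff_ci is a hypothetical helper, the per-example correctness vectors are synthetic, and the authoritative settings (seeds, repeats, test choice) belong to the Chapter 11 protocol, which is not reproduced here.

```python
import random
from typing import List, Sequence, Tuple


def bootstrap_diff_ci(a: Sequence[float], b: Sequence[float],
                      n_resamples: int = 1000, alpha: float = 0.05,
                      seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for mean(a) - mean(b), resampling each group independently."""
    rng = random.Random(seed)
    diffs: List[float] = []
    for _ in range(n_resamples):
        mean_a = sum(rng.choices(a, k=len(a))) / len(a)
        mean_b = sum(rng.choices(b, k=len(b))) / len(b)
        diffs.append(mean_a - mean_b)
    diffs.sort()
    lo_index = int((alpha / 2) * n_resamples)
    hi_index = max(lo_index, int((1 - alpha / 2) * n_resamples) - 1)
    return diffs[lo_index], diffs[hi_index]


# Synthetic per-example correctness (1.0 = correct), for illustration only.
reference_bucket = [1.0] * 450 + [0.0] * 50   # accuracy 0.90
group_bucket = [1.0] * 350 + [0.0] * 150      # accuracy 0.70
ci_low, ci_high = bootstrap_diff_ci(reference_bucket, group_bucket)

# The gap is treated as significant at level alpha when the CI excludes zero.
gap_is_significant = not (ci_low <= 0.0 <= ci_high)
```

The same function can compare a bucket before and after mitigation by passing the two correctness vectors for that bucket.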
X. Export Manifest and Audit Trail
export_manifest:
  artifacts:
    - {path: "fairness/by_axis_metrics.csv", sha256: "..."}
    - {path: "fairness/mitigation_report.md", sha256: "..."}
    - {path: "usage/alert_rules.yaml", sha256: "..."}
  references:
    - "EFT.WP.Core.DataSpec v1.0:EXPORT"
    - "EFT.WP.Core.Metrology v1.0:check_dim"
    - "EFT.WP.Data.DatasetCards v1.0:Ch.13"
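Artifacts related to fairness, ethics, and usage must be verifiable and consistent with the model card fields; references use the form “VolumeName vX.Y:anchor”.
XI. Chapter Compliance Self-Check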
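- axes/gap_metric/threshold are stated explicitly and match the dataset card's stratification; each bucket reports its sample count and 95% CI, and significance tests are run.
- Where a threshold was breached, a mitigation plan has been provided and re-evaluated; results either meet the threshold or the residual risk and usage restrictions are documented.
- Ethical disclosures and the human-in-the-loop process are recorded; prohibited uses are clearly stated and enforced at the deployment layer.
- Regional compliance, access control, and online monitoring rules are complete; units for performance, energy, and similar metrics pass check_dim.
- The export manifest lists the tables, reports, and alert configurations with their sha256 digests; references carry the form “VolumeName vX.Y:anchor”.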
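Copyright and License (CC BY 4.0)
Copyright notice: unless otherwise stated, the copyright in "Energy Filament Theory" (《能量丝理论》), including its text, charts, illustrations, symbols, and formulas, is held by the author (Mr. 屠广林).
License: this work is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). Provided the author and source are credited, copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes.
Suggested attribution format: Author: 屠广林; Work: "Energy Filament Theory" (《能量丝理论》); Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/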