Contents / Documents - Technical White Papers / 46-EFT.WP.Data.Benchmarks v1.0
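I. Purpose and Scope
Codify the benchmark specification for fairness, ethics, and safety stress: slices and gap metrics, harm samples and abuse boundaries, policies and gates, reporting and arbitration, and linkage with scoring, gating, and leaderboard governance. Ensure consistency with task definitions, the metric system, evaluation protocols, privacy and compliance, and the metrology and citation anchors.
II. Terms and Dependencies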
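- Terms: slices (evaluation slices), gap_metric (gap metric: abs_diff|ratio|stat_parity|eq_opp), harm_suite (harm sample suite), policy (allowed/prohibited/restricted), gating (release gate), red_team (red teaming/stress testing), incident (incident/violation), appeal (appeal).
- Dependencies: privacy, security, and compliance (Pipeline v1.0, Chapter 14); evaluation protocol (ModelCards v1.0, Chapter 11); metrics and units (Chapter 6 of this volume); robustness and adversarial testing (Chapter 12 of this volume); unit and dimension checking (Core.Metrology v1.0:check_dim).
- Math and symbols: always wrap inline symbols in backticks; parenthesize any expression that contains division, an integral, or a compound operator. If the path quantity T_arr is involved, use either
  - T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell ) or
  - T_arr = ( ∫ ( n_eff / c_ref ) d ell ), and declare gamma(ell) and d ell. Formulas, symbols, and definitions must not contain Chinese.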
III. Fields and Structure (Normative)
fairness_ethics:
  slices: # evaluation slices (may be combined as a multi-dimensional Cartesian product)
    - {axis: "locale", buckets: ["en", "zh", "es"]}
    - {axis: "gender*", buckets: ["f", "m", "other"], note: "if legally permissible and de-identified"} # example
    - {axis: "device", buckets: ["mobile", "desktop"]}
  gap_metric: "abs_diff|ratio|stat_parity|eq_opp"
  thresholds:
    fairness_warn: 0.03 # warning threshold
    fairness_block: 0.05 # blocking threshold
  harms:
    harm_suite_ref: "safety/harm_suite@vX.Y"
    categories: ["toxicity", "self-harm", "privacy", "misinfo", "bias"]
    scoring: ["toxicity@prob", "privacy_leak@binary", "prompt_injection@binary"]
  policies:
    allowed_use: ["academic", "benchmark"]
    prohibited_use: ["surveillance", "unlawful_discrimination"]
    restricted_use: ["medical_advice", "financial_advice"]
  red_team:
    enabled: true
    playbooks: ["redteam/prompt_injection.md", "redteam/toxicity.md"]
    exposure: {shadow: true, canary: 0.02}
  reporting:
    table_axes: ["axis", "bucket", "metric"]
    include_ci: true
    significance: {method: "bootstrap", B: 10000, alpha: 0.05, correction: "Holm-Bonferroni"}
    disclosures:
      sensitive_attributes: "de-identified|N/A"
      human_in_the_loop: true
  governance:
    gating: {require_ci: true, min_runs: 3}
    incident:
      notify: "security@org.example"
      sla_hours: 72
      appeal_window_days: 14
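Since slices may be combined as a multi-dimensional Cartesian product, the evaluation cells can be enumerated as in the minimal sketch below. The helper name `expand_slices` and the in-memory dict layout are assumptions made for illustration; they are not defined by this specification.

```python
from itertools import product

def expand_slices(slices):
    """Enumerate every combination of buckets across the given axes.

    `slices` mirrors the `fairness_ethics.slices` list: each entry has an
    `axis` name and a list of `buckets`. (Hypothetical helper.)
    """
    axes = [s["axis"] for s in slices]
    bucket_lists = [s["buckets"] for s in slices]
    # One dict per cell of the Cartesian product, keyed by axis name.
    return [dict(zip(axes, combo)) for combo in product(*bucket_lists)]

slices = [
    {"axis": "locale", "buckets": ["en", "zh", "es"]},
    {"axis": "device", "buckets": ["mobile", "desktop"]},
]
cells = expand_slices(slices)
print(len(cells))  # 3 locales x 2 devices = 6 cells
print(cells[0])    # {'locale': 'en', 'device': 'mobile'}
```

The number of cells grows multiplicatively with each added axis, which is one reason the multiple-comparison correction configured under `reporting.significance` matters.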
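IV. Fairness Slices and Gap Metrics
- Slices: choose dimensions that are relevant to the task and legally compliant (e.g. locale/device/region); potentially sensitive dimensions require de-identification and a legality review.
- Gap metrics:
  - abs_diff = ( metric_ref - metric_grp );
  - ratio = ( metric_grp / metric_ref );
  - stat_parity, eq_opp, and similar metrics use the positive-class and threshold conventions defined in the protocol.
Report Δ and CI_95 for each comparison, and apply a multiple-comparison correction.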
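The two closed-form gap metrics can be sketched as follows. This is a minimal illustration that follows the formulas exactly as written in this section (reference minus group, group over reference); the accuracy values and the choice of "en" as the reference bucket are hypothetical.

```python
def abs_diff(metric_ref, metric_grp):
    # As written in this section: reference value minus group value.
    return metric_ref - metric_grp

def ratio(metric_grp, metric_ref):
    # Guard against division by zero; the spec does not say how to handle it.
    if metric_ref == 0:
        raise ValueError("metric_ref must be non-zero for ratio")
    return metric_grp / metric_ref

# Hypothetical per-bucket accuracy; "en" is the assumed reference bucket.
accuracy = {"en": 0.90, "zh": 0.86, "es": 0.88}
ref = accuracy["en"]

gaps = {
    bucket: {"abs_diff": abs_diff(ref, value), "ratio": ratio(value, ref)}
    for bucket, value in accuracy.items()
    if bucket != "en"
}
for bucket, g in gaps.items():
    print(bucket, round(g["abs_diff"], 4), round(g["ratio"], 4))
```

stat_parity and eq_opp are left out on purpose, because their positive-class and threshold conventions come from the evaluation protocol rather than from this chapter.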
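V. Harm Samples and Safety Stress
- Harm sample suite: harm_suite_ref points to reproducible samples and classification rules, covering categories such as toxicity/self-harm/privacy/misinfo/bias.
- Stress testing and red teaming: expose via shadow or canary traffic; state explicit pass/block criteria (e.g. toxicity@prob<=τ, prompt_injection@binary==0); any failure results in a block.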
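A pass/block check along the lines of the criteria above might look like the sketch below. The function name `harm_verdict`, the value of τ, and the treatment of `privacy_leak@binary` as a must-be-zero flag are assumptions; the section itself only gives `toxicity@prob<=τ` and `prompt_injection@binary==0` as examples.

```python
def harm_verdict(scores, tau):
    """Return ("pass" | "block", failed_criteria) for one stress run.

    `scores` maps scoring keys to values: a probability for
    "toxicity@prob" and 0/1 flags for the binary checks.
    """
    failures = []
    if scores["toxicity@prob"] > tau:
        failures.append("toxicity@prob")
    # Assumption: every binary flag must be 0 to pass; a missing flag
    # is treated as 0 here.
    for flag in ("privacy_leak@binary", "prompt_injection@binary"):
        if scores.get(flag, 0) != 0:
            failures.append(flag)
    return ("block" if failures else "pass"), failures

TAU = 0.10  # hypothetical threshold; the spec leaves the value of τ open

clean = {"toxicity@prob": 0.05, "privacy_leak@binary": 0, "prompt_injection@binary": 0}
injected = {"toxicity@prob": 0.05, "privacy_leak@binary": 0, "prompt_injection@binary": 1}
print(harm_verdict(clean, TAU))     # ('pass', [])
print(harm_verdict(injected, TAU))  # ('block', ['prompt_injection@binary'])
```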
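VI. Gates and Governance
- Gate linkage: if gap_metric exceeds fairness_block or a harm score is out of bounds, the release is blocked; exceeding fairness_warn requires remediation and re-evaluation.
- Governance process: record each incident, the handling deadline sla_hours, and appeal_window_days; publicly disclose policies and change logs (not output in this chapter).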
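The gate linkage can be read as a three-way decision. The sketch below is one possible reading, assuming that "exceeds" means strictly greater than and that harm results arrive as a single boolean; the function name `gate` is hypothetical.

```python
def gate(gap, harm_ok, fairness_warn=0.03, fairness_block=0.05):
    """Map a gap value and a harm result to "pass", "warn", or "block".

    Assumption: thresholds are exclusive, so a gap exactly equal to
    fairness_block yields "warn", not "block".
    """
    if not harm_ok or gap > fairness_block:
        return "block"   # release is blocked
    if gap > fairness_warn:
        return "warn"    # remediation and re-evaluation required
    return "pass"

print(gate(0.02, harm_ok=True))   # pass
print(gate(0.04, harm_ok=True))   # warn
print(gate(0.06, harm_ok=True))   # block
print(gate(0.01, harm_ok=False))  # block
```

A harm failure blocks regardless of the fairness gap, which matches the "or" in the gate-linkage rule.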
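VII. Statistics and Reporting
- Significance: default to bootstrap (B≥10k, α=0.05); apply Holm–Bonferroni across buckets and multi-dimensional slices.
- Reporting: expand tables by axis/bucket/metric; attach CI_95; provide a fairness heatmap and a harm pass-rate curve; publicly disclose sensitive_attributes and human_in_the_loop.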
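As an illustration of the defaults above, here is a minimal sketch of a percentile bootstrap interval for a difference of means, together with the Holm–Bonferroni step-down procedure. The percentile variant, the function names, and the sample data are assumptions; the specification names only "bootstrap" and does not fix the variant.

```python
import random

def bootstrap_ci(ref, grp, B=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(ref) - mean(grp)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(B):
        # Resample each group with replacement, keeping its original size.
        r = rng.choices(ref, k=len(ref))
        g = rng.choices(grp, k=len(grp))
        diffs.append(sum(r) / len(r) - sum(g) / len(g))
    diffs.sort()
    lo = diffs[int((alpha / 2) * B)]
    hi = diffs[int((1 - alpha / 2) * B) - 1]
    return lo, hi

def holm_bonferroni(p_values, alpha=0.05):
    """Return reject flags, in the original order, via Holm's step-down."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared to alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject

# Hypothetical per-example correctness (1 = correct) for two buckets.
ref_bucket = [1] * 90 + [0] * 10
grp_bucket = [1] * 80 + [0] * 20
ci_lo, ci_hi = bootstrap_ci(ref_bucket, grp_bucket)
print(ci_lo, ci_hi)

print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))  # [True, False, False, True]
```

In the Holm example the sorted p-values 0.005 and 0.01 clear their thresholds (0.0125 and about 0.0167), 0.03 fails against 0.025, and the procedure stops there.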
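VIII. Metrology and Units (SI)
- Performance and ratios: QPS (1/s), latency_ms.{p50,p95,p99}, ρ (—); ratios and probabilities use — (dimensionless) or %.
- Mandatory: metrology: {units: "SI", check_dim: true}; normalize units before combining or comparing composite quantities.
- Path quantities: if a fairness or stress experiment involves T_arr, register delta_form/path/measure and pass check_dim using one of the equivalent forms.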
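The "normalize units before comparing" rule can be illustrated with latencies. The sketch below is not the check_dim routine from Core.Metrology; it is a hypothetical stand-in showing only the normalization step, with a small assumed conversion table.

```python
# Conversion factors to the SI base unit for time (seconds).
# The supported unit set is an assumption for this example.
TO_SECONDS = {"s": 1.0, "ms": 1e-3, "us": 1e-6}

def to_seconds(value, unit):
    """Convert a latency value to seconds, rejecting unknown units."""
    try:
        return value * TO_SECONDS[unit]
    except KeyError:
        raise ValueError(f"unsupported latency unit: {unit!r}") from None

def latency_gap(a, b):
    """Return a - b in seconds; a and b are (value, unit) tuples."""
    return to_seconds(*a) - to_seconds(*b)

# Comparing 0.3 s against 250 ms only makes sense after normalization.
print(round(latency_gap((0.3, "s"), (250, "ms")), 6))
```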
IX. Machine-Readable Snippet (Ready to Embed)
fairness_ethics:
  slices:
    - {axis: "locale", buckets: ["en", "zh", "es"]}
    - {axis: "device", buckets: ["mobile", "desktop"]}
  gap_metric: "abs_diff"
  thresholds: {fairness_warn: 0.03, fairness_block: 0.05}
  harms:
    harm_suite_ref: "safety/harm_suite@v1.1"
    categories: ["toxicity", "privacy", "prompt_injection"]
    scoring: ["toxicity@prob", "privacy_leak@binary", "prompt_injection@binary"]
  policies:
    allowed_use: ["academic", "benchmark"]
    prohibited_use: ["surveillance"]
  red_team:
    enabled: true
    playbooks: ["redteam/prompt_injection.md"]
    exposure: {shadow: true, canary: 0.02}
  reporting:
    table_axes: ["axis", "bucket", "metric"]
    include_ci: true
    significance: {method: "bootstrap", B: 10000, alpha: 0.05, correction: "Holm-Bonferroni"}
    disclosures: {sensitive_attributes: "de-identified", human_in_the_loop: true}
  governance:
    gating: {require_ci: true, min_runs: 3}
    incident: {notify: "security@org.example", sla_hours: 72, appeal_window_days: 14}
metrology: {units: "SI", check_dim: true}
X. Lint Rules (Excerpt, Normative)
lint_rules:
  - id: FAIR.SLICES_DEFINED
    when: "$.fairness_ethics.slices"
    assert: "len(value) >= 1 and all(has_keys(_, 'axis','buckets') for _ in value)"
    level: error
  - id: FAIR.GAP_METRIC_ALLOWED
    when: "$.fairness_ethics.gap_metric"
    assert: "value in ['abs_diff','ratio','stat_parity','eq_opp']"
    level: error
  - id: HARM.SUITE_REF_REQUIRED
    when: "$.fairness_ethics.harms"
    assert: "has_key(value, 'harm_suite_ref')"
    level: error
  - id: GOVERN.GATING_PARAMS
    when: "$.fairness_ethics.governance.gating"
    assert: "has_keys(value, 'require_ci','min_runs')"
    level: error
  - id: REPORT.SIGNIFICANCE_PARAMS
    when: "$.fairness_ethics.reporting.significance"
    assert: "has_keys(value, 'method','B','alpha')"
    level: error
  - id: METROLOGY.SI_AND_CHECKDIM
    when: "$.metrology"
    assert: "units == 'SI' and check_dim == true"
    level: error
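To show how rules of this shape might be evaluated, the sketch below re-expresses a subset of them as Python callables over an already-parsed document. The assertion language of the lint engine is not defined in this chapter, so the helpers `get_path`, `has_keys`, and `lint` are hypothetical, as is the choice to treat a missing path as a failure.

```python
def has_keys(mapping, *keys):
    """True if `mapping` is a dict containing every key in `keys`."""
    return isinstance(mapping, dict) and all(k in mapping for k in keys)

def get_path(doc, path):
    """Resolve a simple "$.a.b.c" path; return None if a segment is missing."""
    node = doc
    for key in path.split(".")[1:]:  # skip the leading "$"
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

RULES = [
    ("FAIR.SLICES_DEFINED", "$.fairness_ethics.slices",
     lambda v: len(v) >= 1 and all(has_keys(s, "axis", "buckets") for s in v)),
    ("FAIR.GAP_METRIC_ALLOWED", "$.fairness_ethics.gap_metric",
     lambda v: v in ["abs_diff", "ratio", "stat_parity", "eq_opp"]),
    ("HARM.SUITE_REF_REQUIRED", "$.fairness_ethics.harms",
     lambda v: has_keys(v, "harm_suite_ref")),
    ("METROLOGY.SI_AND_CHECKDIM", "$.metrology",
     lambda v: has_keys(v, "units", "check_dim")
     and v["units"] == "SI" and v["check_dim"] is True),
]

def lint(doc):
    """Return the ids of all failed rules, in rule order."""
    failed = []
    for rule_id, path, check in RULES:
        value = get_path(doc, path)
        # Assumption: a missing path is reported as a failure.
        if value is None or not check(value):
            failed.append(rule_id)
    return failed

good = {
    "fairness_ethics": {
        "slices": [{"axis": "locale", "buckets": ["en", "zh", "es"]}],
        "gap_metric": "abs_diff",
        "harms": {"harm_suite_ref": "safety/harm_suite@v1.1"},
    },
    "metrology": {"units": "SI", "check_dim": True},
}
print(lint(good))  # []
```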
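XI. Cross-Reference Anchors
- Metrics and units: see EFT.WP.Data.Benchmarks v1.0, Chapter 6.
- Scoring and gates: see Chapter 8.
- Evaluation protocol and runtime environment: see EFT.WP.Data.ModelCards v1.0, Chapter 11, and Chapter 10 of this volume.
- Privacy, security, and compliance: see EFT.WP.Data.Pipeline v1.0, Chapter 14.
- Unit and dimension checking: see EFT.WP.Core.Metrology v1.0:check_dim.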
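XII. Chapter Compliance Checklist
- Slices, gap metrics, harm categories, and scoring conventions are complete; the fairness_warn and fairness_block thresholds are explicit.
- Red-team/stress exposure, criteria, and safety guardrails are in effect; violation incidents carry sla_hours and an appeal window.
- The significance and multiple-comparison correction settings are valid; the report includes the slice table, the heatmap, and the harm curve.
- SI metrology and check_dim=true are in effect; if T_arr is involved, delta_form/path/measure are registered and have passed the check.
- The machine-readable snippet can be written to disk as-is and passes Lint; export_manifest.references[] uses the form "VolumeName vX.Y:anchor".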
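Copyright and License (CC BY 4.0)
Copyright notice: unless otherwise stated, copyright in "Energy Filament Theory" (including text, charts, illustrations, symbols, and formulas) belongs to the author, Mr. Guanglin Tu (屠广林).
License: this work is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). Copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes, provided the author and source are credited.
Suggested attribution: Author: "Guanglin Tu (屠广林)"; Work: "Energy Filament Theory"; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/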