Technical White Paper 46: EFT.WP.Data.Benchmarks v1.0

Chapter 13: Fairness, Ethics, and Safety Stress


I. Purpose and Scope

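This chapter codifies the rules for fairness, ethics, and safety stress in the benchmark: slices and gap metrics, harm samples and misuse boundaries, policies and thresholds, reporting and arbitration, and the linkage with scoring, gating, and leaderboard governance. It also ensures consistency with the task definitions, the metric system, the evaluation protocol, privacy and compliance, and the metrology and citation anchors.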

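II. Terms and Dependencies

  1. Terms: slices (evaluation slices), gap_metric (gap metric: abs_diff|ratio|stat_parity|eq_opp), harm_suite (harm sample suite), policy (allowed/prohibited/restricted), gating (release gate), red_team (red teaming/stress testing), incident (incident/violation), appeal (appeal).
  2. Dependencies: privacy, security, and compliance (Pipeline v1.0, Chapter 14); evaluation protocol (ModelCards v1.0, Chapter 11); metrics and units (Chapter 6 of this volume); robustness and adversarial testing (Chapter 12 of this volume); unit and dimension checking (Core.Metrology v1.0: check_dim).
  3. Math and symbols: inline symbols always use backticks; expressions containing division, integrals, or compound operators must be parenthesized; where the path quantity T_arr is involved, use
    • T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell ) or
    • T_arr = ( ∫ ( n_eff / c_ref ) d ell ), and declare gamma(ell) and d ell; Chinese is not permitted in formulas, symbols, or definitions.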
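As a rendering aid only, the two admissible forms of T_arr can be typeset as follows. This is a sketch rather than normative text; it assumes (an assumption not stated in the source) that c_ref is constant along the path gamma(ell), in which case the two forms are equal.

```latex
% Assumption: c_ref is constant along the path \gamma(\ell),
% so it can be moved inside or outside the integral.
T_{\mathrm{arr}}
  = \left( \frac{1}{c_{\mathrm{ref}}} \right)
    \left( \int_{\gamma} n_{\mathrm{eff}} \, d\ell \right)
  = \left( \int_{\gamma} \left( \frac{n_{\mathrm{eff}}}{c_{\mathrm{ref}}} \right) d\ell \right)
```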

III. Fields and Structure (Normative)

fairness_ethics:
  slices:                        # evaluation slices (axes may be combined as a Cartesian product)
    - {axis: "locale", buckets: ["en","zh","es"]}
    - {axis: "gender*", buckets: ["f","m","other"], note: "if legally permissible and de-identified"}   # example
    - {axis: "device", buckets: ["mobile","desktop"]}
  gap_metric: "abs_diff|ratio|stat_parity|eq_opp"
  thresholds:
    fairness_warn: 0.03          # warning threshold
    fairness_block: 0.05         # blocking threshold
  harms:
    harm_suite_ref: "safety/harm_suite@vX.Y"
    categories: ["toxicity","self-harm","privacy","misinfo","bias"]
    scoring: ["toxicity@prob","privacy_leak@binary","prompt_injection@binary"]
  policies:
    allowed_use: ["academic","benchmark"]
    prohibited_use: ["surveillance","unlawful_discrimination"]
    restricted_use: ["medical_advice","financial_advice"]
  red_team:
    enabled: true
    playbooks: ["redteam/prompt_injection.md","redteam/toxicity.md"]
    exposure: {shadow: true, canary: 0.02}
  reporting:
    table_axes: ["axis","bucket","metric"]
    include_ci: true
    significance: {method: "bootstrap", B: 10000, alpha: 0.05, correction: "Holm-Bonferroni"}
    disclosures:
      sensitive_attributes: "de-identified|N/A"
      human_in_the_loop: true
  governance:
    gating: {require_ci: true, min_runs: 3}
    incident:
      notify: "security@org.example"
      sla_hours: 72
      appeal_window_days: 14
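The comment on `slices` says axes may be combined as a Cartesian product. A minimal sketch of that expansion follows, assuming the YAML has already been loaded into plain Python dicts; the helper name `expand_slices` is hypothetical and not part of the specification.

```python
from itertools import product

# Hypothetical in-memory form of the `slices` field (axis names taken from the schema).
slices = [
    {"axis": "locale", "buckets": ["en", "zh", "es"]},
    {"axis": "device", "buckets": ["mobile", "desktop"]},
]

def expand_slices(slices):
    """Return every cross-axis cell as a dict mapping axis name to bucket."""
    axes = [s["axis"] for s in slices]
    bucket_lists = [s["buckets"] for s in slices]
    return [dict(zip(axes, combo)) for combo in product(*bucket_lists)]

cells = expand_slices(slices)
print(len(cells))  # 6  (3 locales x 2 devices)
print(cells[0])    # {'locale': 'en', 'device': 'mobile'}
```

The number of cells grows multiplicatively with each added axis, so sparsely populated cells are likely to have wide confidence intervals.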


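IV. Fairness Slices and Gap Metrics

  1. Slices: choose dimensions that are relevant to the task and legally compliant (e.g., locale/device/region); potentially sensitive dimensions require de-identification and a legality review.
  2. Gap metrics
    • abs_diff = ( metric_ref - metric_grp );
    • ratio = ( metric_grp / metric_ref );
    • stat_parity/eq_opp and similar metrics use the positive-class and threshold conventions defined in the protocol;
      report Δ and CI_95, and apply a multiple-comparison correction.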
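The two closed-form gap metrics and a CI_95 for Δ can be sketched as below. This is an illustration under stated assumptions, not the protocol's reference implementation: it assumes the metric is a mean of per-sample scores, that the two slices are resampled independently, and that a percentile bootstrap is acceptable. The function names and the toy data are hypothetical.

```python
import random

def abs_diff(metric_ref, metric_grp):
    """Signed difference exactly as written in the text: ( metric_ref - metric_grp )."""
    return metric_ref - metric_grp

def ratio(metric_grp, metric_ref):
    """Ratio exactly as written in the text: ( metric_grp / metric_ref )."""
    if metric_ref == 0:
        raise ValueError("metric_ref must be non-zero for ratio")
    return metric_grp / metric_ref

def mean(xs):
    return sum(xs) / len(xs)

def bootstrap_ci_abs_diff(ref_scores, grp_scores, B=10000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for abs_diff of mean per-sample scores.

    Assumption: each slice is resampled with replacement, independently.
    """
    rng = random.Random(seed)
    deltas = []
    for _ in range(B):
        ref = rng.choices(ref_scores, k=len(ref_scores))
        grp = rng.choices(grp_scores, k=len(grp_scores))
        deltas.append(abs_diff(mean(ref), mean(grp)))
    deltas.sort()
    lo = deltas[int((alpha / 2) * B)]
    hi = deltas[min(B - 1, int((1 - alpha / 2) * B))]
    return lo, hi

# Toy per-sample correctness (1 = correct); illustrative values only.
ref_scores = [1] * 90 + [0] * 10   # accuracy 0.90 on the reference slice
grp_scores = [1] * 80 + [0] * 20   # accuracy 0.80 on the compared slice

delta = abs_diff(mean(ref_scores), mean(grp_scores))
lo, hi = bootstrap_ci_abs_diff(ref_scores, grp_scores, B=2000)
print(f"delta={delta:.3f} CI_95=[{lo:.3f}, {hi:.3f}]")
```

Note that abs_diff as defined in the text is signed; whether its magnitude or its signed value is compared against the thresholds is not stated in the source.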

V. Harm Samples and Safety Stress


VI. Gating and Governance


VII. Statistics and Reporting


VIII. Metrology and Units (SI)


IX. Machine-Readable Snippet (Directly Embeddable)

fairness_ethics:
  slices:
    - {axis: "locale", buckets: ["en","zh","es"]}
    - {axis: "device", buckets: ["mobile","desktop"]}
  gap_metric: "abs_diff"
  thresholds: {fairness_warn: 0.03, fairness_block: 0.05}
  harms:
    harm_suite_ref: "safety/harm_suite@v1.1"
    categories: ["toxicity","privacy","prompt_injection"]
    scoring: ["toxicity@prob","privacy_leak@binary","prompt_injection@binary"]
  policies:
    allowed_use: ["academic","benchmark"]
    prohibited_use: ["surveillance"]
  red_team:
    enabled: true
    playbooks: ["redteam/prompt_injection.md"]
    exposure: {shadow: true, canary: 0.02}
  reporting:
    table_axes: ["axis","bucket","metric"]
    include_ci: true
    significance: {method: "bootstrap", B: 10000, alpha: 0.05, correction: "Holm-Bonferroni"}
    disclosures: {sensitive_attributes: "de-identified", human_in_the_loop: true}
  governance:
    gating: {require_ci: true, min_runs: 3}
    incident: {notify: "security@org.example", sla_hours: 72, appeal_window_days: 14}
metrology: {units: "SI", check_dim: true}
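One way the `thresholds` and `significance.correction` fields might be consumed is sketched below. The source does not define the comparison rule, so two assumptions are made and marked in the code: the magnitude of the gap is compared, and a gap equal to a threshold triggers it. `gap_status` and `holm_bonferroni` are hypothetical helper names; the Holm step-down procedure itself is the standard one.

```python
def gap_status(gap, warn=0.03, block=0.05):
    """Map a gap to 'ok' | 'warn' | 'block'.

    Assumptions (not in the source): |gap| is compared, and equality
    with a threshold counts as reaching it.
    """
    g = abs(gap)
    if g >= block:
        return "block"
    if g >= warn:
        return "warn"
    return "ok"

def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down procedure; returns one reject flag per hypothesis, in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject

print(gap_status(0.02), gap_status(-0.04), gap_status(0.07))  # ok warn block
print(holm_bonferroni([0.01, 0.04, 0.03]))                     # [True, False, False]
```

In the second call, the smallest p-value (0.01) passes its cutoff of 0.05/3, but the next one (0.03) exceeds 0.05/2, so the procedure stops and retains the remaining hypotheses.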


X. Lint Rules (Excerpt, Normative)

lint_rules:
  - id: FAIR.SLICES_DEFINED
    when: "$.fairness_ethics.slices"
    assert: "len(value) >= 1 and all(has_keys(_, 'axis','buckets') for _ in value)"
    level: error
  - id: FAIR.GAP_METRIC_ALLOWED
    when: "$.fairness_ethics.gap_metric"
    assert: "value in ['abs_diff','ratio','stat_parity','eq_opp']"
    level: error
  - id: HARM.SUITE_REF_REQUIRED
    when: "$.fairness_ethics.harms"
    assert: "has_key(value, 'harm_suite_ref')"
    level: error
  - id: GOVERN.GATING_PARAMS
    when: "$.fairness_ethics.governance.gating"
    assert: "has_keys(require_ci, min_runs)"
    level: error
  - id: REPORT.SIGNIFICANCE_PARAMS
    when: "$.fairness_ethics.reporting.significance"
    assert: "has_keys(method, B, alpha)"
    level: error
  - id: METROLOGY.SI_AND_CHECKDIM
    when: "$.metrology"
    assert: "units == 'SI' and check_dim == true"
    level: error
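The rules above can be approximated in plain Python for a quick local check. This sketch does not implement the rule DSL itself: each `assert` expression is hand-translated into a predicate, and it assumes (an interpretation, not stated in the source) that `when` selects the node to test and that a rule is skipped when its path is absent. `get_path`, `RULES`, and `lint` are hypothetical names.

```python
def get_path(doc, path):
    """Resolve a dotted path such as 'fairness_ethics.harms'; return None if absent."""
    node = doc
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node

ALLOWED_GAP_METRICS = {"abs_diff", "ratio", "stat_parity", "eq_opp"}

# (rule id, dotted path, predicate on the value found at that path)
RULES = [
    ("FAIR.SLICES_DEFINED", "fairness_ethics.slices",
     lambda v: isinstance(v, list) and len(v) >= 1
     and all(isinstance(s, dict) and {"axis", "buckets"} <= set(s) for s in v)),
    ("FAIR.GAP_METRIC_ALLOWED", "fairness_ethics.gap_metric",
     lambda v: v in ALLOWED_GAP_METRICS),
    ("HARM.SUITE_REF_REQUIRED", "fairness_ethics.harms",
     lambda v: isinstance(v, dict) and "harm_suite_ref" in v),
    ("GOVERN.GATING_PARAMS", "fairness_ethics.governance.gating",
     lambda v: isinstance(v, dict) and {"require_ci", "min_runs"} <= set(v)),
    ("REPORT.SIGNIFICANCE_PARAMS", "fairness_ethics.reporting.significance",
     lambda v: isinstance(v, dict) and {"method", "B", "alpha"} <= set(v)),
    ("METROLOGY.SI_AND_CHECKDIM", "metrology",
     lambda v: isinstance(v, dict) and v.get("units") == "SI"
     and v.get("check_dim") is True),
]

def lint(doc):
    """Return ids of failing rules. Assumption: a rule is skipped if its path is absent."""
    failures = []
    for rule_id, path, predicate in RULES:
        value = get_path(doc, path)
        if value is not None and not predicate(value):
            failures.append(rule_id)
    return failures

config = {
    "fairness_ethics": {
        "slices": [{"axis": "locale", "buckets": ["en", "zh", "es"]}],
        "gap_metric": "abs_diff",
        "harms": {"harm_suite_ref": "safety/harm_suite@v1.1"},
        "governance": {"gating": {"require_ci": True, "min_runs": 3}},
        "reporting": {"significance": {"method": "bootstrap", "B": 10000, "alpha": 0.05}},
    },
    "metrology": {"units": "SI", "check_dim": True},
}
bad = {"fairness_ethics": {"gap_metric": "l2"},
       "metrology": {"units": "imperial", "check_dim": True}}

print(lint(config))  # []
print(lint(bad))     # ['FAIR.GAP_METRIC_ALLOWED', 'METROLOGY.SI_AND_CHECKDIM']
```

If a missing section should itself be an error, the skip condition in `lint` would need to be inverted; the source does not say which behavior is intended.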


XI. Cross-Reference Anchors


XII. Chapter Compliance Self-Check


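Copyright and License (CC BY 4.0)

Copyright notice: Unless otherwise stated, copyright in "Energy Filament Theory" (《能量丝理论》), including its text, charts, illustrations, symbols, and formulas, is held by the author (Mr. 屠广林).
License: This work is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0); provided the author and source are credited, copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes.
Suggested attribution: Author: 屠广林; Work: "Energy Filament Theory"; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/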