44-EFT.WP.Data.ModelCards v1.0 | Chapter 7: Model Architecture and Parameters


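Chapter 7: Model Architecture and Parameters

I. Purpose and Scope

This chapter fixes the normative definitions, counting conventions, and reproducible-implementation essentials for architecture and its associated parameters. It covers backbone/head/modular composition, the parameter count M_param, activation/normalization/positional encoding, initialization and precision strategy, and the hand-off between regularization and structured compression. It also ensures consistency with task I/O, the evaluation protocol, and the metrology chapter.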

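II. Terminology and Dependencies

Terminology source: follows the "EFT Technical White Paper and Technical Memo Template Comprehensive Checklist v0.1"; this chapter only adds definitions for fields and constraints directly related to architecture and parameters.
Dependency volumes: data contracts/export, "Core.DataSpec v1.0"; metrology and dimensional checks, "Core.Metrology v1.0"; for derivations involving paths or arrival times, see "Core.Equations v1.1". All inline symbols are written in backticks; expressions containing division, integrals, or composite operators must be parenthesized; Chinese is prohibited in formulas, symbols, and definitions.

III. Fields and Structure (Normative)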

architecture:
  version: "v1.0"
  backbone: "<string>"   # e.g. resnet50|vit-b|conformer-xs|transformer-base
  topology:
    stages:              # list of modules/stages (order defines the topology)
      - {name: "stem", type: "conv", params: {out: 64, k: 7, s: 2, norm: "bn", act: "relu"}}
      - {name: "stage1", type: "resblk", repeat: 3, params: {out: 256, bottleneck: true}}
      - {name: "stage2", type: "resblk", repeat: 4, params: {out: 512}}
      - {name: "head", type: "linear", params: {out_dim: 1000}}
  positional_encoding: {type: "sinusoidal|learned|none", dim: 768?}
  norm: {type: "bn|ln|rmsnorm", eps: 1e-5, affine: true}
  act: {type: "relu|gelu|silu|tanh"}
  dropout: {p: 0.1}
  attention: {type: "msa|lsa|flash", heads: 12?, window: 16?}
  mixed_precision: {train: "fp16|bf16|fp32", infer: "fp16|bf16|fp32", loss_scale: "dynamic|static|none"}
  init:
    scheme: "kaiming_uniform|xavier_normal|trunc_normal"
    seed: 1701
  params_report:
    M_param: 25.6        # in millions (M)
    FLOPs: 4.1e9         # inference, single sample
    T_inf: 3.8           # ms/sample (batch=1; device/driver version recorded separately)
  constraints:
    grad_ckpt: true
    amp_safe_ops: ["conv", "gemm"]
  see:
    - "EFT.WP.Core.Metrology v1.0:check_dim"

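(The units and conventions for M_param/FLOPs/T_inf are checked centrally by the metrology chapter; any I/O-related shapes must agree with Chapter 6.)

IV. Parameter Counting and Measurement Conventions

Parameter count M_param: counts trainable parameters, excluding optimizer state by default; if frozen parameters or sparsity masks are included, state this in params_report.notes.
Complexity FLOPs: based on a single-sample forward pass; if it depends on sequence length or resolution, give a functional form or a nominal operating point (e.g. 224×224, T=16000).
Latency T_inf: record the device, batch size, framework, and kernel versions; unit ms; must be reproducible (median of 5 runs in the same environment).
Dimensional conservation: the values above, together with metrics such as power and GPU memory, declare their units at the Schema layer and pass check_dim.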
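
The counting and timing conventions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper names count_params_m and median_latency_ms, the (shape, trainable) input format, the single warm-up call, and the toy tensor list are all assumptions made for the example.

```python
import statistics
import time
from math import prod


def count_params_m(tensors):
    """Return the trainable parameter count in millions (M).

    `tensors` is a list of (shape, trainable) pairs (hypothetical format).
    Frozen tensors are skipped, matching the default M_param convention.
    """
    total = sum(prod(shape) for shape, trainable in tensors if trainable)
    return total / 1e6


def median_latency_ms(fn, repeats=5, warmup=1):
    """Time `fn` and return the median latency in ms over `repeats` runs.

    The warm-up call is an assumption of this sketch; the chapter only
    requires the median of 5 runs in the same environment.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


# Hypothetical toy model: a trainable linear layer plus a frozen embedding.
toy = [((768, 1000), True), ((1000,), True), ((30000, 768), False)]
m_param = count_params_m(toy)  # only the 769,000 trainable values are counted
t_inf = median_latency_ms(lambda: sum(range(10_000)))
```

If frozen tensors were included in the reported figure, that choice would be recorded in params_report.notes, as required above.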

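V. Module Catalog and Constraints (Common Types)

Convolution/residual blocks: conv/resblk; parameters: out, k, s, p, bottleneck; specify the residual connection and the conditions that trigger downsampling.
Transformer encoder/decoder: self_attn/cross_attn/ffn; constraint: heads * head_dim = model_dim; masking and causality are expressed in constraints.
Normalization/activation: declared once at the top level in norm/act; any per-layer override must be stated in the corresponding module's params.
Positional encoding: sinusoidal|learned|rope, etc.; must be consistent with the input length and padding strategy.
Attention implementation: msa (standard) | lsa (local) | flash (kernel-optimized); state the window and the convention used to report memory-complexity improvements.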
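
The head-dimension constraint and one possible memory convention for attention can be checked mechanically. The sketch below assumes hypothetical helpers head_dim and attn_score_elems; counting score-matrix entries per head is only one way to express the memory improvement of a local window and is not prescribed by this chapter.

```python
def head_dim(model_dim, heads):
    """Return head_dim such that heads * head_dim == model_dim, or raise."""
    if heads <= 0 or model_dim % heads != 0:
        raise ValueError(
            f"model_dim={model_dim} is not divisible by heads={heads}"
        )
    return model_dim // heads


def attn_score_elems(seq_len, window=None):
    """Score-matrix entries per head.

    Full attention (msa) stores seq_len * seq_len scores; a local window
    (lsa) stores about seq_len * window, capped at the sequence length.
    """
    if window is None:
        return seq_len * seq_len
    return seq_len * min(window, seq_len)


d_head = head_dim(768, 12)
full = attn_score_elems(196)
local = attn_score_elems(196, window=16)
```

For flash-style kernels the arithmetic is the same as for standard attention; the saving comes from not materializing the full score matrix, so the reported convention should say which quantity is being compared.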

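VI. Initialization, Precision, and Device Strategy

Initialization init.scheme: list the weight and bias strategies; if distributed or sharded initialization is used, provide the pseudo-random graph and the seed matrix.
Mixed precision mixed_precision: training/inference precision and loss scaling; list the allow/deny lists of unsafe operators and the fallback rules.
Gradient checkpointing/recomputation: flagged in constraints.grad_ckpt; state the convention for its effect on FLOPs/T_inf.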
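
One way to make a seed matrix reproducible is to derive every per-rank, per-shard seed deterministically from init.seed. The sketch below is an assumption about how this could be done, not a scheme defined by this chapter: the function names, the SHA-256 derivation, and the rank/shard indexing are all hypothetical.

```python
import hashlib


def derive_seed(base_seed, rank, shard):
    """Derive a 32-bit seed from (base_seed, rank, shard) via SHA-256."""
    digest = hashlib.sha256(f"{base_seed}:{rank}:{shard}".encode()).digest()
    return int.from_bytes(digest[:4], "big")


def seed_matrix(base_seed, n_ranks, n_shards):
    """Return an n_ranks x n_shards matrix of derived seeds."""
    return [
        [derive_seed(base_seed, rank, shard) for shard in range(n_shards)]
        for rank in range(n_ranks)
    ]


# 1701 is the init.seed used in the fragments of this chapter.
matrix = seed_matrix(1701, n_ranks=2, n_shards=3)
```

Because the derivation depends only on its three inputs, the same matrix can be regenerated on any machine and recorded alongside init.scheme.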

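VII. Regularization and Structured Techniques (Interface with Chapter 5 Compression)

Weight decay/dropout: specified in optimization / architecture.dropout, respectively;
Structured sparsity/channel pruning: if enabled, record it under compression and reflect its effect on topology and dimensions in this chapter's constraints;
Distillation interface: cross-check compression.distillation.teacher against architecture.backbone to ensure dimensional compatibility.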

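VIII. Consistency with Task I/O and the Evaluation Protocol

Shape consistency: the spatial/temporal-step transformations in architecture.topology align with the Chapter 6 io_schema;
Evaluation reproducibility: list the checkpoints and weight hashes used for evaluation in export_manifest.artifacts[], and record seed/repeats in evaluation.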
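
A weight hash for export_manifest.artifacts[] can be computed with the standard library. This is a minimal sketch: the entry keys "path" and "sha256", the choice of SHA-256, and the demo file name are assumptions, since this chapter does not fix the layout of an artifact entry.

```python
import hashlib
import os
import tempfile


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


# Demo on a throwaway file standing in for a real checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "model.ckpt")
    with open(ckpt_path, "wb") as handle:
        handle.write(b"demo-weights")
    # Hypothetical artifact entry; the key names are illustrative only.
    artifact = {"path": "model.ckpt", "sha256": sha256_file(ckpt_path)}
```

Reading in chunks keeps memory use flat for multi-gigabyte checkpoints.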

IX. Machine-Readable Fragment (Ready to Embed)

architecture:
  version: "v1.0"
  backbone: "vit-b"
  topology:
    - {name: "patchify", type: "conv", params: {k: 16, s: 16, out: 768}}
    - {name: "enc1", type: "transformer_block", repeat: 12,
       params: {dim: 768, heads: 12, mlp_ratio: 4.0, act: "gelu", norm: "ln"}}
    - {name: "head", type: "linear", params: {out_dim: 1000}}
  positional_encoding: {type: "sinusoidal", dim: 768}
  mixed_precision: {train: "bf16", infer: "bf16", loss_scale: "dynamic"}
  init: {scheme: "trunc_normal", seed: 1701}
  params_report: {M_param: 86.6, FLOPs: 1.8e10, T_inf: 6.2}
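
As a sanity check on a reported M_param, the count can be reproduced from the topology. The sketch below makes several assumptions that the fragment does not state: 3 input channels, bias terms on every linear and convolution layer, fused query/key/value projections, two layer norms per block plus a final one, a class token, and 197 tokens (a 224×224 input with 16×16 patches). The function name vit_param_count is hypothetical.

```python
def vit_param_count(dim=768, depth=12, mlp_ratio=4.0, patch=16, in_ch=3,
                    n_classes=1000, n_tokens=197,
                    learned_pos=True, cls_token=True):
    """Count trainable parameters of a ViT-style encoder under stated assumptions."""
    hidden = int(dim * mlp_ratio)
    patch_embed = patch * patch * in_ch * dim + dim          # conv weight + bias
    attn = (dim * 3 * dim + 3 * dim) + (dim * dim + dim)     # qkv + output proj
    mlp = (dim * hidden + hidden) + (hidden * dim + dim)     # two linear layers
    norms = 2 * (2 * dim)                                    # two layer norms
    block = attn + mlp + norms
    final_norm = 2 * dim
    head = dim * n_classes + n_classes
    total = patch_embed + depth * block + final_norm + head
    if cls_token:
        total += dim
    if learned_pos:
        total += n_tokens * dim                              # learned position table
    return total


with_table = vit_param_count()                    # 86,567,656
without_table = vit_param_count(learned_pos=False)  # 86,416,360
```

Under these assumptions the count rounds to 86.6 M with a learned position table and to 86.4 M with a parameter-free encoding, so which components a reported figure includes is worth stating in params_report.notes.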

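X. Interface with Path-Dependent Quantities (If Applicable)

If the architecture contains path-dependent operators or subnetworks (e.g. a learnable refractive-index mapping or a time-delay estimation head), register the following in the model card:

path_dependence.delta_form, path="gamma(ell)", measure="d ell";
Two equivalent expressions for T_arr:
- T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
- T_arr = ( ∫ ( n_eff / c_ref ) d ell );
  both must pass check_dim.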
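
The two forms agree whenever c_ref is constant along the path, which can be confirmed numerically. The sketch below is illustrative only: the linear n_eff profile, the 10 m path, the trapezoidal rule, and the use of the vacuum light speed as a placeholder for c_ref are all assumptions.

```python
def trapezoid(ys, xs):
    """Integrate sampled values ys over points xs with the trapezoidal rule."""
    return sum(
        0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


c_ref = 299_792_458.0                      # m/s, placeholder reference speed
ell = [i * 0.1 for i in range(101)]        # path samples from 0 to 10 m
n_eff = [1.0 + 0.01 * x for x in ell]      # hypothetical dimensionless profile

# Form 1: constant factor pulled outside the integral.
t_arr_a = (1.0 / c_ref) * trapezoid(n_eff, ell)
# Form 2: factor kept inside the integrand.
t_arr_b = trapezoid([n / c_ref for n in n_eff], ell)
```

Both results carry units of length divided by speed, that is seconds, which is the property check_dim is expected to verify.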

XI. Machine-Readable Schema Fragment (Normative)

# I15-7 Architecture & Params (excerpt)
properties:
  architecture:
    type: object
    required: [version, backbone, topology]
    properties:
      version: {type: string}
      backbone: {type: string}
      topology: {type: array, items: {type: object, properties: {
        name: {type: string}, type: {type: string}, repeat: {type: integer},
        params: {type: object}}}}
      positional_encoding: {type: object}
      norm: {type: object}
      act: {type: object}
      dropout: {type: object}
      attention: {type: object}
      mixed_precision: {type: object}
      init: {type: object, properties: {scheme: {type: string}, seed: {type: integer}}}
      params_report: {type: object, properties: {M_param: {type: number}, FLOPs: {type: number}, T_inf: {type: number}}}
      constraints: {type: object}

(Units and dimensions are declared at the Schema layer and checked by "Core.Metrology v1.0"; reference anchors use the format "VolumeName vX.Y:anchor".)
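
The structural rules in the schema excerpt can be approximated with a small standard-library check. This is a sketch under stated assumptions, not a full JSON Schema validator: the function check_architecture and its error-message format are hypothetical, and only the required keys and the basic types from the excerpt are covered.

```python
def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_architecture(card):
    """Return a list of violations of the excerpted rules (empty if none)."""
    arch = card.get("architecture")
    if not isinstance(arch, dict):
        return ["architecture: must be an object"]
    errors = []
    for key in ("version", "backbone", "topology"):
        if key not in arch:
            errors.append(f"architecture.{key}: required")
    for key in ("version", "backbone"):
        if key in arch and not isinstance(arch[key], str):
            errors.append(f"architecture.{key}: must be a string")
    topology = arch.get("topology")
    if topology is not None and not isinstance(topology, list):
        errors.append("architecture.topology: must be an array")
    for i, stage in enumerate(topology if isinstance(topology, list) else []):
        if not isinstance(stage, dict):
            errors.append(f"architecture.topology[{i}]: must be an object")
            continue
        repeat = stage.get("repeat")
        if repeat is not None and (isinstance(repeat, bool) or not isinstance(repeat, int)):
            errors.append(f"architecture.topology[{i}].repeat: must be an integer")
    report = arch.get("params_report", {})
    if isinstance(report, dict):
        for key in ("M_param", "FLOPs", "T_inf"):
            if key in report and not _is_number(report[key]):
                errors.append(f"architecture.params_report.{key}: must be a number")
    return errors


# Card mirroring the ViT-B fragment from Section IX.
card = {"architecture": {
    "version": "v1.0",
    "backbone": "vit-b",
    "topology": [
        {"name": "patchify", "type": "conv", "params": {"k": 16, "s": 16, "out": 768}},
        {"name": "enc1", "type": "transformer_block", "repeat": 12,
         "params": {"dim": 768, "heads": 12, "mlp_ratio": 4.0}},
        {"name": "head", "type": "linear", "params": {"out_dim": 1000}},
    ],
    "params_report": {"M_param": 86.6, "FLOPs": 1.8e10, "T_inf": 6.2},
}}
problems = check_architecture(card)
```

Unit and dimension checks are out of scope here; those remain the responsibility of check_dim in the metrology volume.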

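XII. Chapter Compliance Self-Check

architecture fully declares the backbone, topology, initialization, precision, and parameter report, and is consistent with the Chapter 6 I/O and the evaluation scripts.
M_param/FLOPs/T_inf state their units and measurement conventions and pass check_dim; device, batch size, and environment are pinned for reproducibility.
If path-dependent quantities exist, delta_form/path/measure are registered and pass the metrology check; all formulas use backticks and parentheses and contain no Chinese.
Interface boundaries with extensions such as compression/explainability are clear; exported artifacts and hashes are cross-checked in export_manifest.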

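Copyright and license: unless otherwise stated, copyright in "Energy Filament Theory" (including text, charts, illustrations, symbols, and formulas) belongs to the author, Tu Guanglin (屠广林).
License (CC BY 4.0): copying, reposting, excerpting, adaptation, and redistribution are permitted provided the author and source are credited.
Attribution format (suggested): Author: Tu Guanglin (屠广林) | Work: "Energy Filament Theory" | Source: energyfilament.org | License: CC BY 4.0
Call for verification: the author works independently and is self-funded, with no employer and no sponsorship. The next phase will prioritize environments most willing to discuss, reproduce, and scrutinize the work publicly, regardless of country. Media and peers worldwide are welcome to use this window to organize verification and to contact us.
Version information: first published: 2025-11-11 | current version: v6.0+5.05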