Technical White Paper 44: EFT.WP.Data.ModelCards v1.0

Chapter 7. Model Architecture and Parameters


I. Purpose and Scope

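This chapter fixes the normative definitions, counting conventions, and reproducible implementation points for architecture and its associated parameters. It covers backbone/head/modular composition, the parameter count M_param, activation/normalization/positional encoding, initialization and precision strategy, and the interface between regularization and structured compression. It also ensures consistency with task I/O, the evaluation protocol, and the metrology chapter.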

II. Terminology and Dependencies


III. Fields and Structure (Normative)

architecture:
  version: "v1.0"
  backbone: "<string>"   # e.g., resnet50|vit-b|conformer-xs|transformer-base
  topology:
    stages:              # list of modules/stages (list order defines the topology)
      - {name: "stem", type: "conv", params: {out: 64, k: 7, s: 2, norm: "bn", act: "relu"}}
      - {name: "stage1", type: "resblk", repeat: 3, params: {out: 256, bottleneck: true}}
      - {name: "stage2", type: "resblk", repeat: 4, params: {out: 512}}
      - {name: "head", type: "linear", params: {out_dim: 1000}}
  positional_encoding: {type: "sinusoidal|learned|none", dim: 768}   # dim is optional
  norm: {type: "bn|ln|rmsnorm", eps: 1e-5, affine: true}
  act: {type: "relu|gelu|silu|tanh"}
  dropout: {p: 0.1}
  attention: {type: "msa|lsa|flash", heads: 12, window: 16}          # heads and window are optional
  mixed_precision: {train: "fp16|bf16|fp32", infer: "fp16|bf16|fp32", loss_scale: "dynamic|static|none"}
  init:
    scheme: "kaiming_uniform|xavier_normal|trunc_normal"
    seed: 1701
  params_report:
    M_param: 25.6        # in millions (M)
    FLOPs: 4.1e9         # inference, single sample
    T_inf: 3.8           # ms/sample (batch = 1; record device and driver versions separately)
  constraints:
    grad_ckpt: true
    amp_safe_ops: ["conv", "gemm"]
  see:
    - "EFT.WP.Core.Metrology v1.0:check_dim"

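(The units and conventions for M_param, FLOPs, and T_inf are checked centrally by the metrology chapter; any shapes related to I/O must be consistent with Chapter 6.)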
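The template above lists the permitted values of several fields in `a|b|c` form. As a minimal sketch, assuming that this notation denotes a closed set of options, a filled-in record could be checked against those sets as follows. The names `ALLOWED` and `check_enums` are hypothetical and are not defined by this white paper.

```python
# Hypothetical helper: ALLOWED and check_enums are illustrative names.
# Assumption: "a|b|c" in the template means "exactly one of a, b, c".
ALLOWED = {
    ("positional_encoding", "type"): {"sinusoidal", "learned", "none"},
    ("norm", "type"): {"bn", "ln", "rmsnorm"},
    ("act", "type"): {"relu", "gelu", "silu", "tanh"},
    ("attention", "type"): {"msa", "lsa", "flash"},
    ("mixed_precision", "train"): {"fp16", "bf16", "fp32"},
    ("mixed_precision", "infer"): {"fp16", "bf16", "fp32"},
    ("mixed_precision", "loss_scale"): {"dynamic", "static", "none"},
    ("init", "scheme"): {"kaiming_uniform", "xavier_normal", "trunc_normal"},
}


def check_enums(architecture):
    """Return messages for fields whose value is outside the listed options."""
    problems = []
    for (section, key), allowed in ALLOWED.items():
        block = architecture.get(section)
        if not isinstance(block, dict) or key not in block:
            continue  # absent fields are not judged by this sketch
        value = block[key]
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{section}.{key}={value!r} not in {sorted(allowed)}")
    return problems


record = {
    "norm": {"type": "groupnorm", "eps": 1e-5, "affine": True},
    "act": {"type": "gelu"},
    "mixed_precision": {"train": "bf16", "infer": "bf16", "loss_scale": "dynamic"},
}
for message in check_enums(record):
    print(message)
```

In this usage example only `norm.type` is reported, because `groupnorm` is not among the options listed in the template.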


IV. Parameter Counting and Metrology Conventions


V. Module Catalog and Constraints (Common Types)


VI. Initialization, Precision, and Device Strategy


VII. Regularization and Structural Techniques (Interface with Chapter 5 Compression)


VIII. Consistency with Task I/O and the Evaluation Protocol


IX. Machine-Readable Snippet (Directly Embeddable)

architecture:
  version: "v1.0"
  backbone: "vit-b"
  topology:
    - {name: "patchify", type: "conv", params: {k: 16, s: 16, out: 768}}
    - {name: "enc1", type: "transformer_block", repeat: 12,
       params: {dim: 768, heads: 12, mlp_ratio: 4.0, act: "gelu", norm: "ln"}}
    - {name: "head", type: "linear", params: {out_dim: 1000}}
  positional_encoding: {type: "sinusoidal", dim: 768}
  mixed_precision: {train: "bf16", infer: "bf16", loss_scale: "dynamic"}
  init: {scheme: "trunc_normal", seed: 1701}
  params_report: {M_param: 86.6, FLOPs: 1.8e10, T_inf: 6.2}
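To illustrate how an M_param figure such as the one above might be cross-checked, the following sketch counts trainable parameters for the listed ViT-B topology. Several inputs are assumptions, not stated in the snippet: 3 input channels, a class token, biases on every linear and convolution layer, a final LayerNorm before the head, and (for the learned-position variant) 197 tokens, i.e. a 224x224 input with 16x16 patches plus the class token. All function names are hypothetical.

```python
# Hypothetical parameter-count sketch for the ViT-B snippet.
# Assumptions (not in the snippet): in_ch=3, class token, biases everywhere,
# final LayerNorm, and 197 tokens when positions are learned.

def linear_params(n_in, n_out, bias=True):
    return n_in * n_out + (n_out if bias else 0)


def conv2d_params(c_in, c_out, k, bias=True):
    return c_in * k * k * c_out + (c_out if bias else 0)


def layernorm_params(dim):
    return 2 * dim  # scale and shift


def transformer_block_params(dim, mlp_ratio):
    hidden = int(dim * mlp_ratio)
    attn = linear_params(dim, 3 * dim) + linear_params(dim, dim)  # qkv + output projection
    mlp = linear_params(dim, hidden) + linear_params(hidden, dim)
    return attn + mlp + 2 * layernorm_params(dim)


def vit_param_count(dim=768, depth=12, mlp_ratio=4.0, patch=16,
                    in_ch=3, num_classes=1000, learned_pos_tokens=0):
    total = conv2d_params(in_ch, dim, patch)            # patchify
    total += dim                                        # class token (assumed)
    total += learned_pos_tokens * dim                   # 0 for sinusoidal encoding
    total += depth * transformer_block_params(dim, mlp_ratio)
    total += layernorm_params(dim)                      # final norm (assumed)
    total += linear_params(dim, num_classes)            # head
    return total


sinusoidal = vit_param_count()
learned = vit_param_count(learned_pos_tokens=197)
print(sinusoidal, round(sinusoidal / 1e6, 1))
print(learned, round(learned / 1e6, 1))
```

Under these assumptions the count is 86,416,360 (86.4 M) without learned position parameters and 86,567,656 (86.6 M) with them, so the reported M_param depends on whether position embeddings are counted. The counting convention should therefore be recorded alongside the number.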


X. Interface with Path-Dependent Quantities (If Applicable)

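If the architecture contains path-dependent operators or subnetworks (e.g., a learnable refractive-index mapping or a time-delay estimation head), register the following in the model card:
  1. path_dependence.delta_form, path="gamma(ell)", measure="d ell";
  2. two equivalent expressions for T_arr:
    • T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
    • T_arr = ( ∫ ( n_eff / c_ref ) d ell );
      both must pass check_dim.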
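The two expressions for T_arr can be compared numerically. The sketch below is illustrative only: it assumes c_ref is constant along the path (the condition under which the factor can be moved outside the integral), uses a made-up n_eff profile, and approximates the integral with the trapezoid rule. The function names are hypothetical, and check_dim itself, which belongs to the metrology volume, is not reimplemented here.

```python
import math

# Hypothetical numeric comparison of the two T_arr forms.
# Assumptions: c_ref is constant along the path; n_eff profile is invented.

def trapezoid(ys, xs):
    """Trapezoid-rule integral of samples ys over the grid xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))


def t_arr_factored(n_eff, ell, c_ref):
    """T_arr = (1 / c_ref) * integral of n_eff d ell."""
    return trapezoid(n_eff, ell) / c_ref


def t_arr_inside(n_eff, ell, c_ref):
    """T_arr = integral of (n_eff / c_ref) d ell."""
    return trapezoid([n / c_ref for n in n_eff], ell)


c_ref = 299_792_458.0                                    # m/s, assumed reference speed
ell = [10.0 * i for i in range(101)]                     # path samples, 0 to 1000 m
n_eff = [1.0 + 1e-4 * math.sin(x / 200.0) for x in ell]  # dimensionless, invented

t1 = t_arr_factored(n_eff, ell, c_ref)
t2 = t_arr_inside(n_eff, ell, c_ref)
print(t1, t2, math.isclose(t1, t2, rel_tol=1e-12))
```

With n_eff dimensionless, d ell in metres, and c_ref in metres per second, both forms yield seconds, which is the kind of consistency check_dim is expected to confirm.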

XI. Machine-Readable Schema Snippet (Normative)

# I15-7 Architecture & Params (excerpt)
properties:
  architecture:
    type: object
    required: [version, backbone, topology]
    properties:
      version: {type: string}
      backbone: {type: string}
      topology: {type: array, items: {type: object, properties: {
        name: {type: string}, type: {type: string}, repeat: {type: integer},
        params: {type: object}}}}
      positional_encoding: {type: object}
      norm: {type: object}
      act: {type: object}
      dropout: {type: object}
      attention: {type: object}
      mixed_precision: {type: object}
      init: {type: object, properties: {scheme: {type: string}, seed: {type: integer}}}
      params_report: {type: object, properties: {M_param: {type: number}, FLOPs: {type: number}, T_inf: {type: number}}}
      constraints: {type: object}

(Units and dimensions are declared at the Schema layer and checked by "Core.Metrology v1.0"; reference anchors use the format "VolumeName vX.Y:anchor".)
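As a rough illustration of what the schema excerpt enforces, the following standard-library sketch checks the required keys and declared types of an architecture object. It is a hand-rolled stand-in rather than a full JSON Schema validator, and every name in it (`TYPES`, `validate_architecture`, and so on) is hypothetical.

```python
# Hypothetical, minimal check of the schema excerpt (required keys and types only).
TYPES = {"string": str, "number": (int, float), "integer": int,
         "object": dict, "array": list}

ARCH_REQUIRED = ["version", "backbone", "topology"]
ARCH_PROPS = {
    "version": "string", "backbone": "string", "topology": "array",
    "positional_encoding": "object", "norm": "object", "act": "object",
    "dropout": "object", "attention": "object", "mixed_precision": "object",
    "init": "object", "params_report": "object", "constraints": "object",
}
NESTED_PROPS = {
    "init": {"scheme": "string", "seed": "integer"},
    "params_report": {"M_param": "number", "FLOPs": "number", "T_inf": "number"},
}
STAGE_PROPS = {"name": "string", "type": "string",
               "repeat": "integer", "params": "object"}


def has_type(value, type_name):
    if isinstance(value, bool):
        return False  # Python bools are ints; none of the types used here accept them
    return isinstance(value, TYPES[type_name])


def check_props(obj, props, where):
    return [f"{where}.{key}: expected {type_name}"
            for key, type_name in props.items()
            if key in obj and not has_type(obj[key], type_name)]


def validate_architecture(arch):
    """Return a list of violations; an empty list means the excerpt is satisfied."""
    if not isinstance(arch, dict):
        return ["architecture: expected object"]
    errors = [f"architecture.{k}: required" for k in ARCH_REQUIRED if k not in arch]
    errors += check_props(arch, ARCH_PROPS, "architecture")
    for name, props in NESTED_PROPS.items():
        if isinstance(arch.get(name), dict):
            errors += check_props(arch[name], props, f"architecture.{name}")
    topology = arch.get("topology")
    if isinstance(topology, list):
        for i, stage in enumerate(topology):
            where = f"architecture.topology[{i}]"
            if isinstance(stage, dict):
                errors += check_props(stage, STAGE_PROPS, where)
            else:
                errors.append(f"{where}: expected object")
    return errors


candidate = {
    "version": "v1.0",
    "topology": [{"name": "enc1", "type": "transformer_block", "repeat": "12"}],
    "init": {"scheme": "trunc_normal", "seed": 1701},
}
print(validate_architecture(candidate))
```

The usage example reports the missing `backbone` and the string-valued `repeat`. Unit and dimension checks remain the responsibility of the metrology volume and are outside this sketch.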


XII. Chapter Compliance Self-Check


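Copyright and License (CC BY 4.0)

Copyright notice: Unless otherwise stated, the copyright in "Energy Filament Theory" (including text, charts, illustrations, symbols, and formulas) belongs to the author, Mr. 屠广林.
License: This work is licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0). Copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes, provided the author and source are credited.
Suggested attribution format: Author: 屠广林; Work: "Energy Filament Theory"; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/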