EFT.WP.Data.ModelCards v1.0
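I. Purpose and Scope
This chapter fixes the normative definitions and on-disk conventions for task and io_schema. It covers input/output fields, shape/dtype/range/semantics, batching and streaming, and constraint and validation rules, and keeps them consistent with the evaluation protocol, the deployment interfaces, and the metrology chapter.
II. Terminology and Dependencies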
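- Terminology source: follows the "EFT Technical White Paper and Technical Memo Templates: Comprehensive Checklist v0.1"; this chapter only adds incremental restrictions for fields specific to task and I/O.
- Dependent volumes: data contract and export, Core.DataSpec v1.0; metrology and dimensional checks, Core.Metrology v1.0; path-related expressions, Core.Equations v1.1. All inline symbols are wrapped in backticks (e.g. `f_θ(x)`, `p(y|x,θ)`); expressions containing division, integrals, or compound operators must be parenthesized; formulas, symbols, and definitions must not contain Chinese.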
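III. Task Types and the task Field (Normative)
task: "<classification|retrieval|generation|asr|segmentation|detection|timeseries|forecasting|ranking|regression>"
task must be consistent with the I/O contracts of evaluation.protocol and io_schema. For a multi-task model, use an array and provide a named sub-schema for each task in io_schema.
IV. Top-Level Structure of io_schema (Normative)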
io_schema:
  version: "v1.0"
  inputs:
    - {name: "<string>", shape: "<(…)>", dtype: "<float32|int8|uint8|…>", range: "<[lo,hi] or N/A>", semantics: "<tokenized|waveform|rgb|…>"}
  outputs:
    - {name: "<string>", shape: "<(…)>", dtype: "<float32|int8|…>", range: "<[lo,hi]|[0,1]|N/A>", semantics: "<logits|probs|classes|spans|boxes|…>"}
  batching: {mode: "<static|dynamic>", max_batch: <int>}
  streaming: {enabled: <bool>, chunk_ms: <int?>, lookahead_ms: <int?>}
  constraints:
    - {type: "shape_compatible", of: ["inputs[0]","outputs[0]"]}
    - {type: "range", target: "outputs[probs]", rule: "[0,1] & sum==1±1e-6"}
  see:
    - "EFT.WP.Core.Metrology v1.0:check_dim"
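The legal values for shape/dtype/range are fixed in the Schema; distributed and streaming deployments must declare the window and latency fields explicitly under streaming.
V. Task-Specific I/O Patterns (Normative Examples)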
- Image classification (classification)
  task: "classification"
  io_schema:
    inputs:
      - {name: "image", shape: "(H,W,3)", dtype: "uint8", range: "[0,255]", semantics: "rgb"}
    outputs:
      - {name: "probs", shape: "(K,)", dtype: "float32", range: "[0,1]", semantics: "softmax"}
    batching: {mode: "dynamic", max_batch: 128}
    constraints:
      - {type: "range", target: "outputs[probs]", rule: "[0,1] & sum==1±1e-6"}
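The range rule "[0,1] & sum==1±1e-6" in the classification example can be checked along the following lines. This is a minimal sketch, not the chapter's reference validator: the helper names check_probs and softmax are hypothetical, and the tolerance is simply the 1e-6 quoted in the rule.

```python
import math


def check_probs(probs, tol=1e-6):
    """Return True if every entry lies in [0,1] and the entries sum to 1 within tol."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    return in_range and abs(math.fsum(probs) - 1.0) <= tol


def softmax(logits):
    """Numerically stable softmax, used here only to produce a valid test vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = math.fsum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(check_probs(probs))            # a softmax vector satisfies the rule
print(check_probs([0.7, 0.2, 0.2]))  # sums to 1.1, so the rule is violated
```

Outputs stored as float32 may need a looser tolerance in practice; that choice is left to the Schema.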
- Text generation (generation)
  task: "generation"
  io_schema:
    inputs:
      - {name: "tokens_in", shape: "(T_in,)", dtype: "int32", range: "[0,V)", semantics: "tokenized"}
    outputs:
      - {name: "tokens_out", shape: "(T_out,)", dtype: "int32", range: "[0,V)", semantics: "tokenized"}
      - {name: "logprobs", shape: "(T_out,V)", dtype: "float32", range: "(-∞,0]", semantics: "log-softmax"}
    streaming: {enabled: true, chunk_ms: 50, lookahead_ms: 0}
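The generation contract above combines an index range ([0,V)), a value range ((-∞,0]), and a shape relation between tokens_out and logprobs. The sketch below checks all three on plain Python lists. The function names are hypothetical, and the per-row normalization check is an assumption derived from the "log-softmax" semantics rather than a rule stated in this chapter.

```python
import math


def log_softmax(logits):
    """Stable log-softmax, used only to build a valid example row."""
    m = max(logits)
    lse = m + math.log(math.fsum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def check_generation(tokens_out, logprobs, vocab_size, tol=1e-6):
    """Check tokens_out (T_out,) and logprobs (T_out, V) against the declared ranges."""
    if len(logprobs) != len(tokens_out):
        return False  # first dimension must be T_out for both outputs
    if any(not (0 <= t < vocab_size) for t in tokens_out):
        return False  # token ids must lie in [0, V)
    for row in logprobs:
        if len(row) != vocab_size or any(lp > 0.0 for lp in row):
            return False  # each row has V entries, all in (-inf, 0]
        # Assumption: log-softmax rows exponentiate to a distribution.
        if abs(math.fsum(math.exp(lp) for lp in row) - 1.0) > tol:
            return False
    return True


rows = [log_softmax([0.0, 1.0, 2.0]), log_softmax([3.0, 0.0, 0.0])]
print(check_generation([2, 0], rows, vocab_size=3))
```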
- Speech recognition (asr)
  task: "asr"
  io_schema:
    inputs:
      - {name: "waveform", shape: "(T,)", dtype: "float32", range: "[-1,1]", semantics: "pcm"}
    outputs:
      - {name: "text", shape: "()", dtype: "string", range: "N/A", semantics: "utf-8"}
    constraints:
      - {type: "sampling_rate", target: "inputs[waveform]", rule: "f_samp==16000 Hz"}
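As an illustration of the ASR input contract, the sketch below checks a waveform against the declared range and sampling rate. The helper names are hypothetical, and the int16-to-float scaling by 32768 is a common convention assumed here, not something this chapter prescribes.

```python
def int16_to_float(pcm):
    """Scale 16-bit PCM samples into [-1, 1) (assumed convention: divide by 32768)."""
    return [s / 32768.0 for s in pcm]


def check_waveform(samples, sample_rate_hz, expected_rate_hz=16000):
    """Return True if the rate matches f_samp and every sample lies in [-1, 1]."""
    if sample_rate_hz != expected_rate_hz:
        return False
    return all(-1.0 <= s <= 1.0 for s in samples)


wave = int16_to_float([-32768, 0, 32767])
print(check_waveform(wave, 16000))   # in range, correct rate
print(check_waveform(wave, 44100))   # wrong sampling rate
```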
- Object detection (detection)
  task: "detection"
  io_schema:
    inputs:
      - {name: "image", shape: "(H,W,3)", dtype: "uint8", range: "[0,255]", semantics: "rgb"}
    outputs:
      - {name: "boxes", shape: "(N,4)", dtype: "float32", range: "[0,1]", semantics: "xywh_norm"}
      - {name: "scores", shape: "(N,)", dtype: "float32", range: "[0,1]", semantics: "objectness"}
      - {name: "labels", shape: "(N,)", dtype: "int32", range: "[0,K)", semantics: "class_id"}
    constraints:
      - {type: "range", target: "outputs[boxes]", rule: "[0,1]"}
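The three detection outputs share the leading dimension N and each carries its own range. A minimal sketch of that check, plus a conversion from normalized boxes to pixel units, might look as follows. The function names are hypothetical; the scaling in to_pixels is the same whether (x, y) denotes the box corner or the box center, which this chapter does not specify.

```python
def check_detection(boxes, scores, labels, num_classes):
    """Check boxes (N,4), scores (N,), labels (N,) against the declared ranges."""
    n = len(boxes)
    if len(scores) != n or len(labels) != n:
        return False  # all outputs must agree on N
    for box in boxes:
        if len(box) != 4 or any(not (0.0 <= v <= 1.0) for v in box):
            return False  # xywh_norm values must lie in [0, 1]
    if any(not (0.0 <= s <= 1.0) for s in scores):
        return False
    return all(0 <= k < num_classes for k in labels)


def to_pixels(box, width, height):
    """Scale a normalized (x, y, w, h) box to pixel units."""
    x, y, w, h = box
    return (x * width, y * height, w * width, h * height)


boxes = [(0.5, 0.25, 0.125, 0.5)]
print(check_detection(boxes, [0.9], [3], num_classes=10))
print(to_pixels(boxes[0], width=640, height=480))
```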
- Time-series forecasting (timeseries|forecasting)
  task: "forecasting"
  io_schema:
    inputs:
      - {name: "series", shape: "(T, C)", dtype: "float32", semantics: "zscore"}
      - {name: "time_index", shape: "(T,)", dtype: "int64", semantics: "unix_ms"}
    outputs:
      - {name: "y_hat", shape: "(H, C)", dtype: "float32", semantics: "forecast"}
      - {name: "q_hat", shape: "(H, C, Q)", dtype: "float32", semantics: "quantiles"}
    constraints:
      - {type: "unit", target: "inputs[series]", rule: "SI; check_dim==true"}
(When physical quantities or time/frequency are involved, units and dimensions are checked uniformly by Core.Metrology v1.0, with check_dim=true.)
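The symbolic shapes in the forecasting example imply relations between tensors: series and y_hat share C, and q_hat extends y_hat by a quantile axis Q. The sketch below checks those relations on nested Python lists. The helper names are hypothetical, and shape_of assumes rectangular (non-ragged) nesting.

```python
def shape_of(x):
    """Return the shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0] if x else None
    return tuple(dims)


def check_forecast(series, y_hat, q_hat):
    """Check series (T, C), y_hat (H, C), q_hat (H, C, Q) for mutual consistency."""
    s, y, q = shape_of(series), shape_of(y_hat), shape_of(q_hat)
    if len(s) != 2 or len(y) != 2 or len(q) != 3:
        return False
    # y_hat must share C with series; q_hat must share (H, C) with y_hat.
    return y[1] == s[1] and q[:2] == y


series = [[0.0, 1.0] for _ in range(4)]                      # T=4, C=2
y_hat = [[0.0, 0.0] for _ in range(3)]                       # H=3, C=2
q_hat = [[[0.0] * 5 for _ in range(2)] for _ in range(3)]    # H=3, C=2, Q=5
print(check_forecast(series, y_hat, q_hat))
```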
VI. Multi-Task and Multimodal I/O
task: ["classification","retrieval"]
io_schema:
  version: "v1.0"
  modes:
    classification:
      inputs: [{name: "image", shape: "(H,W,3)", dtype: "uint8", range: "[0,255]"}]
      outputs: [{name: "probs", shape: "(K,)", dtype: "float32", range: "[0,1]", semantics: "softmax"}]
    retrieval:
      inputs: [{name: "query_emb", shape: "(D,)", dtype: "float32"}]
      outputs: [{name: "doc_ids", shape: "(M,)", dtype: "int64"}]
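The multi-task layout above requires every entry of task to have a named sub-schema. A minimal sketch of that check is shown below, assuming the model card has already been parsed into a Python dict and that sub-schemas live under io_schema.modes as in the example; check_tasks is a hypothetical helper name.

```python
ALLOWED_TASKS = {
    "classification", "retrieval", "generation", "asr", "segmentation",
    "detection", "timeseries", "forecasting", "ranking", "regression",
}


def check_tasks(card):
    """Return a list of error strings; an empty list means the task field is consistent."""
    task = card.get("task")
    tasks = task if isinstance(task, list) else [task]
    errors = [f"unknown task: {t}" for t in tasks if t not in ALLOWED_TASKS]
    if isinstance(task, list):
        # Multi-task cards need one named sub-schema per task.
        modes = card.get("io_schema", {}).get("modes", {})
        errors += [f"missing sub-schema: {t}" for t in tasks if t not in modes]
    return errors


card = {
    "task": ["classification", "retrieval"],
    "io_schema": {"version": "v1.0", "modes": {"classification": {}, "retrieval": {}}},
}
print(check_tasks(card))
```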
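For multi-task models, evaluation.metrics must be listed per task; if deployment.forms routes the tasks differently, this must be reflected in the interfaces of Chapters 11 and 16.
VII. Validation Rules and Lint Constraints (Normative)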
lint_rules:
  - id: IO.RANGE_PROBS
    when: "$.io_schema.outputs[?(@.semantics=='softmax')]"
    assert: "range == '[0,1]'"
    level: error
  - id: IO.SHAPE_NONEMPTY
    when: "$.io_schema.inputs[*].shape"
    assert: "matches('^\\(') and contains(',')"
    level: error
  - id: IO.DTYPE_ALLOWED
    when: "$.io_schema.inputs[*].dtype"
    assert: "in_(['float16','float32','int8','int16','int32','int64','uint8','string'])"
    level: error
  - id: IO.METROLOGY_CHECKDIM
    when: "$.io_schema"
    assert: "metrology.units=='SI' and metrology.check_dim==true"
    level: error
(The lint rules above run together with the Schema in Chapter 15; any failure is a blocking item.)
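To make the intent of the four rules concrete, the sketch below re-implements them in plain Python over a parsed model card, without a JSONPath engine. It is an illustration under assumptions, not the project's lint tool: the function name lint is hypothetical, metrology is assumed to be a top-level key of the card, and the dtype set includes int64 because the forecasting example declares an int64 input (adjust it to match the rule list in force).

```python
import re

# Assumption: mirrors IO.DTYPE_ALLOWED, with int64 included for time_index.
DTYPES = {"float16", "float32", "int8", "int16", "int32", "int64", "uint8", "string"}


def lint(card):
    """Return the ids of violated rules, in evaluation order."""
    io = card.get("io_schema", {})
    errors = []
    for out in io.get("outputs", []):
        # IO.RANGE_PROBS: softmax outputs must declare range "[0,1]".
        if out.get("semantics") == "softmax" and out.get("range") != "[0,1]":
            errors.append("IO.RANGE_PROBS")
    for inp in io.get("inputs", []):
        shape = inp.get("shape", "")
        # IO.SHAPE_NONEMPTY: shape starts with "(" and contains a comma.
        if not (re.match(r"^\(", shape) and "," in shape):
            errors.append("IO.SHAPE_NONEMPTY")
        # IO.DTYPE_ALLOWED: dtype must be in the allowed set.
        if inp.get("dtype") not in DTYPES:
            errors.append("IO.DTYPE_ALLOWED")
    # IO.METROLOGY_CHECKDIM: SI units with check_dim enabled.
    met = card.get("metrology", {})
    if not (met.get("units") == "SI" and met.get("check_dim") is True):
        errors.append("IO.METROLOGY_CHECKDIM")
    return errors


card = {
    "io_schema": {
        "inputs": [{"name": "image", "shape": "(H,W,3)", "dtype": "uint8"}],
        "outputs": [{"name": "probs", "range": "[0,1]", "semantics": "softmax"}],
    },
    "metrology": {"units": "SI", "check_dim": True},
}
print(lint(card))
```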
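VIII. Consistency with the Evaluation Protocol and Deployment Interfaces
- evaluation.protocol.splits="frozen"; the I/O names in io_schema match the parameter names of the evaluation scripts;
- On the deployment side, the request/response JSON Schema of /v1/score or of the streaming endpoint must match io_schema, and an OpenAPI fragment must be given in the API bindings of Chapter 16.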
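The name-consistency requirement between io_schema and the evaluation scripts can be checked mechanically. The sketch below compares the declared I/O names with the parameters of an evaluation function using inspect.signature from the standard library. The function score_top1 and the helper names are hypothetical stand-ins; how real evaluation scripts expose their parameters is not specified in this chapter.

```python
import inspect


def io_names(io_schema):
    """Collect the declared input and output field names."""
    return {f["name"] for key in ("inputs", "outputs") for f in io_schema.get(key, [])}


def missing_params(io_schema, eval_fn):
    """Return the I/O names that eval_fn does not accept as parameters, sorted."""
    params = set(inspect.signature(eval_fn).parameters)
    return sorted(io_names(io_schema) - params)


def score_top1(image, probs, label):
    """Hypothetical evaluation entry point; the body is irrelevant for the name check."""
    return None


schema = {
    "inputs": [{"name": "image"}],
    "outputs": [{"name": "probs"}],
}
print(missing_params(schema, score_top1))
```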
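IX. Metrology and Path Dependence (If Applicable)
If the model I/O involves path-dependent quantities (such as T_arr), the model card must register delta_form, path="gamma(ell)", and measure="d ell". The two equivalent forms are:
- T_arr = ( 1 / c_ref ) * ( ∫ n_eff d ell )
- T_arr = ( ∫ ( n_eff / c_ref ) d ell )
Both forms must pass check_dim.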
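The two forms of T_arr agree whenever c_ref is constant along the path, since a constant factor can be moved across the integral. A small numerical sketch of that equivalence follows; the linear n_eff profile, the path length, and the value of c_ref are illustrative assumptions, not values taken from this document.

```python
import math


def integrate(f, a, b, n=1000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * math.fsum(f(a + (i + 0.5) * h) for i in range(n))


C_REF = 299_792_458.0  # assumed constant reference speed, in m/s


def n_eff(ell):
    """Hypothetical effective-index profile along the path (dimensionless)."""
    return 1.0 + 0.001 * ell


# Form 1: constant factor outside the integral.
t_arr_outside = (1.0 / C_REF) * integrate(n_eff, 0.0, 10.0)
# Form 2: constant factor inside the integrand.
t_arr_inside = integrate(lambda ell: n_eff(ell) / C_REF, 0.0, 10.0)

print(math.isclose(t_arr_outside, t_arr_inside, rel_tol=1e-9))
```

If c_ref varied along the path, only the second form would remain meaningful, which is one reason to register the measure and path explicitly.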
X. Machine-Readable Snippet (Ready to Embed)
task: "classification"
io_schema:
  version: "v1.0"
  inputs: [{name: "image", shape: "(H,W,3)", dtype: "uint8", range: "[0,255]", semantics: "rgb"}]
  outputs: [{name: "probs", shape: "(K,)", dtype: "float32", range: "[0,1]", semantics: "softmax"}]
  batching: {mode: "dynamic", max_batch: 128}
  streaming: {enabled: false}
  see:
    - "EFT.WP.Core.Metrology v1.0:check_dim"
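XI. Compliance Self-Check for This Chapter
- task and io_schema are defined and consistent with evaluation.protocol and the deployment interfaces; multi-task models provide named sub-schemas.
- io_schema states shape/dtype/range/semantics and the batching and streaming conventions explicitly; fields involving physical quantities have passed check_dim.
- Softmax outputs satisfy the range and normalization constraints; all mathematics and symbols are written with backticks and parentheses as required and contain no Chinese.
- Where path quantities (such as T_arr) are involved, delta_form/path/measure are registered and the metrology check is complete.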
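Copyright and License (CC BY 4.0)
Copyright notice: Unless otherwise stated, the copyright in Energy Filament Theory (《能量丝理论》), including its text, charts, illustrations, symbols, and formulas, belongs to the author, Mr. 屠广林.
License: This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Provided that the author and source are credited, it may be copied, reposted, excerpted, adapted, and redistributed for commercial or non-commercial purposes.
Suggested attribution: Author: 屠广林; Work: Energy Filament Theory; Source: energyfilament.org; License: CC BY 4.0.
First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/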