Catalog Document - Technical White Paper 52 - Dataset Card Template v1.0

Chapter 10 Interfaces, Manifest, and Access (API/Manifest/Access)


I. Purpose & Scope


II. Prerequisites & Inputs


III. Return Envelope & Error Semantics

{
  "status": "OK|ERROR",
  "code": 0,
  "error": {"type": "E_*", "message": "...", "details": {}},
  "payload": {},
  "metrics": {"latency_ms": 123, "retries": 0},
  "anchors": ["EFT.WP.Core.Equations v1.1:S20-1"],
  "version": "1.0.0",
  "checksum": "sha256:..."
}
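
The envelope above can be unpacked on the client side roughly as follows. This is a minimal sketch, not part of the specification: it assumes `status` is exactly `"OK"` or `"ERROR"` and that `error` is populated only on failure. The names `unwrap_envelope`, `EnvelopeError`, `REQUIRED_KEYS`, and the fallback type `E_UNKNOWN` are hypothetical helpers introduced for illustration.

```python
import json

# Assumption: these keys are always present in a well-formed envelope.
REQUIRED_KEYS = ("status", "code", "payload", "version", "checksum")


class EnvelopeError(Exception):
    """Raised when the envelope reports status == "ERROR" (hypothetical helper)."""

    def __init__(self, error_type, message, details=None):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.details = details or {}


def unwrap_envelope(raw):
    """Parse a JSON envelope string and return its payload on success."""
    envelope = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in envelope]
    if missing:
        raise ValueError(f"envelope is missing keys: {missing}")
    if envelope["status"] == "ERROR":
        error = envelope.get("error") or {}
        raise EnvelopeError(
            error.get("type", "E_UNKNOWN"),  # placeholder, not a documented code
            error.get("message", ""),
            error.get("details"),
        )
    if envelope["status"] != "OK":
        raise ValueError(f"unexpected status: {envelope['status']!r}")
    return envelope["payload"]
```

Raising on `ERROR` rather than returning the error object keeps callers from accidentally treating an empty `payload` as a valid result.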


IV. REST OpenAPI (excerpt)

openapi: 3.0.3
info: { title: "Dataset API", version: "1.0.0" }
servers: [{ url: "https://datasets.example.com/api/v1" }]
components:
  securitySchemes: { bearerAuth: { type: http, scheme: bearer } }
  schemas:
    GetRequest:
      type: object
      properties:
        dataset_id: { type: string }
        split: { type: string, enum: ["train","val","test","holdout"] }
        filter: { type: object }
        fields: { type: array, items: { type: string } }
    GetResponse:
      type: object
      properties:
        status: { type: string }
        payload: { type: array, items: { type: object } }
        anchors: { type: array, items: { type: string } }
        version: { type: string }
        checksum: { type: string }
paths:
  /datasets/{id}/info:
    get:
      security: [{ bearerAuth: [] }]
      summary: "Get dataset manifest info"
      parameters: [{ in: path, name: id, required: true, schema: { type: string } }]
      responses:
        "200": { description: "OK" }
  /datasets/{id}/get:
    post:
      security: [{ bearerAuth: [] }]
      summary: "Idempotent read of records"
      requestBody:
        required: true
        content: { application/json: { schema: { $ref: "#/components/schemas/GetRequest" } } }
      responses:
        "200": { description: "OK", content: { application/json: { schema: { $ref: "#/components/schemas/GetResponse" } } } }
        "409": { description: "Idempotency conflict" }
  /datasets/{id}/validate:
    post:
      summary: "Validate gates G1–G8 and report stops"
      responses: { "200": { description: "Validation report" } }
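
To illustrate the `/datasets/{id}/get` operation, the sketch below builds (but does not send) a request with Python's standard library. The URL, the `split` enum, and the bearer scheme come from the excerpt above; the function name `build_get_request` and the token value are hypothetical. How an idempotency key is transmitted over REST is not stated in the excerpt, so it is left out here.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://datasets.example.com/api/v1"  # from the `servers` entry
ALLOWED_SPLITS = {"train", "val", "test", "holdout"}  # GetRequest.split enum


def build_get_request(dataset_id, token, split, fields=None, filter_=None):
    """Build a POST /datasets/{id}/get request without sending it."""
    if split not in ALLOWED_SPLITS:
        raise ValueError(f"split must be one of {sorted(ALLOWED_SPLITS)}")
    body = {
        "dataset_id": dataset_id,
        "split": split,
        "filter": filter_ or {},
        "fields": list(fields or []),
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/datasets/{quote(dataset_id, safe='')}/get",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # bearerAuth security scheme
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the response body would then be expected to follow the `GetResponse` schema.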


V. gRPC (proto excerpt)

syntax = "proto3";

package dataset.v1;

message GetRequest {
  string dataset_id = 1;
  string split = 2;            // train|val|test|holdout
  repeated string fields = 3;
  bytes filter = 4;            // JSON
}

message GetResponse {
  string status = 1;
  bytes payload = 2;           // Parquet/JSON
  repeated string anchors = 3;
  string version = 4;
  string checksum = 5;
}

service DatasetService {
  rpc Info(GetRequest) returns (GetResponse);
  rpc Get(GetRequest) returns (GetResponse);
  rpc Validate(GetRequest) returns (GetResponse);
}
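
Because `GetRequest.filter` is a `bytes` field carrying JSON, both sides need to agree on how the filter object is serialised. The sketch below shows one possible convention: UTF-8 JSON with sorted keys and compact separators, so that equal filters yield identical bytes. This convention and the helper names `encode_filter` and `decode_filter` are assumptions for illustration; the proto excerpt only states that the field holds JSON.

```python
import json


def encode_filter(filter_obj):
    """Serialise a filter mapping to UTF-8 JSON bytes for GetRequest.filter.

    Assumption: sorted keys and compact separators are used so that the
    same filter always produces the same byte string.
    """
    text = json.dumps(filter_obj, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def decode_filter(raw):
    """Decode GetRequest.filter; empty bytes (the proto3 default) mean no filter."""
    return json.loads(raw.decode("utf-8")) if raw else {}
```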


VI. CLI (excerpt)

# Manifest
ds info ds-core --out reports/manifest_view.json

# Idempotent read
ds get ds-core --split test --fields obs.T_arr,obs.Phi --filter @filters.json \
  --idempotency_key run42+p010+win001 --out data/test_subset.parquet

# Gate validation
ds validate ds-core --out reports/validate_report.json


VII. Idempotency, Rate, Audit


VIII. Manifest Specification (report_manifest.yaml)

version: "1.0.0"
dataset_id: "ds-core"
schema: "schemas/dataset/schema.json"
contract: "contracts/contract.yaml"
splits: "splits/split_manifest.json"
reports:
  - "reports/check_dim_report.json"
  - "reports/validate_report.json"
  - "reports/audit.jsonl"
figs:
  - "figs/scale_dist.pdf"
  - "figs/path_profile.pdf"
checksums:
  schema: "sha256:..."
  contract: "sha256:..."
  splits: "sha256:..."
sign: "SIGNATURE.asc"
see:
  - "EFT.WP.Core.Metrology v1.0:check_dim"
  - "EFT.WP.Core.Equations v1.1:S20-1"
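
The `checksums` block of the manifest can be checked against the files it refers to. The following is a minimal sketch under stated assumptions: the manifest has already been parsed into a Python dict (for example by a YAML library), each key under `checksums` names a top-level manifest field holding a path relative to some root directory, and each value has the form `sha256:<hex digest>`. The function names `sha256_of` and `verify_manifest_checksums` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return "sha256:<hex>" for a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def verify_manifest_checksums(manifest, root="."):
    """Return {key: bool} for every entry under manifest["checksums"].

    A key maps to False when its path is not declared in the manifest,
    the file is missing, or the digest does not match.
    """
    results = {}
    for key, expected in manifest.get("checksums", {}).items():
        relative = manifest.get(key)
        target = Path(root) / relative if relative else None
        results[key] = bool(
            target and target.is_file() and sha256_of(target) == expected
        )
    return results
```

Returning a per-key result rather than raising on the first mismatch makes it straightforward to record every failing artifact in a validation report.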


IX. Access Control & Security


X. Path-Quantity Interface Specification (Path-Specific)


XI. Quality Gates Mapping (Gates Mapping)


XII. Anti-Patterns & Fixes


XIII. Cross-References


XIV. Execution Checklist


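Copyright and License (CC BY 4.0)

Copyright notice: Unless otherwise stated, the copyright in Energy Filament Theory (《能量丝理论》), including its text, charts, illustrations, symbols, and formulas, belongs to the author (Mr. "屠广林").
License: This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). Provided the author and source are credited, copying, reposting, excerpting, adaptation, and redistribution are permitted for commercial or non-commercial purposes.
Suggested attribution format: Author: "屠广林"; Work: Energy Filament Theory; Source: energyfilament.org; License: CC BY 4.0.

First published: 2025-11-11 | Current version: v5.1
License link: https://creativecommons.org/licenses/by/4.0/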