Schema Validation — validate.py
validate.py ทำอะไร
Script scripts/validate.py ตรวจสอบว่า subagent output ตรงกับ JSON schema — เลย orchestrator ไม่ได้ consume malformed data
Usage:
python3 scripts/validate.py output.json schema.json
Exit 0 = valid, exit 1 = invalid
Scenario: GL Reader Output
GL Reconciler agent — leaf worker reader อ่านจาก untrusted GL statement document
Output ต้องตรงกับ schema นี้:
{
"type": "object",
"properties": {
"breaks": {
"type": "array",
"items": {
"type": "object",
"properties": {
"trade_date": { "type": "string", "format": "date" },
"break_amount": { "type": "number" },
"description": { "type": "string", "maxLength": 500 }
},
"required": ["trade_date", "break_amount"]
}
}
},
"required": ["breaks"]
}
Reader output:
{
"breaks": [
{
"trade_date": "2026-05-10",
"break_amount": 150000.50,
"description": "FX conversion mismatch"
}
]
}
Validate:
python3 scripts/validate.py reader-output.json schema.json
# OK
Invalid output:
{
"breaks": [
{
"trade_date": "2026-05-10",
"break_amount": "150000.50" # ← ควรเป็น number, ไม่ string!
}
]
}
python3 scripts/validate.py reader-output.json schema.json
# INVALID: '150000.50' is not of type 'number' at /breaks/0/break_amount
Define Schema ใน Subagent Manifest
Subagent yaml มี output_schema block:
# managed-agent-cookbooks/gl-reconciler/subagents/reader.yaml
name: reader
model: claude-opus-4-7
system:
text: "Parse GL statement. Return JSON only."
tools:
- type: agent_toolset_20260401
default_config: { enabled: false }
configs:
- name: read
enabled: true
output_schema:
type: object
properties:
breaks:
type: array
items:
type: object
properties:
trade_date: { type: string, format: date }
break_amount: { type: number }
description: { type: string, maxLength: 500 }
required: ["trade_date", "break_amount"]
required: ["breaks"]
Deploy Script Integration
scripts/deploy-managed-agent.sh extract output_schema จาก subagent yaml:
# ใน deploy-managed-agent.sh
json=$(jq '.output_schema as $schema | ...' agent.yaml)
# เก็บ $schema ไว้
# หลังจาก worker complete:
python3 scripts/validate.py worker-output.json <extracted-schema>
if [[ $? -ne 0 ]]; then
# route error event back to worker
orchestrator.emit("validation_failed", worker_id=..., output=...)
fi
Schema ไฟล์ Type Formats
validate.py รองรับ YAML และ JSON:
# YAML schema
python3 scripts/validate.py output.json schema.yaml
# ✓ reads both formats
# JSON schema
python3 scripts/validate.py output.json schema.json
# ✓ works too
Common Schema Constraints
Prevent untrusted input from reaching downstream:
String Max Length
description:
type: string
maxLength: 500 # ← cap length so no DOS
Enum-only Values
status:
type: string
enum: ["VERIFIED", "UNVERIFIED"] # ← whitelist only
Array Max Items
breaks:
type: array
maxItems: 1000 # ← prevent memory explosion
Numeric Bounds
break_amount:
type: number
minimum: -1e9
maximum: 1e9 # ← reasonable bounds
No Additional Properties
type: object
additionalProperties: false # ← strict, no surprise fields
Testing Output Schema
Before deploying subagent:
# Generate test output (pretend output from worker)
cat > test-reader-output.json << 'EOF'
{
"breaks": [
{
"trade_date": "2026-05-10",
"break_amount": 150000.50
}
]
}
EOF
# Copy schema from subagent manifest
python3 scripts/validate.py test-reader-output.json \
managed-agent-cookbooks/gl-reconciler/subagents/reader-schema.json
# OK → proceed, else fix worker system prompt
Validation in Orchestrator Loop
Reference implementation scripts/orchestrate.py validates worker output:
import subprocess
import json
def validate_worker_output(worker_name, output, schema_path):
"""Validate worker output against schema."""
# Write temp output
with open("/tmp/worker_output.json", "w") as f:
json.dump(output, f)
# Run validate.py
result = subprocess.run(
["python3", "scripts/validate.py",
"/tmp/worker_output.json", schema_path],
capture_output=True
)
if result.returncode != 0:
# Validation failed
error_msg = result.stderr.decode()
emit_event("validation_failed", {
"worker": worker_name,
"error": error_msg,
"output": output
})
return False
return True
Error Message Format
Validation error ชัดเจน:
INVALID: 'string_value' is not of type 'number'
at /breaks/0/break_amount
Components:
'string_value'— actual valueis not of type 'number'— schema constraint/breaks/0/break_amount— path in JSON tree
Disable Validation (Dev Only)
ในช่วง development ถ้า worker output ยังไม่ stable:
# subagent.yaml
output_schema: null # disable validation temporarily
Or in orchestrator:
if os.getenv("SKIP_VALIDATION"):
return True # skip check
But before production deployment — enable validation สำหรับ security ทั้งหมด