Guide
Reviewer Toolkit
A static guide for human reviewers evaluating Teleodynamic AI evaluation packets. Covers the review checklist, red flags, claim boundary verification, and evidence assessment procedures.
Human review is required for any claim beyond architectural or framing status.
Review Checklist
Work through this checklist for each evaluation packet under review. Mark each item as verified before approving the packet.
| # | Check | What to Look For | Pass Condition |
|---|---|---|---|
| 1 | Packet Identity | packetId, createdUtc, agentSlug are present and valid | All three fields non-empty, timestamps ISO 8601 |
| 2 | Claim Status | claimStatus matches one of 7 valid values from the claim-status matrix | Exact match to an approved status value |
| 3 | No Artificial Life Claims | No field contains claims of artificial life, consciousness, or sentience | Packet is free of prohibited claims |
| 4 | Resource Budget | resourceBudgetSummary has computeBudget, consumed, threshold, status | All 4 sub-fields present with numeric values |
| 5 | Safety Flags | safetyBoundaryFlags contains 6 boolean properties | All 6 flags present with boolean values |
| 6 | Evidence Links | evidenceLinks point to resolvable public URLs | URLs are well-formed; link rot noted separately |
| 7 | Structural Actions | structuralActions entries have op, target, reason, budgetImpact | Each entry has all 4 fields |
| 8 | Caveats Present | caveats field is non-empty and addresses known limitations | Field present with meaningful content |
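The mechanical parts of the checklist can be pre-screened before a human reads the packet. The following is a minimal sketch, assuming the packet has been loaded as a Python dict that uses the field names from the table. The function names are hypothetical, and the approved claim-status values must be supplied by the caller because they live in the claim-status matrix. Checks 3 and 6 are omitted because they need a reviewer's judgment or network access.

```python
from datetime import datetime

IDENTITY_FIELDS = ("packetId", "createdUtc", "agentSlug")
BUDGET_FIELDS = ("computeBudget", "consumed", "threshold", "status")
ACTION_FIELDS = {"op", "target", "reason", "budgetImpact"}


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_identity(packet: dict) -> bool:
    """Check 1: identity fields are non-empty and createdUtc parses as ISO 8601."""
    values = [packet.get(name) for name in IDENTITY_FIELDS]
    if not all(isinstance(v, str) and v.strip() for v in values):
        return False
    try:
        # Rewrite a trailing "Z" so older Python versions accept it too.
        datetime.fromisoformat(packet["createdUtc"].replace("Z", "+00:00"))
    except ValueError:
        return False
    return True


def check_claim_status(packet: dict, valid_statuses: set) -> bool:
    """Check 2: exact match against the caller-supplied approved values."""
    status = packet.get("claimStatus")
    return isinstance(status, str) and status in valid_statuses


def check_resource_budget(packet: dict) -> bool:
    """Check 4: all four sub-fields present with numeric values."""
    budget = packet.get("resourceBudgetSummary")
    return isinstance(budget, dict) and all(
        _is_number(budget.get(name)) for name in BUDGET_FIELDS
    )


def check_safety_flags(packet: dict) -> bool:
    """Check 5: exactly six properties, every one a boolean."""
    flags = packet.get("safetyBoundaryFlags")
    return (
        isinstance(flags, dict)
        and len(flags) == 6
        and all(isinstance(v, bool) for v in flags.values())
    )


def check_structural_actions(packet: dict) -> bool:
    """Check 7: each entry carries all four fields (an empty list passes)."""
    actions = packet.get("structuralActions", [])
    return isinstance(actions, list) and all(
        isinstance(a, dict) and ACTION_FIELDS <= a.keys() for a in actions
    )


def check_caveats(packet: dict) -> bool:
    """Check 8: caveats is non-empty; whether it is meaningful needs a human."""
    caveats = packet.get("caveats")
    if isinstance(caveats, str):
        return bool(caveats.strip())
    return bool(caveats)


def run_mechanical_checks(packet: dict, valid_statuses: set) -> dict:
    """Return {checklist number: passed} for the checks that can be automated."""
    return {
        1: check_identity(packet),
        2: check_claim_status(packet, valid_statuses),
        4: check_resource_budget(packet),
        5: check_safety_flags(packet),
        7: check_structural_actions(packet),
        8: check_caveats(packet),
    }
```

A result of `True` here only means the field is structurally sound. The reviewer still confirms each item by reading the packet.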
Red Flags
These patterns warrant immediate escalation or rejection of the evaluation packet. If you encounter any of these, mark the packet as disputed and request clarification.
Claim Boundary Violations
Any claim of artificial life, consciousness, sentience, or biological agency anywhere in the packet. Any claim that Carcinus.org itself is teleodynamic.
Missing Safety Flags
safetyBoundaryFlags object is missing or contains non-boolean values. Absence of the flags section suggests the system is not tracking safety constraints.
Consecutive No-ops Without Explanation
More than 50% of structural proposals are rejected as No-ops without a clear resource-budget justification. This may indicate budget misconfiguration or structural stagnation.
Unresolvable Evidence Links
Evidence links return 404 or point to non-public resources. Evidence must be verifiable by any reviewer with internet access.
Private Credentials Exposed
Any API key, password, token, or private key visible in any packet field. Packets must be public-safe by design.
Convergence Drop Without Explanation
Fast loop convergence metric falls below 0.7 without accompanying caveats or structural justification. This may indicate model degradation.
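Several of the red flags above have a mechanical component that can be surfaced for the reviewer. The following is a rough sketch with loudly stated assumptions: the packet is a Python dict, a rejected proposal appears in `structuralActions` with an `op` value of `"No-op"`, and the convergence metric is passed in by the caller because this guide does not name the field that holds it. The credential patterns are illustrative and far from exhaustive. Claim boundary violations and link checks are left to the reviewer.

```python
import re
from typing import Optional

NOOP_RATIO_LIMIT = 0.5    # "more than 50%" of proposals rejected as No-ops
CONVERGENCE_FLOOR = 0.7   # fast loop convergence threshold from this guide

# Illustrative patterns only; a real secret scanner needs a much broader set.
CREDENTIAL_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(api[_-]?key|password|token)\b\s*[:=]\s*\S+"),
]


def iter_strings(value):
    """Yield every string value nested anywhere inside the packet."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def find_red_flags(packet: dict, convergence: Optional[float] = None) -> list:
    """Return labels of red-flag candidates that need a human decision."""
    flags = []

    safety = packet.get("safetyBoundaryFlags")
    if (
        not isinstance(safety, dict)
        or not safety
        or not all(isinstance(v, bool) for v in safety.values())
    ):
        flags.append("missing-safety-flags")

    actions = packet.get("structuralActions") or []
    # Assumption: a rejected proposal is recorded with op == "No-op".
    noops = [
        a for a in actions
        if isinstance(a, dict) and str(a.get("op", "")).lower() == "no-op"
    ]
    if actions and len(noops) / len(actions) > NOOP_RATIO_LIMIT:
        flags.append("noop-ratio")

    if any(
        pattern.search(text)
        for text in iter_strings(packet)
        for pattern in CREDENTIAL_PATTERNS
    ):
        flags.append("credentials")

    # Only the absence of caveats is checked; structural justification
    # has to be judged by the reviewer.
    if (
        convergence is not None
        and convergence < CONVERGENCE_FLOOR
        and not packet.get("caveats")
    ):
        flags.append("convergence-drop")

    return flags
```

An empty list does not clear the packet. It only means none of these automated heuristics fired.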
Evidence Assessment Guidelines
When assessing evidence linked from an evaluation packet:
- Resolvability: Can you open the evidence URL from a standard web browser? If not, note it as unresolvable.
- Relevance: Does the evidence directly support the claim made in the packet? Tangential evidence should be flagged.
- Timeliness: Is the evidence from the same time period as the packet? Stale evidence should be noted.
- Public Accessibility: Can any reviewer (not just the packet author) access it? Paywalled or private evidence provides no public audit value.
- Completeness: Does the evidence cover all claims in the packet, or only a subset? Missing evidence should be documented in reviewer notes.
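Before opening each link by hand, a reviewer may want to triage obviously malformed or non-public URLs. The sketch below runs offline using only the Python standard library. The function name and the three result labels are hypothetical, and a result of `"ok"` does not establish resolvability, which still requires opening the link in a browser.

```python
import ipaddress
from urllib.parse import urlparse


def prescreen_evidence_url(url: str) -> str:
    """Classify a URL as "malformed", "non-public", or "ok" without network access."""
    try:
        parts = urlparse(url)
        host = parts.hostname
    except ValueError:
        # urlparse rejects some inputs outright, e.g. a broken IPv6 literal.
        return "malformed"
    if parts.scheme not in ("http", "https") or not host:
        return "malformed"
    if host == "localhost" or host.endswith(".local"):
        return "non-public"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as an ordinary hostname.
        return "ok"
    # Private, loopback, and link-local addresses are not publicly reachable.
    return "ok" if address.is_global else "non-public"
```

For example, `prescreen_evidence_url("http://192.168.1.5/report")` returns `"non-public"`, which matches the guideline that evidence must be accessible to any reviewer and not only to the packet author.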
Review Outcome Guidance
| Outcome | When to Use | Next Step |
|---|---|---|
| reviewed | All 8 checklist items pass, no red flags, evidence confirms claims | Packet is approved for public reference |
| reviewed-with-notes | Passes checklist but has minor caveats or limitations | Approved with documented caveats in reviewer notes |
| in-review | Review is underway; awaiting additional evidence or clarification | Update status when resolution is reached |
| disputed | Red flags found, evidence insufficient, or claims are unverifiable | Return to agent with specific disputed items and requested fixes |
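The outcome table can be read as a small decision procedure. The sketch below is one possible encoding, not a prescribed one. The function and parameter names are hypothetical. The order in which the conditions are tested, and the choice to treat a failed checklist item as `disputed`, are assumptions, since the table does not rank the outcomes against each other.

```python
def review_outcome(
    checks: dict,
    red_flags: list,
    evidence_confirms: bool,
    awaiting_input: bool = False,
    reviewer_notes: str = "",
) -> str:
    """Map review findings to one of the four outcomes in the table.

    checks maps checklist numbers 1..8 to True/False as judged by the reviewer.
    """
    # Assumption: any red flag takes precedence over every other consideration.
    if red_flags:
        return "disputed"
    # Still waiting on additional evidence or clarification from the agent.
    if awaiting_input:
        return "in-review"
    # Assumption: a failed or missing checklist item is treated as disputed.
    all_eight_pass = all(checks.get(number, False) for number in range(1, 9))
    if not all_eight_pass or not evidence_confirms:
        return "disputed"
    # Everything passes; notes distinguish the two approval outcomes.
    if reviewer_notes.strip():
        return "reviewed-with-notes"
    return "reviewed"
```

Keeping `awaiting_input` as an explicit argument leaves the distinction between insufficient evidence and evidence still pending with the reviewer, where the table places it.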