Kalibr uses 12 goal types for classification and routing. Input type, output type, and cognitive load determine the goal_id. The goal_id in turn determines the default path order and the success contract.

| goal_id | Input to Output | Load | Default path order | Success contract |
|---|---|---|---|---|
| web_scraping | URL to rows | low | DeepSeek, Llama, Mixtral, gpt-4o-mini | field_completeness >= 0.8, min 1 row |
| data_enrichment | rows to rows | low | DeepSeek, Llama, Qwen, gpt-4o-mini | null_rate_after < null_rate_before |
| lead_scoring | text to score | low | DeepSeek, Llama, Mixtral, gpt-4o-mini | score numeric, in [0, 100] |
| classification | text to label | low | DeepSeek, Llama, Qwen, gpt-4o-mini | label in allowed_labels |
| summarization | text to prose | low | DeepSeek, Llama, Mixtral, claude-haiku | compression ratio 0.05 to 0.4 |
| data_pipeline | data to rows | low | DeepSeek, Llama, Qwen, gpt-4o-mini | rows_out > 0, no exception |
| research | text to synthesis | medium | Llama, DeepSeek, deepseek-r1, claude-sonnet | structural: min 200 chars, no error markers + float judge 20% |
| outreach_generation | rows to content | medium | Llama, DeepSeek, Mixtral, claude-sonnet | structural: subject + body present, 50-2000 chars + float judge 20% |
| code_generation | any to code | high | Sonnet, GPT-4o, o3-mini, deepseek-r1 | AST parse passes or tests_pass = True |
| code_review | code to prose | high | Sonnet, GPT-4o, deepseek-r1, o3-mini | min 50 chars of structured feedback |
| system_design | any to prose | high | Sonnet, deepseek-r1, GPT-4o, o3-mini | min 200 chars of structured output |
| agent_orchestration | multi to coordinates | high | Sonnet, GPT-4o, deepseek-r1, o3-mini | subtasks_completed = True, no timeout |
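
The success contracts in the table are simple enough to express as predicate functions. Below is a minimal Python sketch of four of them. The function names are hypothetical, not part of Kalibr, and the code makes two assumptions the table does not state: the compression ratio is measured as summary length divided by source length in characters, and the range bounds are inclusive.

```python
from numbers import Real


def check_lead_scoring(score) -> bool:
    """lead_scoring: score numeric, in [0, 100]."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, Real):
        return False
    return 0 <= score <= 100


def check_classification(label, allowed_labels) -> bool:
    """classification: label in allowed_labels."""
    return label in allowed_labels


def check_summarization(source_text: str, summary: str) -> bool:
    """summarization: compression ratio 0.05 to 0.4.

    Assumption: ratio = len(summary) / len(source_text), in characters,
    with both bounds inclusive.
    """
    if not source_text:
        return False
    ratio = len(summary) / len(source_text)
    return 0.05 <= ratio <= 0.4


def check_data_enrichment(null_rate_before: float, null_rate_after: float) -> bool:
    """data_enrichment: null_rate_after < null_rate_before."""
    return null_rate_after < null_rate_before
```

A caller could run the matching check on each model output and report the boolean result as the outcome for that path.
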

The listed order is the cold-start default: cheapest capable model first. Thompson Sampling then shifts traffic away from this order based on actual outcomes in your production environment. After roughly 50 outcomes per goal_id, the router has enough signal to route confidently, so the default order only matters during the cold-start phase.
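
To illustrate how outcomes can pull traffic away from the default order, here is a minimal Beta-Bernoulli Thompson Sampling sketch in Python. It is not Kalibr's implementation. The class name `PathRouter`, the binary success/failure outcome, and the prior that favors earlier paths in the default order are all assumptions made for illustration.

```python
import random


class PathRouter:
    """Toy Thompson Sampling router for one goal_id (hypothetical)."""

    def __init__(self, default_order, seed=None):
        self.rng = random.Random(seed)
        n = len(default_order)
        # Assumption: encode the cold-start order as a mildly optimistic
        # Beta prior, so earlier paths are favored until real outcomes arrive.
        # Each entry is [alpha, beta] = [prior successes, prior failures].
        self.stats = {
            path: [1.0 + (n - i), 1.0] for i, path in enumerate(default_order)
        }

    def choose(self):
        # Draw one sample from each path's Beta posterior and pick the max.
        samples = {
            path: self.rng.betavariate(alpha, beta)
            for path, (alpha, beta) in self.stats.items()
        }
        return max(samples, key=samples.get)

    def record(self, path, success):
        # A passed success contract increments alpha; a failure increments beta.
        if success:
            self.stats[path][0] += 1
        else:
            self.stats[path][1] += 1


router = PathRouter(["DeepSeek", "Llama", "Mixtral", "gpt-4o-mini"], seed=0)

# Simulated production outcomes: the first path keeps failing its contract,
# the second keeps passing.
for _ in range(50):
    router.record("DeepSeek", success=False)
    router.record("Llama", success=True)

picks = [router.choose() for _ in range(200)]
```

After these simulated outcomes, most draws land on the second path even though the default order listed it second. That is the sense in which the default order stops mattering once enough outcomes accumulate.
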

Conversational replies, status checks, config changes, memory operations, and simple lookups are left out of this scheme because they carry no signal worth routing.