System Architecture

How Kalibr works under the hood.


Components

┌─────────────────────────────────────────────────────────────┐
│                        Your Application                      │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │ Python SDK  │  │   TS SDK    │  │ Framework Integs    │  │
│  │  (kalibr)   │  │(@kalibr/sdk)│  │ (LangChain/CrewAI)  │  │
│  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘  │
└─────────┼────────────────┼───────────────────┼──────────────┘
          │                │                   │
          └────────────────┼───────────────────┘
                           │ HTTPS (NDJSON)
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                    Kalibr Backend                            │
│              api.kalibr.systems:443                          │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  /api/ingest  │  /api/otel/*  │  /api/intelligence  │    │
│  └─────────────────────────────────────────────────────┘    │
│                           │                                  │
│                           ▼                                  │
│  ┌─────────────────────────────────────────────────────┐    │
│  │                   ClickHouse                         │    │
│  │              (traces, outcomes tables)               │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
                           │
                           │ Aggregation
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                Intelligence Service                          │
│           kalibr-intelligence.fly.dev                        │
│  ┌─────────────────────────────────────────────────────┐    │
│  │   Pattern Engine  │  Recommender  │  Wilson Scoring │    │
│  └─────────────────────────────────────────────────────┘    │
│                           │                                  │
│                           ▼                                  │
│  ┌──────────────┐  ┌──────────────┐                         │
│  │    Redis     │  │  ClickHouse  │                         │
│  │   (cache)    │  │  (patterns)  │                         │
│  └──────────────┘  └──────────────┘                         │
└─────────────────────────────────────────────────────────────┘

Data Flow

  1. SDK captures trace — LLM call metadata (model, tokens, cost, latency)
  2. NDJSON POST to /api/ingest — Batched events sent to backend
  3. Backend validates & enriches — Adds pricing, validates schema
  4. ClickHouse storage — Inserted into traces table
  5. Pattern aggregation — Every 5 minutes, patterns computed
  6. Intelligence queries — get_policy() queries patterns + outcomes
  7. Dashboard display — React frontend queries /api/otel/*
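
Step 2 above sends batched events as NDJSON, meaning one JSON object per line. The following is a minimal sketch of that serialization only. The event field names are assumptions chosen for illustration, not the confirmed ingest schema, and the HTTP POST itself is left out.

```python
import json

def to_ndjson(events):
    """Serialize a batch of event dicts as NDJSON: one compact JSON object per line."""
    return "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)

def from_ndjson(payload):
    """Parse an NDJSON payload back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]

# Hypothetical events; the field names are assumptions for illustration only.
batch = [
    {"trace_id": "t1", "model_id": "model-a", "input_tokens": 120, "output_tokens": 40},
    {"trace_id": "t2", "model_id": "model-b", "input_tokens": 300, "output_tokens": 95},
]
payload = to_ndjson(batch)
```

A client would then send `payload` as the request body of a POST to /api/ingest. Because each line is an independent JSON document, a receiver can validate and reject individual events without discarding the whole batch.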

SDKs

Python SDK (kalibr v1.2.0)

Auto-instrumentation: OpenAI, Anthropic, Google SDKs
OpenTelemetry: OTLP export + local JSONL fallback
Intelligence: get_policy(), report_outcome()
Dependencies: httpx, opentelemetry, tiktoken

TypeScript SDK (@kalibr/sdk v1.0.0)

Pattern: SpanBuilder (manual span creation)
Dependencies: Zero (uses native fetch)
Runtimes: Node.js 18+, Edge, Bun
Formats: CJS + ESM dual build

Backend

Stack: FastAPI, ClickHouse (native protocol), Clerk (auth)

Key Routes

Route                 Purpose
/api/ingest           Event ingestion (NDJSON/JSON)
/api/otel/spans       Query spans with filters
/api/otel/metrics     Aggregated metrics
/api/intelligence/*   Proxy to intelligence service
/api/capsules/*       Cross-service trace propagation
/api/runtimes/*       Runtime registry

Background Jobs

Daily aggregation: 00:15 UTC (compute daily summaries)
Daily export: 00:30 UTC (export to JSONL)

Intelligence Service

Deployed: kalibr-intelligence.fly.dev (separate microservice)

Features

  • Pattern Engine — Aggregates traces into model performance patterns
  • Recommender — Wilson scoring for statistical confidence
  • Outcome Tracking — Stores success/failure outcomes
  • Pareto Frontier — Identifies optimal cost/quality tradeoffs
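
The two statistical ideas named above can be sketched in a few lines. This is an illustration using the standard Wilson score interval and a plain dominance check, not the service's actual implementation. The `cost` and `quality` keys and the sample models are hypothetical.

```python
import math

def wilson_lower_bound(successes, total, z=1.96):
    """Lower bound of the Wilson score interval for a success rate (z=1.96 is ~95%)."""
    if total == 0:
        return 0.0
    p = successes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom

def pareto_frontier(candidates):
    """Keep candidates that no other candidate beats on both cost and quality."""
    frontier = []
    for c in candidates:
        dominated = any(
            o["cost"] <= c["cost"] and o["quality"] >= c["quality"]
            and (o["cost"] < c["cost"] or o["quality"] > c["quality"])
            for o in candidates
        )
        if not dominated:
            frontier.append(c)
    return frontier

# Hypothetical models: "c" costs more than "a" and scores lower, so it is dominated.
models = [
    {"name": "a", "cost": 1.0, "quality": 0.90},
    {"name": "b", "cost": 0.5, "quality": 0.80},
    {"name": "c", "cost": 1.2, "quality": 0.85},
]
frontier_names = [m["name"] for m in pareto_frontier(models)]
```

The Wilson lower bound rewards sample size: a model with 100 successes out of 100 scores higher than one with 10 out of 10, even though both have a raw success rate of 100%. That makes it a safer ranking signal than the raw rate when some models have few recorded outcomes.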

API Routes

Base: /api/v1/intelligence

POST /policy               Goal-based model recommendation
POST /recommend            Task-based model recommendation
POST /report-outcome       Record outcome feedback
GET /patterns/{task_type}  Get aggregated patterns
POST /aggregate            Trigger manual aggregation

Storage

ClickHouse

Column-oriented analytical database, used here as the time-series store for traces. Connections use the native protocol on port 9000.

traces table

CREATE TABLE kalibr.traces (
  event_date Date,
  trace_id String,
  span_id String,
  parent_span_id String,
  tenant_id String,
  ts_start DateTime64(3),
  ts_end DateTime64(3),
  duration_ms UInt32,
  provider String,
  model_id String,
  operation String,
  input_tokens UInt32,
  output_tokens UInt32,
  cost_est_usd Float64,
  status String,
  error_type String,
  error_message String,
  ...
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (tenant_id, ts_start, trace_id)
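
To make the derived columns and the table layout concrete, here is a small sketch that builds a row-shaped dict and mirrors the PARTITION BY and ORDER BY expressions in Python. It assumes that event_date is taken from ts_start in UTC and that duration_ms is the whole-millisecond difference between ts_end and ts_start; the schema above does not confirm either. The helper names and sample values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def build_trace_row(trace_id, span_id, tenant_id, ts_start, ts_end, **extra):
    """Build a dict shaped like a kalibr.traces row, deriving the computed columns."""
    row = {
        "event_date": ts_start.date(),  # assumption: date of the span start
        "trace_id": trace_id,
        "span_id": span_id,
        "tenant_id": tenant_id,
        "ts_start": ts_start,
        "ts_end": ts_end,
        # Whole milliseconds between start and end (DateTime64(3) precision).
        "duration_ms": (ts_end - ts_start) // timedelta(milliseconds=1),
    }
    row.update(extra)
    return row

def partition_id(row):
    """Mirror toYYYYMM(event_date): 2024-03-15 maps to 202403."""
    d = row["event_date"]
    return d.year * 100 + d.month

def sort_key(row):
    """Mirror ORDER BY (tenant_id, ts_start, trace_id)."""
    return (row["tenant_id"], row["ts_start"], row["trace_id"])

start = datetime(2024, 3, 15, 12, 0, 0, tzinfo=timezone.utc)
row = build_trace_row(
    "t1", "s1", "acme", start, start + timedelta(milliseconds=850),
    provider="openai", model_id="model-a",  # illustrative values
)
```

Putting tenant_id first in the sort key keeps each tenant's rows physically adjacent within a monthly partition, which suits queries that always filter by tenant and then by time range.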

outcomes table

CREATE TABLE kalibr.outcomes (
  trace_id String,
  tenant_id String,
  goal String,
  success UInt8,
  score Float64,
  failure_reason String,
  metadata String,
  created_at DateTime DEFAULT now()
) ENGINE = MergeTree()
ORDER BY (tenant_id, goal, created_at)
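
The outcomes table links back to traces through trace_id. As a sketch of how per-model success counts could be derived from the two tables, the following joins them in memory for a single tenant. It is an illustration, not the service's aggregation query. It assumes one model per trace, which may not hold for multi-span traces, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict

def success_counts(traces, outcomes, tenant_id):
    """Return {(goal, model_id): (successes, total)} for one tenant, joined on trace_id."""
    # Assumption: each trace_id maps to a single model_id.
    model_by_trace = {
        t["trace_id"]: t["model_id"] for t in traces if t["tenant_id"] == tenant_id
    }
    counts = defaultdict(lambda: [0, 0])
    for o in outcomes:
        if o["tenant_id"] != tenant_id:
            continue  # tenant isolation: ignore other tenants' outcomes
        model_id = model_by_trace.get(o["trace_id"])
        if model_id is None:
            continue  # outcome with no matching trace
        key = (o["goal"], model_id)
        counts[key][0] += o["success"]  # success is 0 or 1 (UInt8)
        counts[key][1] += 1
    return {k: tuple(v) for k, v in counts.items()}

# Hypothetical sample rows using only column names from the schemas above.
traces = [
    {"trace_id": "t1", "tenant_id": "acme", "model_id": "model-a"},
    {"trace_id": "t2", "tenant_id": "acme", "model_id": "model-a"},
    {"trace_id": "t3", "tenant_id": "acme", "model_id": "model-b"},
    {"trace_id": "t4", "tenant_id": "other", "model_id": "model-a"},
]
outcomes = [
    {"trace_id": "t1", "tenant_id": "acme", "goal": "summarize", "success": 1},
    {"trace_id": "t2", "tenant_id": "acme", "goal": "summarize", "success": 0},
    {"trace_id": "t3", "tenant_id": "acme", "goal": "summarize", "success": 1},
    {"trace_id": "t4", "tenant_id": "other", "goal": "summarize", "success": 1},
]
result = success_counts(traces, outcomes, "acme")
```

Counts in this (successes, total) form are what a confidence-based ranking such as Wilson scoring takes as input.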

Deployment

Cloud (Managed)

Backend: Fly.io
Intelligence: Fly.io (separate app)
ClickHouse: ClickHouse Cloud
Frontend: Vercel
Auth: Clerk

Security

  • TLS 1.3 — All connections encrypted
  • API Keys — HMAC-verified, stored hashed
  • Tenant Isolation — All queries filtered by tenant_id
  • Clerk SSO — Dashboard authentication
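
The list above does not spell out the API key scheme, so the following is only one plausible reading of "HMAC-verified, stored hashed". In this sketch the key carries an HMAC signature over its own identifier, and the server stores only a SHA-256 hash of the full key. The key format, the `SERVER_SECRET` constant, and both function names are assumptions for illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical signing secret; a real deployment would load this from secure config.
SERVER_SECRET = b"demo-secret-not-for-production"

def issue_key():
    """Create an API key of the form '<id>.<hmac>' plus the hash that would be stored."""
    key_id = secrets.token_hex(16)
    sig = hmac.new(SERVER_SECRET, key_id.encode(), hashlib.sha256).hexdigest()
    api_key = f"{key_id}.{sig}"
    stored_hash = hashlib.sha256(api_key.encode()).hexdigest()
    return api_key, stored_hash

def verify_key(api_key, stored_hash):
    """Check the embedded HMAC first, then compare against the stored hash."""
    key_id, _, sig = api_key.partition(".")
    expected = hmac.new(SERVER_SECRET, key_id.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # malformed or forged key, rejected without a database lookup
    candidate = hashlib.sha256(api_key.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)

api_key, stored_hash = issue_key()
```

Under this design the plaintext key never needs to be persisted, and `hmac.compare_digest` keeps both comparisons constant-time.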
