Back to work
Cost-aware agent runtime · Python
nightshift
Sending every agent subtask to a frontier API burns budget on work a small local model could do.
View sourceWhat I built
A Python agent runtime that routes extraction / retrieval / eval tasks to local models (T5-small, a MiniLM cross-encoder, ChromaDB) behind a confidence gate, with multi-provider dispatch (OpenAI / Anthropic / Google / DeepSeek) and per-call token-cost budgeting.
Stack
PythonChromaDBTransformershttpx
Status
Runtime built; 138 passing tests.