~/shifat santo/systems ai engineer

AI agents that
behave less like demos.

I build the unglamorous parts that make AI agents actually work: memory, tool use, evaluation, and a recovery path for when they confidently do something dumb. Fewer demos, more software that survives contact with real users.

Selected Work

Systems that remember, route, evaluate, and recover.

A few systems built around AI: voice, agents, language models, applied cryptography, and the MCP tooling that ties it together. Each one ships, runs, or has a paper attached. 18 projects in total.

voice agent · production

Soniq

Voice agents that book the table, not the meeting after. Sub-second turn latency, multi-tenant by default.

live call · #4a2f

3:42 · tenant · Lucas Bistro

user: book a table for two tomorrow 7:30

agent: 7:30 is open. shall I confirm?

user: yes please

voice to voice: ~450ms

pipeline

STT 150ms
LLM 200ms
TOOL 60ms
TTS 40ms
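
The stage timings in the pipeline readout add up to the ~450ms voice-to-voice figure. A minimal sketch of that budget check, assuming the four stages run strictly one after another (the page does not say whether any of them stream or overlap). The names `STAGE_MS`, `total_latency_ms`, and `slowest_stage` are illustrative and not taken from Soniq.

```python
# Per-stage latencies in milliseconds, copied from the pipeline readout.
STAGE_MS = {"STT": 150, "LLM": 200, "TOOL": 60, "TTS": 40}


def total_latency_ms(stages: dict[str, int]) -> int:
    """Sum stage latencies, assuming strictly sequential execution."""
    return sum(stages.values())


def slowest_stage(stages: dict[str, int]) -> str:
    """Return the name of the stage that takes the largest share of the budget."""
    return max(stages, key=stages.get)


if __name__ == "__main__":
    print(total_latency_ms(STAGE_MS))  # 450
    print(slowest_stage(STAGE_MS))     # LLM
```

Under that sequential assumption the LLM step is the largest single share of the turn, so it is the first place to look when the budget slips.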

sentiment: positive

Deepgram·Cartesia·LiveKit·7 industry presets

306M-param Bengali LLM from scratch

Kotha-1

Pretrained Bengali decoder LM, end to end.

Python · PyTorch · Accelerate · SentencePiece

params: 306M

layers

tokenizer: 32k BPE

Operational MCP toolbelt · private

smart-mcps

Operational tools agents can call safely.

TypeScript · MCP · Zod · OAuth

tests: 1,900+

packages

tools: 60+

Professor-intelligence MCP server · live APIs

ProfGraph

Professor ratings, teaching style, and grade odds, as LLM tools.

Python · MCP · GraphQL · Starlette

tools

tests

apis: 2 live

Autonomous research pipeline · LangGraph

Research-Agent

A research graph that plans, searches, synthesizes, and recovers.

Python · LangGraph · vLLM · ArXiv

tests: 269

nodes

backends: vLLM/Groq

Encrypted capture & evidence vault · private

Blackbox

Always-on, offline, encrypted, tamper-evident audio capture.

Python · age/X25519 · Shamir · OpenTimestamps

tests: 500+

crypto: X25519/age

3.11/3.12

See all 18 projects

18 systems · Dallas, TX · 2026

Systems I Build

Calm surfaces.
Serious machinery.

The pattern repeats across the work: context enters, tools act, results get measured, and useful state survives the session.

Memory layers

Local files, notes, screenshots, and decisions become retrievable state instead of repeated context tax.

Agent infrastructure

MCP servers, confirmation paths, reversible actions, auth, tests, and deploy hooks built for real use.

Voice systems

Low-latency orchestration across speech, tools, routing, monitoring, and response generation.

Language systems

Tokenizer behavior, data cleaning, corpus quality, and model training treated as systems problems.

Evaluation pipelines

Reproducible loops for measuring behavior, comparing changes, and making results reviewable.

Workflow infrastructure

Loops for routing, measuring, retrying, keeping, discarding, and leaving enough trace to debug the system later.
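
One way to picture such a loop, as a minimal sketch rather than code from any project listed here: a step is retried a bounded number of times, each result is either kept or discarded by an acceptance check, and every attempt is recorded so the run can be debugged afterwards. All names (`Attempt`, `RunTrace`, `run_with_retries`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Attempt:
    number: int   # 1-based attempt counter
    ok: bool      # True only if the result was kept
    detail: str   # "kept", "discarded", or the error message


@dataclass
class RunTrace:
    attempts: list[Attempt] = field(default_factory=list)


def run_with_retries(
    step: Callable[[], str],
    accept: Callable[[str], bool],
    max_attempts: int = 3,
) -> tuple[Optional[str], RunTrace]:
    """Run `step` until `accept` keeps a result or attempts run out."""
    trace = RunTrace()
    for number in range(1, max_attempts + 1):
        try:
            result = step()
        except Exception as exc:  # record the failure, then retry
            trace.attempts.append(Attempt(number, False, f"error: {exc}"))
            continue
        if accept(result):
            trace.attempts.append(Attempt(number, True, "kept"))
            return result, trace
        trace.attempts.append(Attempt(number, False, "discarded"))
    return None, trace  # nothing acceptable; the trace explains why
```

For example, a step that returns an empty string on its first call and a real answer on its second would produce a trace with one discarded attempt followed by one kept attempt.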

Local-first tools

Personal tools that keep useful context close to the machine, the filesystem, and the work.

About

I learn the black box by rebuilding the part that bothers me.

I'm a CS student at UT Dallas who got tired of AI demos that fall apart the second you poke them. So I started rebuilding the boring layers underneath (memory, retrieval, tool execution, evaluation) until the agent behaved less like a party trick and more like something you'd actually hand a real task to.

Most of what I build is the stuff nobody screenshots: retry logic, auth, the test suite that catches an agent doing something confidently wrong at 2am. If it ships, runs, or has a paper attached, it's somewhere in the work above. The ones that didn't work out, well, I learned more from those anyway.

Contact

Let's build something that survives Monday.

Backend, platform, agent infrastructure, applied AI. If the problem is technical enough to be interesting, I'm in. Internship, full-time, or just a good argument about why your agent keeps hallucinating. Inbox's open.

shifat@shifatsanto.com

Email GitHub LinkedIn Resume
