~/shifat santo/systems ai engineer

AI agents that behave less like demos.

I build the unglamorous parts that make AI agents actually work: memory, tool use, evaluation, and a recovery path for when they confidently do something dumb. Fewer demos, more software that survives contact with real users.

Selected Work

Systems that remember, route, evaluate, and recover.

A few systems built around AI: voice, agents, language models, applied cryptography, and the MCP tooling that ties them together. Each one ships, runs, or has a paper attached. 18 projects in total.

18 systems · Dallas, TX · 2026

Systems I Build

Calm surfaces.
Serious machinery.

The pattern repeats across the work: context enters, tools act, results get measured, and useful state survives the session.

Memory layers

Local files, notes, screenshots, and decisions become retrievable state instead of repeated context tax.

Agent infrastructure

MCP servers, confirmation paths, reversible actions, auth, tests, and deploy hooks built for real use.

Voice systems

Low-latency orchestration across speech, tools, routing, monitoring, and response generation.

Language systems

Tokenizer behavior, data cleaning, corpus quality, and model training treated as systems problems.

Evaluation pipelines

Reproducible loops for measuring behavior, comparing changes, and making results reviewable.

Workflow infrastructure

Loops for routing, measuring, retrying, keeping, discarding, and leaving enough trace to debug the system later.

Local-first tools

Personal tools that keep useful context close to the machine, the filesystem, and the work.

About

I learn the black box by rebuilding the part that bothers me.

I'm a CS student at UT Dallas who got tired of AI demos that fall apart the second you poke them. So I started rebuilding the boring layers underneath (memory, retrieval, tool execution, evaluation) until the agent behaved less like a party trick and more like something you'd actually hand a real task to.

Most of what I build is the stuff nobody screenshots: retry logic, auth, the test suite that catches an agent doing something confidently wrong at 2am. If it ships, runs, or has a paper attached, it's somewhere in the work above. As for the projects that didn't work out, well, I learned more from those anyway.

Contact

Let's build something that survives Monday.

Backend, platform, agent infrastructure, applied AI. If the problem is technical enough to be interesting, I'm in. Internship, full-time, or just a good argument about why your agent keeps hallucinating. Inbox's open.

shifat@shifatsanto.com