Rigorous Work with Fallible AI
A practical essay on building production systems when the model is useful, uneven, and impossible to treat like normal software.
At Airbnb, I build evaluation frameworks and tools that help engineering, legal, policy, and support teams work together on high-risk AI systems: risk taxonomies, technical reviews for high-stakes launches, and tools that help non-engineers engage with ML development. I focus on encoding scientific rigor into workflows so teams can move faster without hand-waving. I also run training programs for AI coding tools.
For the past year I've been deep in AI-assisted development: at work, after work, and during the hours I should be sleeping. Career-wise, I've always been closer to algorithms and math than to product development, bouncing between engineer and data scientist titles since my analog circuit design days. Being able to test ideas and build them into reality while skipping the parts I never wanted to learn (looking at you, TypeScript) has been honestly addictive.
I push limits in personal projects, then bring the useful patterns back into a much more rigorous work environment. The change has been a dream come true: the barriers to building whatever I can imagine have dropped dramatically. Before 2025 I'd never built a website. Now I can ship a full agentic AI product to a client on the side, just for fun.
A support platform where AI drafts, experts refine, and every approved answer becomes better-structured institutional memory.
GPS analysis for foil sessions: VMG, upwind angles, wind estimation, and gear comparisons grounded in data instead of vibes.
Structured journaling as an operating system for memory: daily notes, weekly reviews, decisions, and pattern recognition over time.