What I'm working on
At Airbnb, I build evaluation frameworks and tools that help engineering, legal, policy, and support teams work together on high-risk AI systems. That covers LLM risk taxonomies, technical review for high-stakes launches, and tools that help non-engineers engage with ML development. I focus on encoding scientific rigor into workflows so teams can move faster without hand-waving. I also run training programs for AI coding tools.
The past year I've been deep in AI-assisted development - at work, after work, and during the hours I should be sleeping. Career-wise I've always been closer to algorithms and math than product development, bouncing between engineer and data scientist titles since my analog circuit design days. Being able to test ideas and build them into reality without the parts I never wanted to learn (looking at you, TypeScript) has been honestly addictive.
I push limits in personal projects, then apply successful patterns to the much more rigorous work environment. The change has been a dream come true: an exponential lowering of the barriers to building whatever I can imagine. Before 2025 I'd never built a website. Now I can ship a full agentic AI product to a client on the side, just for fun. Below are a few of the projects.
Rigorous Work with Fallible AI
On building production systems when your smartest tool is also your most unpredictable.
Support system case study
Building - and continuously rebuilding - a customer support platform with Claude Code and the Agent SDK as both kept getting better.
Foil Lab
Analyzes GPS tracks from foiling sessions to quantify performance across different gear and conditions. It covers VMG (velocity made good), upwind angles, and wind estimation. Data, not just feel.
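For readers new to the term, here is a minimal sketch of how upwind VMG can be computed from a single GPS fix. It is not Foil Lab's actual code: the function names, units (knots, compass degrees), and the assumption that the wind direction is already known are all illustrative.

```python
import math


def true_wind_angle(course_deg, wind_from_deg):
    """Smallest angle (0-180 degrees) between the course over ground
    and the direction the wind is blowing from."""
    diff = abs(course_deg - wind_from_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def vmg_upwind(speed_kn, course_deg, wind_from_deg):
    """Component of boat speed that points directly into the wind.
    Positive means progress upwind, negative means progress downwind."""
    angle = math.radians(true_wind_angle(course_deg, wind_from_deg))
    return speed_kn * math.cos(angle)


# Hypothetical fix: 10 knots on a course of 60 degrees, wind from due north.
print(round(vmg_upwind(10.0, 60.0, 0.0), 2))
```

Sailing closer to the wind shrinks the angle but usually costs speed, so the best upwind angle is the one that maximizes this product rather than either factor alone.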