Blog

Technical writing on ML inference, agent AI, system design, and lessons from building at scale.

·16 min read

AI Agents as Universal Task Solvers: When Verification Changes Everything

A new theoretical framework shows how AI agents can become universal problem solvers — but only in domains where solutions can be quickly verified. Here's what that means.

ai, research, reasoning, scaling-laws, theory, agents
·11 min read

Multi-Agent LLM Systems: What the Research Actually Shows

Synthesizing findings from four recent papers on multi-agent LLM systems. The takeaway: more agents rarely means better results. Distributed systems theory, failure taxonomies, and Amdahl's Law explain why — and point to when multi-agent architectures actually make sense.

ai-agents, llms, distributed-systems, multi-agent, research
·10 min read

Stateful Stream Processing Without the Infrastructure Overhead

Velo is a Python library with a Rust core for processing short-lived stateful streams. It fills the gap between stateless serverless functions and always-on stream engines like Flink — based on the stream functions concept from arXiv:2603.03089.

streaming, rust, python, systems, open-source
·8 min read

Diagnosing Attention Sinks in Transformer LLMs

An experiment implementing tools to measure attention sinks and activation spikes in transformer models — based on the ICML 2026 paper by Sun, Canziani, and LeCun. Plus a vLLM integration that pins sink tokens in the KV cache.

transformers, llms, ml-research, experiments, open-source
·9 min read

Implementing CHLU: From Paper to Prototype in One Night

A hands-on experiment implementing the Causal Hamiltonian Learning Unit from an ICLR 2026 paper — a physics-inspired neural network layer that uses Hamiltonian dynamics for stable long-horizon prediction. Here's what worked, what broke, and what we learned.

ml-research, physics, neural-networks, experiments, pytorch
·10 min read

Building a Multi-Agent Chatbot with LangGraph and MCP

How I built HiveChat, a multi-LLM agent chatbot using LangGraph's ReAct pattern with 7 Model Context Protocol tools for RAG, code generation, image understanding, and more.

ai-agents, llms, langgraph, mcp, multi-agent, architecture
·10 min read

From PDF to Podcast: Building PodForge

How I built a distributed system that converts PDFs into engaging podcast-style audio using LLMs, real-time WebSocket streaming, and a 4-microservice architecture.

distributed-systems, llm, audio, open-source, podforge, architecture
·8 min read

Teaching an AI Agent to Write Songs: Inside SongSmith

Building an autonomous AI agent that creates complete songs from text descriptions. Learn how we combined LLMs, neural vocoders, and beat generation to create a conversational music composer.

ai-agents, music-generation, llms, orchestration, architecture
·8 min read

Building Vaani — A Programming Language Where You Code in Hindi

I built a complete programming language with Hindi keywords, Devanagari variable names, classes, exception handling, and a Sudoku solver. Here's why and how.

programming-languages, hindi, open-source, vaani
·3 min read

Hello World — Why I'm Starting This Blog

After a decade of building ML systems, agent AI, and creative tools at Amazon, it's time to start writing about it.

meta, introduction