Sahil Malik

Sr. Software Development Engineer

Summary

Sr. Software Engineer with ~10 years at Amazon across 7 teams. Founding member of an ML inference platform serving all of Alexa's voice traffic with 30+ models. Co-inventor of US Patent 12,494,194 B1 for asynchronous ML inference architecture. Currently building agent AI systems and LLM-based products. Creator of vaani (Hindi programming language) and a 10+ service AI research lab on NVIDIA DGX hardware.

Technical Skills

ML & AI

ML Inference Platforms · LLM Systems & Optimization · Agent AI / RAG · NLU / NLP · Model Serving & Deployment · Multi-Agent Orchestration · LangChain / LangGraph

Languages

Python · TypeScript · Rust · Java · C++ · Bash

Infrastructure

AWS (ECS, CDK, Lambda, S3, DynamoDB) · Docker / Kubernetes · NVIDIA DGX / CUDA · CI/CD & Infrastructure as Code · Redis / PostgreSQL / Vector DBs

System Design

High-Availability Service Architecture · Microservices Decomposition · High-Throughput Systems · Observability (OpenTelemetry / Jaeger) · Cross-Team Technical Alignment

Patent

US Patent 12,494,194 B1

“Machine learning model architecture for incremental asynchronous inference” — Amazon Technologies Inc, granted Dec 2025. 20 claims, active until 2044.

Experience

Sr. Software Development Engineer

2025–Present · Seattle, WA

Agent AI systems, LLM latency optimization, and developer tooling.

  • Reduced latency significantly in LLM-based chat through parallelization and caching
  • Driving migration to modern agent frameworks
  • Built developer tooling for AI-assisted debugging with MCP integration

Sr. Software Development Engineer — Alexa+

2024–2025 · Seattle, WA

LLM service architecture, agent orchestration, and cross-org technical leadership.

  • Demonstrated sub-2-second Alexa response latency in cross-team demos
  • Led technical support team for Alexa+ public launch event
  • Authored monolith-to-microservices architecture proposal reviewed by senior leadership
  • Drove cross-organization alignment across 6+ teams

Sr. Software Development Engineer

2023 · Seattle, WA

Self-learning arbitration systems and containerized microservices.

  • Built self-learning arbitration systems
  • Subject matter expert for containerized microservices and cloud infrastructure

Software Development Engineer II — Alexa AI

2019–2023 · Seattle, WA

Founding member of an ML inference platform team. Grew the platform from serving a small fraction to 100% of Alexa voice traffic.

  • Delivered a production Tier-1 ML inference service hosting 30+ models
  • Built an NLU disambiguation feature serving millions of customers monthly
  • Expanded ML-based routing to 11 locales across 8 languages
  • Co-invented US Patent 12,494,194 B1 for ML inference architecture

Software Development Engineer II — Payments

2018–2019 · Hyderabad, India

Financial products platform development.

  • Owned a customer differentiation module end to end
  • Prototyped NoSQL migration for millions of customer records

Software Development Engineer I — Gift Cards

2016–2018 · Hyderabad, India

Gift card services, security features, and API development.

  • Built refund processing system impacting tens of thousands of customers
  • Designed and implemented security features for the gift card claim flow
  • Developed 5 APIs with >95% test coverage

Software Development Engineer Intern

2015 · Hyderabad, India

Summer internship, converted to full-time offer.

Education

B.Tech Computer Engineering

NIT Kurukshetra · 2013–2017 · CGPA: 8.99/10.0
