Hi, I'm Achitya 👋

ML Engineer @ Mercedes-Benz R&D | M.Tech CS, IISc Bengaluru

📄 Resume

3.5+ years building production ML — LLM inference systems, fine-tuning pipelines, and edge-optimized models deployed across Mercedes car-lines.


Projects

| Project | What | Key Result |
|---|---|---|
| LLM Inference Engine | From-scratch inference stack in PyTorch: KV cache → paged memory → continuous batching → async serving | 8.7× VRAM reduction · 11.1× system throughput · 122 tests |
| Efficient LLM Fine-Tuning | LoRA/QLoRA adaptation → per-layer sensitivity profiling → selective QAT → AWQ INT4 export | 3.5× compression · 2.1× speedup · <0.1 pt ROUGE-L loss |
| TinyStories Transformer | Decoder-only transformer trained from scratch with optimizer & scheduler ablations | Token-based stopping · training-stability analysis |
| NN From Scratch | MLP with backpropagation implemented from scratch in NumPy, no frameworks | Forward/backward pass · gradient computation · training loop |
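The KV-cache idea behind the inference-engine project can be sketched in a few lines. This is a minimal NumPy sketch with hypothetical names, not the project's actual API: caching each step's key/value rows lets autoregressive decoding attend over the whole prefix without re-projecting it every step.

```python
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention for one query vector over cached keys/values.
    scores = (K @ q) / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

class KVCache:
    """Append-only cache: each decode step stores one new key/value row,
    so earlier positions are never re-projected."""
    def __init__(self, d_head):
        self.K = np.empty((0, d_head))
        self.V = np.empty((0, d_head))

    def append(self, k, v):
        self.K = np.vstack([self.K, k[None, :]])
        self.V = np.vstack([self.V, v[None, :]])

rng = np.random.default_rng(0)
d = 8
cache = KVCache(d)
for _ in range(4):                        # four decode steps
    cache.append(rng.normal(size=d), rng.normal(size=d))
q = rng.normal(size=d)
out = attend(q, cache.K, cache.V)         # attends over all 4 cached positions
```

Paged memory takes this one step further by storing the cache in fixed-size blocks instead of one contiguous array, which is where the VRAM savings in the table come from.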

Production Work

  • Head Orientation — Driver Monitoring System (Sep 2022 – Present)
    Replaced a legacy multi-stage pipeline with an end-to-end architecture for real-time head pose estimation.
    Reduced 2RMSE by 4.6° yaw / 2.65° pitch / 1.1° roll, meeting the production KPI (< 4.0° 2RMSE).
    Deployed across Mercedes car-lines. Currently developing a unified multi-task Transformer that consolidates head orientation, gaze, and landmarks into a single model.
  • Face Detection — Driver Monitoring System (Jan – Oct 2025)
    Designed a lightweight multi-scale detector: 69K params, 0.126 GMACs (16.4× reduction from the 2.07 GMACs baseline).
    IoU@0.5: 0.9956 vs. 0.93 baseline (+7 pt absolute) — no new data collection required.
    Adopted as a drop-in production replacement by downstream perception teams.

Core Skills

PyTorch · Transformers · KV Caching · Paged Attention · Continuous Batching · LoRA/QLoRA · Quantization (AWQ/GPTQ/QAT) · FastAPI · Mixed Precision
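
As a taste of the quantization side — a toy sketch, not AWQ or GPTQ themselves — symmetric per-tensor INT8 quantization maps the largest-magnitude weight to ±127 and stores a single float scale per tensor:

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: one scale maps max |w| to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(dequantize(q, scale) - w).max()   # worst-case round-off ≈ scale / 2
```

AWQ/GPTQ and QAT improve on this baseline by choosing scales (or training the weights) to minimize task loss rather than plain round-off error.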

Currently Exploring

  • Production RAG System (chunking, embedding, vector search, reranking, and evaluation pipeline)
  • Distributed training (FSDP, tensor parallelism, scaling laws)
  • Triton kernel development (FlashAttention, fused ops)
  • OSS contributions to vLLM / SGLang
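
For the RAG bullet, the chunking step can be sketched as fixed-size windows with overlap — hypothetical parameters, and production chunkers usually split on token or sentence boundaries instead:

```python
def chunk_text(text, size=200, overlap=50):
    # Fixed-size character windows; consecutive chunks share `overlap` chars
    # so a passage spanning a boundary still lands fully inside one chunk.
    step = size - overlap
    chunks = []
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks
```

Each chunk is then embedded, indexed in a vector store, retrieved by similarity, reranked, and evaluated end to end.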

Pinned Repositories

  1. efficient-llm-finetuning — Efficient LLM fine-tuning & deployment: LoRA, QLoRA, PTQ, and QAT, with benchmarking and config-driven pipelines.

  2. llm-inference-engine — From-scratch LLM inference engine built in PyTorch with custom GPT-2 transformers, KV cache, paged KV cache, continuous batching, and A100 benchmarks.

  3. tinystories-transformer-training — Decoder-only Transformer trained from scratch with token-based stopping and optimizer & scheduler ablations.

  4. nn-from-scratch-numpy — MLP with backpropagation implemented from scratch in NumPy.