ML Engineer building production-grade AI systems with safety at the core. Currently researching Multi-Agent RL for cybersecurity at the University of Arizona and co-authoring StepShield, a safety benchmark for autonomous code agents (targeting ICML 2027). Previously built recommendation engines at Escape LLC (30% engagement lift) and agentic RAG chatbots at Omdena (95% reduction in harmful responses).
I don't treat AI safety as a checkbox; I treat it as an engineering discipline.
|
First benchmark for evaluating when autonomous code agents go rogue, not just whether they do. Detects specification violations (data exfiltration, unauthorized access) in real time across 9,213 agent trajectories. Early detection cuts monitoring costs by 75% (~$108M projected savings).
|
|
Production-grade LLM evaluation + red-teaming. Hybrid n8n + FastAPI architecture with 4 LLM providers, LLM-as-Judge scoring, a circuit breaker, a dead-letter queue (DLQ), Redis caching, and Prometheus/Grafana monitoring.
|
ML-infra-aware defense for model weights. Protects against model-weight exfiltration using a 3-layer cascaded architecture (Rules → ML → LLM). Kubernetes-native, GPU-aware anomaly detection.
|
|
7-benchmark bias evaluation + guardrails. Open-source LLM bias evaluation framework with red-teaming, guardrails, and monitoring, all running locally via Ollama. Zero API costs.
|
Production-grade ML pricing system. XGBoost demand forecasting + price elasticity estimation + scipy revenue optimization. FastAPI serving, Streamlit dashboard, MLflow tracking, Evidently drift monitoring.
|
|
Full-stack speech pipeline: STT → LLM → TTS. End-to-end voice assistant running entirely on your own machine: FastAPI backend, React frontend, Docker. Private by design: zero cloud calls.
|
AI motorcycle advisor for Indian riders. RAG over motorcycle specs with vLLM serving, a Qdrant vector store, and FastAPI. Personalized bike recommendations with source citations.
|
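The Rules → ML → LLM cascade mentioned above for model-weight protection can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the project's actual code: the event fields (`destination`, `bytes`, `baseline_bytes`), the thresholds, and the stubbed layers are hypothetical. The idea it shows is that cheap layers decide the clear cases and only ambiguous events escalate to the more expensive layer.

```python
from typing import Callable, List, Optional, Tuple

# Each layer returns True (block), False (allow), or None (unsure, escalate).
Verdict = Optional[bool]
Layer = Callable[[dict], Verdict]


def rules_layer(event: dict) -> Verdict:
    # Hypothetical rule: transfers to an unapproved destination are blocked outright.
    if event["destination"] != "internal-registry":
        return True
    # Small transfers to the approved destination are allowed without further checks.
    if event["bytes"] < 1_000_000:
        return False
    return None


def ml_layer(event: dict) -> Verdict:
    # Stand-in for an anomaly model: ratio of transfer size to a per-workload baseline.
    score = event["bytes"] / event["baseline_bytes"]
    if score > 10:
        return True
    if score < 2:
        return False
    return None


def llm_layer(event: dict) -> Verdict:
    # Placeholder for an LLM review step; this stub conservatively blocks.
    return True


def cascade(event: dict, layers: List[Tuple[str, Layer]]) -> Tuple[bool, str]:
    """Return (blocked, deciding_layer); the first layer with a firm verdict wins."""
    for name, layer in layers:
        verdict = layer(event)
        if verdict is not None:
            return verdict, name
    return True, "default"  # fail closed if no layer decides


LAYERS = [("rules", rules_layer), ("ml", ml_layer), ("llm", llm_layer)]
```

Calling `cascade(event, LAYERS)` on a large but not extreme transfer to the approved destination passes through the first two layers undecided and is settled by the final one. Failing closed when no layer decides is a design choice made for this sketch.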
- chatbot-auditor: Quality auditor for AI chatbots; analyzes conversation logs to surface where bots underperform.
- credit-scoring-fairness-mlops: End-to-end MLOps with automated fairness gates, drift monitoring, EU AI Act compliance (XGBoost, Fairlearn, MLflow).
- healthcare-bias-audit: Bias audit of healthcare ML on the MEPS dataset; AIF360 mitigation, SHAP/LIME explainability.
- AI-Chief: Food science assistant with multi-agent RAG, real-time safety monitoring, dangerous-advice detection (TypeScript, Fastify, HNSW).
- Interactive-Multilingual-AI-Audiobook-Assistant: OCR extraction → neural TTS → multilingual translation → real-time Q&A audiobook pipeline.
- AI-Wildlife-Tracker: RAG identifying 500+ Indian wildlife species from text or photos; hybrid retrieval, ONNX inference, Langfuse observability.
- Multilingual-Sentiment-Emotion-Intelligence-Engine: 5 languages + Hindi-English code-switching; multi-task XLM-RoBERTa with LoRA adapters, ONNX INT8.
- Algorithmic-Trading-AI: FinBERT sentiment + spaCy NER + TimeGPT forecasting → BUY/SELL/HOLD signals from real-time financial news.
- LLaMA-Sum-Fine-Tuning: LLaMA 3.2 1B fine-tuned via QLoRA; 40%+ ROUGE-2 improvement over base on CNN/DailyMail.
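
As a rough illustration of the "automated fairness gate" idea from credit-scoring-fairness-mlops, here is a minimal, dependency-free sketch. It is an assumption about how such a gate could work, not the repository's code: the helper names and the 0.1 threshold are hypothetical. It computes the demographic parity difference by hand; Fairlearn ships a metric of the same name.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def demographic_parity_difference(y_pred: Sequence[int], groups: Sequence[Hashable]) -> float:
    """Largest gap in positive-prediction rate between any two groups."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def fairness_gate(y_pred: Sequence[int], groups: Sequence[Hashable], max_gap: float = 0.1) -> bool:
    """Return True if the model may be promoted (gap within the assumed threshold)."""
    return demographic_parity_difference(y_pred, groups) <= max_gap
```

In a CI or MLflow promotion step, a `False` result from `fairness_gate` would fail the pipeline before deployment. A real gate would likely combine several metrics and a documented threshold.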
| Languages | |
| ML / DL | |
| LLM & Agents | |
| MLOps / Cloud | |
| Observability | |
| AI Safety & Responsible AI | |
| Data | |
Open to ML Engineer, AI Safety, and AI Researcher roles
Let's build AI systems that are powerful AND trustworthy.




