AI Engineer specializing in Large Language Models, Agentic Workflows, and Multi-Agent Systems. I transform complex AI research into production-grade solutions.
I am an AI Engineer who architects Agentic AI systems that bridge cutting-edge research and production reality. I currently engineer solutions at Agmatel (an Elite NVIDIA Partner); previously, I built Shakti LLM from scratch at SandLogic.
Production-grade systems with complete technology stack
Real-time radio transcription with temporal vector search. Processes live audio streams using NVIDIA Riva ASR, stores embeddings in a vector DB, and enables natural language querying across time periods for defense intelligence.
AI Engineer Skills Applied
Speech AI • LLM Evaluation • Transformer Architectures • Vector DBs • Prompt Engineering
Real-time conversational voice agent powered by NVIDIA Riva, NVIDIA NIM, and Pipecat. Sub-second latency inference with tool-calling capabilities for automotive test-drive scheduling and customer service automation.
AI Engineer Skills Applied
AI Agents • LLM Model Serving • Speech AI • PEFT • Prompt Engineering
Retrieves and integrates information from both text and visual sources. Responds with context-aware answers and relevant reference images from the knowledge base using CLIP embeddings and vision encoders.
AI Engineer Skills Applied
Large Language Models • Vector DBs • Transformer Architectures • Hugging Face • Python
Production-ready document intelligence for enterprise and government. Combines Vision OCR with NVIDIA NeMo Retriever for structured data extraction from PDFs, images, and scanned documents with high accuracy.
AI Engineer Skills Applied
Context Window Optimization • LLM Fine-tuning • PEFT • LoRA • Retrieval-Augmented Generation
Structured deep research using CrewAI and NVIDIA Nemotron-3-Nano. Autonomous multi-agent collaboration where specialized AI agents conduct web research, analyze findings, and generate comprehensive reports with citations.
AI Engineer Skills Applied
AI Agents • LLM Evaluation • Prompt Engineering • OpenAI API • Context Window Optimization
2.5B-parameter Small Language Model optimized for efficient inference. Custom decoder architecture with PEFT fine-tuning (LoRA, QLoRA) and model compression techniques. Research published at AIAI 2025 (Springer).
AI Engineer Skills Applied
LLM Fine-tuning • Transformer Architectures • LLM Evaluation • Python • LLM Model Serving
Complete AI Engineer skill stack
AI Engineer available for consulting, freelance projects, and full-time opportunities in the LLM/GenAI space.
Preferred Engagement Types