AI Engineer Available for Opportunities

Building Agentic AI Systems for Enterprise

AI Engineer specializing in Large Language Models, Agentic Workflows, and Multi-Agent Systems. I transform complex AI research into production-grade solutions.

About Me

I am an AI Engineer who architects Agentic AI systems that bridge cutting-edge research and production reality. I am currently engineering solutions at Agmatel (Elite NVIDIA Partner); previously, I built Shakti LLM from scratch at SandLogic.

4+ Prod Systems Deployed
100+ Active Users Served
2.5B Parameters Trained
AIAI 2025 Springer

Experience as AI Engineer

July 2025 - Present

AI Engineer

Agmatel India Pvt Ltd (Elite NVIDIA Partner)
  • Architecting production-grade GenAI applications using the full NVIDIA AI stack (NIMs, Riva, NeMo, TensorRT-LLM)
  • Built a streaming audio-to-RAG pipeline for defense and emergency response use cases
  • Deployed air-gapped multi-LLM infrastructure on Kubernetes, supporting 100+ concurrent users with defense-lab security compliance

July 2023 - July 2025

AI Developer

SandLogic Technologies Pvt. Ltd
  • Architected and pre-trained Shakti LLM (2.5B parameters) from scratch for efficient inference
  • Published research at AIAI 2025 (Springer) on building optimized language models
  • Engineered LINGO product AI features, driving a 70% increase in client adoption through enhanced NLP capabilities
  • Awards: Star of the Quarter | Certificate of Excellence

Aug 2019 - May 2023

B.Tech ECE

KLE Technological University, Hubli
Electronics & Communication Engineering
CGPA: 9.01/10

Featured Projects by AI Engineer

Production-grade systems with complete technology stack

📻
Defense AI | RAG Systems

Streaming Audio RAG System

Real-time radio transcription with temporal vector search. Processes live audio streams using NVIDIA Riva ASR, stores embeddings in a vector DB, and enables natural language querying across time periods for defense intelligence.

NVIDIA Riva • NeMo Retriever • RAG • ASR/TTS • FAISS • Python • Context Window Optimization

AI Engineer Skills Applied

Speech AI • LLM Evaluation • Transformer Architectures • Vector DBs • Prompt Engineering
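
The temporal vector search described for the Streaming Audio RAG System can be illustrated with a minimal sketch. This is not the production implementation: it assumes a toy bag-of-words embedding in place of a real embedding model, an in-memory list in place of FAISS, and hard-coded transcripts in place of Riva ASR output. The names `TemporalIndex`, `Segment`, and `embed` are hypothetical. The idea shown is simply "filter by time window first, then rank by similarity".

```python
import math
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Segment:
    start: datetime   # when the transcribed audio chunk began
    text: str         # transcript text (would come from the ASR stage)
    vector: Counter   # embedding of the transcript


class TemporalIndex:
    """Hypothetical in-memory index; a vector DB would replace the list."""

    def __init__(self) -> None:
        self.segments = []

    def add(self, start: datetime, text: str) -> None:
        self.segments.append(Segment(start, text, embed(text)))

    def search(self, query: str, since: datetime, until: datetime, k: int = 3):
        """Keep segments inside the time window, then rank them by similarity."""
        q = embed(query)
        in_window = [s for s in self.segments if since <= s.start <= until]
        ranked = sorted(in_window, key=lambda s: cosine(q, s.vector), reverse=True)
        return ranked[:k]


index = TemporalIndex()
index.add(datetime(2025, 1, 1, 9, 0), "convoy departed the north gate")
index.add(datetime(2025, 1, 1, 9, 30), "weather report clear skies")
index.add(datetime(2025, 1, 1, 14, 0), "convoy arrived at the depot")

# Ask about the convoy, restricted to the morning window only.
hits = index.search("convoy status", datetime(2025, 1, 1, 8, 0), datetime(2025, 1, 1, 12, 0), k=1)
print(hits[0].text)
```

Filtering on the timestamp before scoring keeps the afternoon segment out of the results even though it also mentions the convoy, which is the behavior a "what happened between 08:00 and 12:00" query needs.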

🎙️
Real-time AI | Agentic Systems

Ultra Low Latency Voice Agent

Real-time conversational voice agent powered by NVIDIA Riva, NVIDIA NIM, and Pipecat. Sub-second latency inference with tool-calling capabilities for automotive test-drive scheduling and customer service automation.

NVIDIA NIM • Pipecat • Tool-calling • Agentic Workflows • ASR/TTS • FastAPI

AI Engineer Skills Applied

AI Agents • LLM Model Serving • Speech AI • PEFT • Prompt Engineering
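
The tool-calling step of the voice agent can be sketched in plain Python. This is an illustration under stated assumptions, not the deployed code: it does not use the Pipecat or NIM APIs, the JSON shape of the tool call is assumed, and `schedule_test_drive`, `TOOLS`, and `dispatch` are hypothetical names. It shows how a model-emitted call might be parsed, validated against a registry, and executed.

```python
import json

# Registry mapping tool names to Python callables.
TOOLS = {}


def tool(fn):
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def schedule_test_drive(model: str, date: str) -> dict:
    # Hypothetical tool; a real one would call a booking backend.
    return {"status": "booked", "model": model, "date": date}


def dispatch(tool_call_json: str) -> dict:
    """Parse a model-emitted tool call and run the matching tool."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    try:
        return TOOLS[name](**args)
    except TypeError as exc:  # wrong or missing arguments from the model
        return {"error": str(exc)}


# Simulated LLM output; the JSON shape is an assumption for this sketch.
llm_output = '{"name": "schedule_test_drive", "arguments": {"model": "SUV", "date": "2025-08-01"}}'
result = dispatch(llm_output)
print(result)
```

Returning an error dictionary instead of raising lets the agent feed the failure back to the model and retry, which matters in a live voice conversation where a crash would drop the call.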

👁️
Vision AI | Multi-modal Systems

Multi-modal RAG System

Retrieves and integrates information from both text and visual sources. Responds with context-aware answers and relevant reference images from knowledge base using CLIP embeddings and vision encoders.

CLIP • ChromaDB • Multi-modal • OCR • LLM Evaluation • Few-shot Learning

AI Engineer Skills Applied

Large Language Models • Vector DBs • Transformer Architectures • Hugging Face • Python

📄
Document AI | Enterprise Systems

NeMo Retriever + Vision OCR

Production-ready document intelligence for enterprise and government. Combines Vision OCR with NVIDIA NeMo Retriever for structured data extraction from PDFs, images, and scanned documents with high accuracy.

NeMo • OCR • NVIDIA AI Stack • Qdrant • RAG • Milvus

AI Engineer Skills Applied

Context Window Optimization • LLM Fine-tuning • PEFT • LoRA • Retrieval-Augmented Generation

🔬
Agentic AI | Multi-Agent Systems

Deep Research System

Structured deep research using CrewAI and NVIDIA Nemotron-3-Nano. Autonomous multi-agent collaboration where specialized AI agents conduct web research, analyze findings, and generate comprehensive reports with citations.

CrewAI • Nemotron • Multi-Agent Systems • Agentic Workflows • Tool-calling • LangChain

AI Engineer Skills Applied

AI Agents • LLM Evaluation • Prompt Engineering • OpenAI API • Context Window Optimization
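
The research, analysis, and report hand-off described for the Deep Research System can be sketched as a sequential pipeline. This is a framework-free illustration, not the CrewAI-based implementation: the agents are stub functions, the findings and URLs are placeholder data, and `State`, `Finding`, and `run_pipeline` are hypothetical names. It shows how each specialized agent can pass shared state to the next and how citations can be carried through to the final report.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    claim: str
    source: str  # URL or title the claim came from


@dataclass
class State:
    topic: str
    findings: list = field(default_factory=list)
    report: str = ""


def researcher(state: State) -> State:
    # Stub: a real agent would run web searches through tools here.
    state.findings = [
        Finding(f"{state.topic} adoption is growing", "https://example.com/a"),
        Finding(f"{state.topic} has open challenges", "https://example.com/b"),
        Finding("", "https://example.com/empty"),
    ]
    return state


def analyst(state: State) -> State:
    # Drop findings without a usable claim so they never reach the report.
    state.findings = [f for f in state.findings if f.claim.strip()]
    return state


def writer(state: State) -> State:
    # Number each claim and list its source so every statement is cited.
    body = [f"- {f.claim} [{i}]" for i, f in enumerate(state.findings, 1)]
    refs = [f"[{i}] {f.source}" for i, f in enumerate(state.findings, 1)]
    state.report = "\n".join([f"Report: {state.topic}", *body, "Sources:", *refs])
    return state


def run_pipeline(topic: str, agents: list) -> State:
    state = State(topic)
    for agent in agents:  # sequential hand-off between specialized agents
        state = agent(state)
    return state


final = run_pipeline("Agentic AI", [researcher, analyst, writer])
print(final.report)
```

Keeping the source attached to each finding from the first stage onward is what makes the final citations trustworthy; the writer never has to reconstruct where a claim came from.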

LLM Training | Model Optimization

Shakti SLM 2.5B

2.5B parameter Small Language Model optimized for efficient inference. Custom decoder architecture with PEFT, LoRA, QLoRA fine-tuning and model compression techniques. Published research at AIAI 2025 (Springer).

PyTorch • LoRA • QLoRA • PEFT • Hugging Face • Unsloth AI

AI Engineer Skills Applied

LLM Fine-tuning • Transformer Architectures • LLM Evaluation • Python • LLM Model Serving
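
To make the appeal of LoRA-style fine-tuning concrete, the arithmetic behind it can be worked through in a few lines. LoRA freezes a weight matrix of shape d_out x d_in and trains a low-rank pair B (d_out x r) and A (r x d_in) instead, so the trainable count is r * (d_out + d_in). The dimensions below are illustrative assumptions only; they are not Shakti's actual layer sizes, and the function names are hypothetical.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank pair B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the full, frozen weight matrix."""
    return d_out * d_in


# Illustrative sizes only; these are not Shakti's actual dimensions.
d_out, d_in, rank = 4096, 4096, 8
lora = lora_trainable_params(d_out, d_in, rank)
full = full_params(d_out, d_in)
print(f"LoRA trainable: {lora}, full matrix: {full}, ratio: {lora / full:.4%}")
```

For a square 4096 x 4096 projection at rank 8, the adapter trains well under one percent of the parameters in that matrix, which is why PEFT methods fit on far smaller hardware than full fine-tuning.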

Technical Expertise

Complete AI Engineer skill stack

🧠 GenAI & LLMs

Large Language Models • Retrieval-Augmented Generation (RAG) • LLM Fine-tuning • PEFT • LoRA • QLoRA • Prompt Engineering • AI Agents • Agentic Workflows • Multi-Agent Systems • Tool-calling • LLM Evaluation • Transformer Architectures • Few-shot Learning • Context Window Optimization • OCR • Speech AI (ASR/TTS) • MCP

⚙️ Frameworks & Libraries

PyTorch • LangChain • LangGraph • LlamaIndex • CrewAI • Hugging Face Transformers • NVIDIA AI Stack (NIMs) • NVIDIA Riva • NeMo • TensorRT-LLM • vLLM • SGLang • Ollama • Unsloth AI • Pipecat • RASA • FastAPI • Streamlit • Scikit-learn • OpenAI API

🚀 Programming & MLOps

Python • Docker • Kubernetes • AWS EC2 • AWS SageMaker • ChromaDB • FAISS • Qdrant • Milvus • LLM Model Serving • Git • CI/CD • Linux • REST APIs • Microservices

AI Engineer Competency Matrix

🎯
Model Training
Pre-training, Fine-tuning, PEFT, RLHF
🔍
RAG Systems
Vector DBs, Embeddings, Retrieval
🤖
Agentic AI
Multi-Agent, Workflows, Tool Use
Inference Opt
TensorRT, vLLM, Quantization

Let's Build Agentic Systems Together

AI Engineer available for consulting, freelance projects, and full-time opportunities in the LLM/GenAI space.

Preferred Engagement Types

LLM Consulting • Agentic AI Architecture • RAG Implementation • Full-time AI Engineer Roles