Rakshit Aralimatti | AI Engineer & Agentic Systems Specialist

Experience as AI Engineer

July 2025 - Present

AI Engineer

Agmatel India Pvt Ltd (Elite NVIDIA Partner)

• Architecting production-grade GenAI applications using the full NVIDIA AI stack (NIMs, Riva, NeMo, TensorRT-LLM)
• Built a streaming audio-to-RAG pipeline for defense and emergency response use cases
• Deployed air-gapped multi-LLM infrastructure on Kubernetes, supporting 100+ concurrent users while meeting defense-lab security compliance requirements

July 2023 - July 2025

AI Developer

SandLogic Technologies Pvt. Ltd

• Architected and pre-trained Shakti LLM (2.5B parameters) from scratch for efficient inference
• Published research at AIAI 2025 (Springer) on building optimized language models
• Engineered AI features for the LINGO product, driving a 70% increase in client adoption through enhanced NLP capabilities
• Awards: Star of the Quarter | Certificate of Excellence

Aug 2019 - May 2023

B.Tech ECE

KLE Technological University, Hubli

Electronics & Communication Engineering
CGPA: 9.01/10

Featured Projects by AI Engineer

Production-grade systems with complete technology stack

📻

Defense AI | RAG Systems

Streaming Audio RAG System

Real-time radio transcription with temporal vector search. Processes live audio streams using NVIDIA Riva ASR, stores embeddings in a vector DB, and enables natural language querying across time periods for defense intelligence.

NVIDIA Riva • NeMo Retriever • RAG • ASR/TTS • FAISS • Python • Context Window Optimization

AI Engineer Skills Applied

Speech AI • LLM Evaluation • Transformer Architectures • Vector DBs • Prompt Engineering
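
To illustrate what "temporal vector search" means in the project above, here is a minimal sketch, assuming each transcribed chunk is stored with a start time and an embedding. The `Segment` class, the `search` function, and the toy three-dimensional vectors are hypothetical and use only the Python standard library. The described system uses NVIDIA Riva for transcription and a vector DB for storage, neither of which is shown here.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed audio chunk; all fields are illustrative."""
    start_s: float          # chunk start time, in seconds since stream start
    text: str               # transcript text (would come from an ASR service)
    embedding: list[float]  # vector for the text (would come from an embedding model)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(segments, query_vec, t_from, t_to, k=3):
    """Return the top-k segments that start inside [t_from, t_to], ranked by similarity."""
    in_window = [s for s in segments if t_from <= s.start_s <= t_to]
    ranked = sorted(in_window, key=lambda s: cosine(s.embedding, query_vec), reverse=True)
    return ranked[:k]


# Toy data: hand-written vectors stand in for real embeddings.
segments = [
    Segment(10.0, "team departing north gate", [0.9, 0.1, 0.0]),
    Segment(70.0, "weather update, light rain", [0.0, 0.2, 0.9]),
    Segment(130.0, "team reached checkpoint", [0.8, 0.3, 0.1]),
]

# Restrict the search to the 60 s to 180 s window, then rank by similarity.
hits = search(segments, query_vec=[1.0, 0.0, 0.0], t_from=60.0, t_to=180.0, k=1)
print(hits[0].text)  # prints: team reached checkpoint
```

Filtering by time before ranking keeps the similarity step small. A production vector DB would typically apply the time range as a metadata filter instead.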

🎙️

Real-time AI | Agentic Systems

Ultra Low Latency Voice Agent

Real-time conversational voice agent powered by NVIDIA Riva, NVIDIA NIM, and Pipecat. Sub-second latency inference with tool-calling capabilities for automotive test-drive scheduling and customer service automation.

NVIDIA NIM • Pipecat • Tool-calling • Agentic Workflows • ASR/TTS • FastAPI

AI Engineer Skills Applied

AI Agents • LLM Model Serving • Speech AI • PEFT • Prompt Engineering
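
The tool-calling step mentioned above can be sketched as a small dispatcher, assuming the model emits a JSON object with a tool name and arguments. The `schedule_test_drive` function, the `TOOLS` registry, and the JSON shape are hypothetical illustrations. They are not the Pipecat or NVIDIA NIM API, and the model output is hard-coded.

```python
import json


def schedule_test_drive(model: str, date: str) -> dict:
    """Hypothetical tool: pretend to book a test drive and return a confirmation."""
    return {"status": "booked", "model": model, "date": date}


# Registry mapping tool names to Python callables; names are illustrative.
TOOLS = {"schedule_test_drive": schedule_test_drive}


def dispatch(tool_call_json: str) -> dict:
    """Parse a call shaped like {"name": ..., "arguments": {...}} and run the tool."""
    call = json.loads(tool_call_json)
    tool = TOOLS.get(call["name"])
    if tool is None:
        return {"status": "error", "reason": f"unknown tool: {call['name']}"}
    return tool(**call["arguments"])


# In a real agent this string would be produced by the LLM; here it is hard-coded.
llm_output = '{"name": "schedule_test_drive", "arguments": {"model": "sedan", "date": "2025-08-01"}}'
result = dispatch(llm_output)
print(result)  # prints: {'status': 'booked', 'model': 'sedan', 'date': '2025-08-01'}
```

Returning an error dictionary for unknown tools, rather than raising, lets the agent pass the failure back to the model as a normal tool result.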

👁️

Vision AI | Multi-modal Systems

Multi-modal RAG System

Retrieves and integrates information from both text and visual sources. Responds with context-aware answers and relevant reference images from a knowledge base, using CLIP embeddings and vision encoders.

CLIP • ChromaDB • Multi-modal • OCR • LLM Evaluation • Few-shot Learning

AI Engineer Skills Applied

Large Language Models • Vector DBs • Transformer Architectures • Hugging Face • Python

📄

Document AI | Enterprise Systems

NeMo Retriever + Vision OCR

Production-ready document intelligence for enterprise and government. Combines Vision OCR with NVIDIA NeMo Retriever for structured data extraction from PDFs, images, and scanned documents with high accuracy.

NeMo • OCR • NVIDIA AI Stack • Qdrant • RAG • Milvus

AI Engineer Skills Applied

Context Window Optimization • LLM Fine-tuning • PEFT • LoRA • Retrieval-Augmented Generation

🔬

Agentic AI | Multi-Agent Systems

Deep Research System

Structured deep research using CrewAI and NVIDIA Nemotron-3-Nano. Autonomous multi-agent collaboration where specialized AI agents conduct web research, analyze findings, and generate comprehensive reports with citations.

CrewAI • Nemotron • Multi-Agent Systems • Agentic Workflows • Tool-calling • LangChain

AI Engineer Skills Applied

AI Agents • LLM Evaluation • Prompt Engineering • OpenAI API • Context Window Optimization

⚡

LLM Training | Model Optimization

Shakti SLM 2.5B

2.5B parameter Small Language Model optimized for efficient inference. Custom decoder architecture with PEFT, LoRA, QLoRA fine-tuning and model compression techniques. Published research at AIAI 2025 (Springer).

PyTorch • LoRA • QLoRA • PEFT • Hugging Face • Unsloth AI

AI Engineer Skills Applied

LLM Fine-tuning • Transformer Architectures • LLM Evaluation • Python • LLM Model Serving
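
For readers unfamiliar with LoRA, the fine-tuning technique named above, here is a minimal sketch of the generic low-rank update W + (alpha / r) · B · A, with plain Python lists standing in for tensors. It is not the Shakti training code. The helper names and toy matrices are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(w, a, b, alpha, r):
    """Merge a LoRA adapter into frozen weights: W + (alpha / r) * (B @ A)."""
    delta = matmul(b, a)  # shape (d_out, d_in), rank at most r
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Parameters in the adapter: B is (d_out x r), A is (r x d_in)."""
    return r * (d_out + d_in)


# Toy example: a 2x3 frozen weight matrix with a rank-1 adapter.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b = [[1.0], [2.0]]       # d_out x r
a = [[0.5, 0.0, 1.0]]    # r x d_in
merged = lora_merge(w, a, b, alpha=2.0, r=1)
print(merged)  # prints: [[2.0, 0.0, 2.0], [2.0, 1.0, 4.0]]

# For a 4096 x 4096 layer at rank 8, the adapter trains far fewer parameters.
print(trainable_params(4096, 4096, 8), "vs", 4096 * 4096)  # prints: 65536 vs 16777216
```

The parameter count shows why the method is parameter-efficient: only the small matrices A and B are trained, while W stays frozen.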

Technical Expertise

Complete AI Engineer skill stack

🧠 GenAI & LLMs

Large Language Models • Retrieval-Augmented Generation (RAG) • LLM Fine-tuning • PEFT • LoRA • QLoRA • Prompt Engineering • AI Agents • Agentic Workflows • Multi-Agent Systems • Tool-calling • LLM Evaluation • Transformer Architectures • Few-shot Learning • Context Window Optimization • OCR • Speech AI (ASR/TTS) • MCP

⚙️ Frameworks & Libraries

PyTorch • LangChain • LangGraph • LlamaIndex • CrewAI • Hugging Face Transformers • NVIDIA AI Stack (NIMs) • NVIDIA Riva • NeMo • TensorRT-LLM • vLLM • SGLang • Ollama • Unsloth AI • Pipecat • RASA • FastAPI • Streamlit • Scikit-learn • OpenAI API

🚀 Programming & MLOps

Python • Docker • Kubernetes • AWS EC2 • AWS SageMaker • ChromaDB • FAISS • Qdrant • Milvus • LLM Model Serving • Git • CI/CD • Linux • REST APIs • Microservices

AI Engineer Competency Matrix

🎯

Model Training

Pre-training, Fine-tuning, PEFT, RLHF

🔍

RAG Systems

Vector DBs, Embeddings, Retrieval

🤖

Agentic AI

Multi-Agent, Workflows, Tool Use

⚡

Inference Opt

TensorRT, vLLM, Quantization

Building Agentic AI Systems for Enterprise

Let's Build Agentic Systems Together