Raja Babu

Data Scientist
Pune, IN.

About

Data Scientist with a proven track record of designing, developing, and deploying AI/ML solutions that drive operational efficiency, enhance decision-making, and improve customer experience. Expertise spans large language models (LLMs), natural language processing (NLP), recommendation systems, and anomaly detection, backed by measurable reductions in time and cost and improvements in accuracy.

Work

Digitate - Tata Research Development and Design Centre

Data Scientist

Summary

Led the design, development, and deployment of cutting-edge AI/ML and data science solutions, focusing on enhancing enterprise operational efficiency and decision-making through advanced analytical models and natural language processing.

Highlights

Designed and implemented a multi-agent chatbot using LangGraph, enabling CIOs and business users to query SQL, NoSQL, and graph databases, as well as CSV data, in natural language. Automated dynamic query generation and integrated real-time visualizations, improving operational efficiency by 40%.

Applied prompt tuning techniques to enhance LLM response quality and maintain consistency across diverse enterprise queries.

Developed and implemented a knowledge graph-augmented RAG system to improve customer service response generation. Modeled inter-issue and intra-issue relationships from historical support tickets, achieving a 28.6% reduction in median issue resolution time and improving BLEU and MRR scores.

Fine-tuned and quantized the open-source LLaMA 7B model using HuggingFace Transformers and QLoRA for efficient long-context retrieval in domain-specific tasks. Configured the model for local GPU-based inference, enabling offline deployment for document understanding and contextual response generation.

Developed a context-aware recommendation system to address insight fatigue in IT operations. Reduced insight discovery time by 67% and increased adoption of actionable insights by 25%, significantly improving operational decision-making.

Applied graph-based community detection algorithms to cluster and summarize insights in natural language. Reduced manual synthesis time by 50% and improved insight-driven decision-making by 30%, enhancing clarity and efficiency for IT operations teams.

Designed an analytics-driven solution to extract actionable insights from unstructured ticket descriptions using clustering algorithms and domain-specific knowledge. Achieved 84.37% accuracy for system-generated tickets and 81.63% accuracy for user-generated tickets, reducing manual analysis effort and enhancing IT issue prioritization.

Designed and implemented an LSTM-based anomaly detection system using PyTorch, reducing false alerts by 45% and achieving a detection accuracy of 97%. Processed thousands of logs per second with real-time detection, enhancing system reliability and operational efficiency. Deployed the solution for production use.

Education

Sinhgad College of Engineering

Bachelor of Engineering

Electronics and Telecommunications

GPA: 9.04/10

Publications

Addressing Insight Fatigue with Insight Summarization

Published by

Saneet S., Raja Babu, Uday C. Bhookya, M. Natu (Accepted at COMSNETS)

Summary

Research focusing on strategies to combat insight fatigue through effective summarization techniques, accepted for publication at COMSNETS 2025.

Addressing AIOps Insight Fatigue with Insight Chains

Published by

Raja Babu, Uday C. Bhookya, Saneet S., M. Natu (Submitted to ECML PKDD)

Summary

Exploration of AIOps methodologies to manage insight fatigue using interconnected insight chains, submitted to ECML PKDD 2025.

Data-Driven Insight Generation and Creation of Contextually Consistent Chains Thereof

Published by

Raja Babu, Uday C. Bhookya, M. Natu

Summary

Patent application (Application Number: 202421093804, Status: Pending) detailing a data-driven approach for generating insights and creating contextually consistent chains of them.

Skills

Programming Languages

Python, C++, JavaScript, SQL.

Frameworks & Libraries

Django, FastAPI, Angular, LangChain, LangGraph, HuggingFace Transformers, PEFT.

Machine Learning & Data

PyTorch, TensorFlow, Scikit-learn, Pandas, NumPy, SciPy, OpenCV, YOLO.

DevOps & Tools

Docker, Git, CI/CD, Offline AI Deployment.

Advanced Modelling Techniques

Deep Learning, Supervised & Unsupervised Learning, NLP, Time Series Analysis, LLMs, Generative AI, Recommender Systems, AI-powered Automation.