Jahnavi Poloju

Data Engineer - AI/ML | LLM & RAG Specialist
Remote, US.

About

Data Engineer focused on AI/ML, with experience developing and deploying data pipelines and intelligent systems. Automated complex ETL processes, increased CRM data completeness by 30%, and applied LLMs and NLP to sentiment analysis and data enrichment, improving business intelligence and operational efficiency. Seeking to apply AI/ML and data engineering skills to build scalable, high-impact solutions in a forward-thinking technology environment.

Work

Claris, an Apple company - contracted through TechnoApex.ltd

Data Engineer - AI/ML

Remote, US

Summary

Led the development and deployment of AI/ML-driven data pipelines and automation solutions to enhance marketing intelligence and customer insights for Claris, an Apple company.

Highlights

Automated cross-channel data ETL processes, replacing manual reporting and reducing Business Intelligence (BI) workload by 70% for the Marketing team.

Engineered and deployed robust ETL pipelines in Python with Airflow, integrating data from diverse RESTful APIs and building interactive Grafana dashboards to enhance campaign visibility.

Developed an AI data enrichment tool using Google API and Crawl4AI, achieving a 30% increase in CRM data completeness for improved segmentation and analytics workflows.

Designed structured prompts for precise entity extraction to minimize hallucination, and implemented async processing with Pydantic validation to produce high-quality, field-level data.

Developed and deployed AI-driven data pipelines for sentiment analysis and topic modeling of community data, using NLP techniques (LDA, BERTopic) and OpenAI APIs to enable real-time tracking of customer concerns.

Education

Indian Institute of Technology Kharagpur (IITKGP)
Kharagpur, West Bengal, India

Bachelor of Technology

Civil Engineering

Grade: 7.45/10

Languages

English

Skills

Programming Languages

Python, SQL, JavaScript.

AI/ML & Data Science

LangChain, scikit-learn, TensorFlow, LDA (Gensim), BERTopic (UMAP, HDBSCAN, c-TF-IDF), Physics-Informed Neural Networks (PINNs), LLMs, RAG, NLP, Pydantic, MLflow.

Data Engineering

Airflow, ETL, Data Pipelines, API Integration, Web Scraping, Vector Stores (FAISS, ChromaDB), Async Processing.

Frameworks & Libraries

FastAPI, RESTful APIs, Flask, Streamlit, PyMuPDF.

Developer Tools & Cloud

Git, Azure, VS Code.

Business Intelligence

Tableau, Grafana.

Projects

Multi-Agent Research Assistant for Annual Report Analysis

Summary

Developed a RAG-powered system that uses LLMs and multi-agent collaboration to parse, summarize, and analyze complex company financial reports.

Conversational Web Search Tool

Summary

Designed and deployed a RAG-based assistant leveraging OpenAI and Serper APIs to retrieve web data and generate natural language responses for enhanced user interaction.

Undergraduate Research - Deep Learning

Summary

Conducted research on physics-informed neural networks (PINNs) using TensorFlow to model and analyze structural elements, advancing capabilities in computational mechanics.