About
Data Scientist with ML and BI experience in Python, XGBoost, SQL, and Power BI. Built and deployed a Streamlit forecasting app backed by an XGBoost model that reached an R² of 0.91, and delivered Power BI analytics on 103K+ trip records to support operational planning. Seeking a Backend/ML Engineering role to productionize models, automate ETL processes, and scale data solutions.
Work
Datamites
Data Scientist Trainee
Bangalore, Karnataka, India
Summary
Led development and deployment of an end-to-end XGBoost forecasting pipeline, covering exploratory data analysis, statistical modeling, and feature engineering.
Highlights
Developed an XGBoost forecasting pipeline that reached R² = 0.9088 and RMSE = 604.64, using statistical modeling and temporal and weather feature engineering.
Implemented time-series cross-validation, leakage checks, and hyperparameter tuning to optimize model performance and verify generalization to unseen data.
Deployed the model through Streamlit, giving stakeholders real-time access to forecasts and interactive scenario simulations for decision-making.
Improved model reliability and maintainability with input validation, runbook procedures, and versioned experiment notebooks in Git, making results reproducible and supporting business decisions.
Conducted Exploratory Data Analysis (EDA) to uncover data patterns and inform feature selection for the predictive model.
Skill Course
Power BI Trainee
Bangalore, Karnataka, India
Summary
Modeled, cleaned, and analyzed more than 103,700 Uber trip records and built Power BI dashboards that surface demand patterns and operational insights.
Highlights
Processed and cleaned 103,700+ Uber trip records, integrating Calendar and Location tables to create a robust data model for in-depth analysis.
Developed interactive Power BI dashboards using custom DAX measures, KPI cards, slicers, and drill-throughs to visualize key metrics such as $1.6M in revenue, a $15 average fare, and an average trip of 3 miles and 16 minutes.
Surfaced demand patterns, including weekend peaks (18.7K on Saturday, 19.2K on Sunday), a Friday low (9.3K), and a 73%/27% day/night split, to inform scheduling and zone prioritization.
Automated data refresh processes and implemented data quality checks with documented schema and incremental refresh logic, ensuring data integrity and timeliness.
Created parameterized queries, DAX documentation, and user guides to support self-service analytics, reducing ad-hoc reporting requests and improving data accessibility.
Education
GM Institute of Technology
Bachelor's
Computer Science and Engineering
Grade: 7.28
Languages
English
Certificates
NASSCOM Certified Data Scientist (Gold)
Issued By
FutureSkills Prime / NASSCOM
SQL Micro Course (30 Days)
Issued By
Skill Course
Power BI Micro Course (30 Days)
Issued By
Skill Course
Skills
Languages
Python, SQL, DAX.
ML/Modeling
XGBoost, scikit-learn, Regression, Model Evaluation (R², RMSE), Hyperparameter Tuning, Cross-validation, Time-series Analysis, Predictive Modeling.
Data Visualization & BI
Power BI, Streamlit, Excel (Pivot, VBA), KPI Dashboards, Bookmarks, Slicers, Drill-throughs.
Engineering & Tools
Git, Jupyter, ETL, Data Pipelines, Data Quality.
Data Analysis
EDA, Statistical Modeling, Feature Engineering, Data Cleaning, Data Modeling, Problem Solving, Operational Planning.