Santhosh G

Entry-Level Data Scientist | Aspiring Backend/ML Engineer
Bangalore, IN.

About

Data Scientist with hands-on ML and BI experience in Python, XGBoost, SQL, and Power BI. Built and deployed production-ready Streamlit dashboards and analytics, reaching an R² of about 0.91 with an XGBoost forecasting model and generating actionable insights from 103K+ trip records for operational planning. Seeking a Backend/ML Engineering role to productionize models, automate ETL processes, and scale data solutions.

Work

Datamites

Data Scientist Trainee

Bangalore, Karnataka, India

Summary

Led the development and deployment of an end-to-end XGBoost forecasting pipeline, covering data analysis, statistical modeling, and feature engineering.

Highlights

Developed an XGBoost forecasting pipeline that reached R² = 0.9088 and RMSE = 604.64, using statistical modeling and temporal/weather feature engineering.

Implemented time-series cross-validation, leakage checks, and hyperparameter tuning to optimize model performance and confirm that it generalizes to unseen data.

Deployed the model through a Streamlit app, giving stakeholders real-time access to forecasts and interactive scenario simulations for informed decision-making.

Improved model reliability and maintainability with input validation, runbook procedures, and versioned experiment notebooks in Git, enhancing reproducibility and supporting business decision-making.

Conducted Exploratory Data Analysis (EDA) to uncover data patterns and guide feature selection for predictive model development.

Skill Course

Power BI Trainee

Bangalore, Karnataka, India

Summary

Cleaned, modeled, and analyzed over 103,700 Uber trip records, and built Power BI dashboards to surface demand patterns and operational insights.

Highlights

Processed and cleaned 103,700+ Uber trip records, integrating Calendar and Location tables to create a robust data model for in-depth analysis.

Developed interactive Power BI dashboards using custom DAX measures, KPI cards, slicers, and drill-throughs to visualize key metrics such as $1.6M in revenue, a $15 average fare, a 3-mile average trip distance, and a 16-minute average trip time.

Surfaced demand patterns, including weekend peaks (18.7K trips on Saturday, 19.2K on Sunday), the lowest volume on Friday (9.3K), and a 73%/27% day/night split, to inform scheduling and zone prioritization.

Automated data refresh processes and implemented data quality checks with documented schema and incremental refresh logic, ensuring data integrity and timeliness.

Created parameterized queries, DAX documentation, and user guides to support self-service analytics, reducing ad-hoc reporting requests and improving data accessibility.

Education

GM Institute of Technology
Davanagere, Karnataka, India

Bachelor's

Computer Science and Engineering

Grade: 7.28

Languages

English

Certificates

NASSCOM Certified Data Scientist (Gold)

Issued By

FutureSkills Prime / NASSCOM

SQL Micro Course (30 Days)

Issued By

Skill Course

Power BI Micro Course (30 Days)

Issued By

Skill Course

Skills

Languages

Python, SQL, DAX.

ML/Modeling

XGBoost, scikit-learn, Regression, Model Evaluation (R², RMSE), Hyperparameter Tuning, Cross-validation, Time-series Analysis, Predictive Modeling.

Data Visualization & BI

Power BI, Streamlit, Excel (PivotTables, VBA), KPI Dashboards, Bookmarks, Slicers, Drill-throughs.

Engineering & Tools

Git, Jupyter, ETL, Data Pipelines, Data Quality.

Data Analysis

EDA, Statistical Modeling, Feature Engineering, Data Cleaning, Data Modeling, Problem Solving, Operational Planning.

Projects

Bike Rental Demand Prediction

Summary

Developed and deployed a machine learning solution for bike rental demand forecasting, using XGBoost modeling and feature engineering to support operational planning and decision-making.

Uber Trip Analysis Dashboard

Summary

Designed and implemented a Power BI dashboard for Uber trip analysis, processing over 103,700 records to extract KPIs and uncover demand patterns for operational optimization.