Dipanwita Das

M.Sc. Data Science Student | Analytics & Forecasting Specialist
Kolkata, India

About

M.Sc. Data Science student with hands-on experience in regression, forecasting, and large-scale survey analysis. Uses Python, R, and SQL to build ETL pipelines, develop predictive models, and turn data into actionable findings. Seeking a role in analytics or forecasting to apply statistical modeling and machine learning to business problems.

Work

SoftOfficePro

Analytics Intern

Kolkata, West Bengal, India

Summary

Led survey analytics and forecasting work, building ETL pipelines and predictive models to streamline data processing and support outreach planning.

Highlights

Engineered a Python-based ETL pipeline to process over 120K ODK survey records, automating multi-select variable handling and generating labeled datasets, which reduced manual effort by 57%.

Developed SPSS-compatible outputs and automated reports, including a completeness dashboard, to monitor data quality and streamline reporting processes.

Designed and implemented regression models to accurately forecast respondent turnout, providing critical insights to guide targeted outreach planning.

Contributed to a large-scale census project by building and deploying an end-to-end ETL pipeline that cleaned and processed field data, automatically flagged duplicates, and assigned records for quality assurance, improving data integrity.

Authored form logic for ODK XLSForms in Excel, incorporating skip logic, validations, and calculations to improve data collection efficiency and accuracy.

Education

Christ University, Pune Lavasa Campus
Pune, Maharashtra, India

M.Sc.

Data Science

Grade: 8.44/10

University of Calcutta
Kolkata, West Bengal, India

B.Sc. (Hons.)

Statistics

Grade: 7.856/10

Skills

Statistical Analysis & Modeling

Regression, Hypothesis Testing, Forecasting, Exploratory Data Analysis (EDA), Residual Diagnostics.

Machine Learning

Linear Regression, Logistic Regression, Decision Trees, Random Forest, SARIMA, Holt-Winters, LSTM.

Programming Languages

Python, R, SQL.

Data Visualization

Matplotlib, Seaborn, Power BI.

Soft Skills

Analytical Thinking, Problem Solving.

Projects

Forecasting U.S. Energy Production (Time Series Modeling)

Summary

Developed, tuned, and evaluated time series models to forecast U.S. energy production, with a focus on forecast accuracy.

SDG Goal 4: Power BI Dashboard for Education Insights

Summary

Designed and built an interactive Power BI dashboard that visualizes global education metrics for SDG Goal 4.

Retail Sales Prediction (Regression)

Summary

Developed a regression model to predict retail store sales based on various influencing factors, achieving high accuracy.