About
Computer Science Master's candidate with 3+ years of experience in data engineering and analytics, specializing in cloud platforms (AWS, Azure, GCP), Python, SQL, and BI tools. Proven ability to optimize ETL pipelines, migrate large-scale data, and build dashboards that drive efficiency gains, cost reductions, and better business decisions.
Work
Brentwood, TN, USA
Summary
Leveraged sales and customer data to optimize marketing strategies, improve inventory accuracy, and enhance customer experience through advanced analytics and dashboard creation.
Highlights
Analyzed sales and customer data from 2000+ stores using AWS Redshift and Power BI, uncovering trends that boosted promotional impact by 15%.
Automated weekly ETL pipelines with Airflow, cutting manual reporting by 60% and ensuring timely dashboard updates.
Developed demand forecasting models using Python (scikit-learn) and MLflow, improving inventory accuracy by 22%.
Created Tableau and AWS QuickSight dashboards for churn and sales KPIs, optimizing marketing expenses and boosting ROI by 10%.
Delivered real-time KPI reporting by integrating AWS S3, Airflow, and Power BI, providing leadership with on-demand metrics.
Norfolk, VA, USA
Summary
Improved student retention and operational efficiency for university systems through data analysis, ETL pipeline automation, and KPI dashboard development.
Highlights
Boosted student retention by 12% through in-depth analysis of enrollment, graduation trends, and budget allocations, leveraging Python, SQL, and PySpark on Databricks.
Reduced reporting time by 40% by designing and implementing interactive Power BI and Cognos dashboards for critical academic, demographic, and budget KPIs.
Automated ETL pipelines with Apache Airflow and SQL Server, decreasing manual effort by 60% and streamlining data refreshes for enhanced accuracy.
Accelerated decision-making by 25% by consolidating siloed student, HR, and budget data into a unified, centralized reporting layer.
Norfolk, VA, USA
Summary
Instructed and mentored over 40 students in complex big data analytics frameworks, fostering practical skills and enhancing academic performance.
Highlights
Led weekly Spark and Hadoop labs for 40+ students, significantly strengthening hands-on understanding of distributed data frameworks.
Mentored students on debugging and tuning PySpark and SQL code, directly increasing project success rates and academic performance.
Bangalore, KA, IND
Summary
Led large-scale data migration, ETL pipeline design, and data model development to support unified analytics and reporting on healthcare and financial data across multiple regions.
Highlights
Spearheaded the migration of 30+ TB of healthcare and financial data from on-premises systems to AWS S3 and Redshift, reducing storage costs by 35% and doubling query performance.
Designed and orchestrated robust ETL pipelines using AWS Glue, ensuring 99.9% pipeline reliability and SLA adherence.
Built and maintained star/snowflake data models in BigQuery and Azure Synapse, enabling unified analytics and decreasing data retrieval times by 40%.
Integrated and transformed raw EHR, billing, and claims data using PySpark and SQL, enhancing data quality and enabling HIPAA-compliant reporting across 12+ regions.
Automated deployment and scaling of analytics workflows with Kubernetes across multi-cloud environments, improving reliability and reducing infrastructure management overhead by 40%.
Education
Languages
English
Skills
Programming & Data Analysis
Python (Pandas, NumPy, Scikit-learn), SQL, Excel (VLOOKUP, PivotTables, XLOOKUP), R, A/B Testing, Regression Analysis, Forecasting, EDA, KPI Tracking.
Data Engineering & Pipelines
Apache Airflow, AWS Glue, AWS S3, AWS Redshift, Databricks, MLflow, ETL Development, Data Cleaning & Validation.
Business Intelligence & Visualization
Power BI, Tableau, AWS QuickSight, Looker Studio, Qlik, Alteryx.
Data Modeling & Warehousing
Star/Snowflake Schemas, Redshift, Data Lakehouse, SQL-Based Modeling.
Tools & DevOps
Git, Jupyter Notebooks, MLflow, Docker (basic), Kubernetes, Confluence, Jira.
Cloud Platforms
AWS (S3, Glue, Redshift, Lambda), Azure (Synapse, Data Factory, SQL), GCP (BigQuery).