About
Highly analytical Data Engineer with 1.6 years of experience, specializing in developing and optimizing robust ETL pipelines and cloud-based data platforms using Azure Databricks, Azure Data Factory, Python, and SQL. Proven ability to translate complex business requirements into scalable data solutions, improve data quality, and automate workflows, driving operational efficiency and enabling real-time analytics. Adept at leveraging CI/CD frameworks and collaborative problem-solving to deliver high-impact data initiatives in dynamic production environments.
Work
Bangalore, Karnataka, India
Summary
Focused on developing robust ETL solutions, leveraging Azure Databricks, Azure Data Factory, and advanced SQL techniques. Specialized in constructing fact/dimension models, implementing incremental data strategies, and streamlining data workflows to Azure Data Lake Storage (ADLS) while orchestrating CI/CD processes through GitHub.
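For illustration, a minimal PySpark sketch of the incremental strategy described above: upserting changed records into a Delta dimension table on ADLS. All paths, table names, and columns are hypothetical, not taken from the actual project.

```python
# Minimal sketch: incremental upsert into a Delta dimension table on ADLS.
# Paths, table names, and columns are illustrative assumptions.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("dim_customer_incremental").getOrCreate()

# New or changed rows extracted since the last load (extraction query assumed).
updates = spark.read.format("parquet").load(
    "abfss://raw@examplelake.dfs.core.windows.net/customer/latest/"
)

target = DeltaTable.forPath(
    spark, "abfss://curated@examplelake.dfs.core.windows.net/dim_customer/"
)

# Merge keeps the dimension current without reprocessing the full history.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```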
Highlights
Collaborated with business stakeholders to translate complex requirements into robust data solutions, engineering advanced SQL queries, stored procedures, and views within SSMS to deliver actionable insights.
Architected and implemented scalable data models and ETL frameworks using SQL and PySpark, significantly enhancing data processing and transformation for enterprise-level analytics; ingested SAP data into Azure Data Lake Storage for seamless integration with downstream reporting.
Developed and managed automated data pipelines in Azure Data Factory, streamlining Databricks notebook operations and incremental loading (see the sketch after this list), and created interactive Power BI dashboards to deliver actionable business intelligence.
Maintained robust version control using GitHub repositories, facilitating seamless team collaboration and ensuring operational continuity and data integrity in Databricks environments.
Implemented standardized CI/CD frameworks with GitHub Actions, driving consistent, reliable, and automated deployment methodologies organization-wide.
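A minimal sketch of the parameterized, incremental notebook pattern referenced above, as it might be invoked from an Azure Data Factory pipeline. In a Databricks notebook, `spark` and `dbutils` are predefined; the widget name, table, columns, and ADLS path are illustrative assumptions.

```python
# Minimal sketch of a Databricks notebook driven by an ADF pipeline parameter.
# Widget name, source table, columns, and sink path are assumptions.
from pyspark.sql import functions as F

# ADF passes the last successful load time as a notebook parameter.
dbutils.widgets.text("last_watermark", "1900-01-01T00:00:00")
last_watermark = dbutils.widgets.get("last_watermark")

# Read only rows modified since the previous run (incremental load).
incremental = spark.table("source_db.sales_orders").filter(
    F.col("modified_ts") > F.lit(last_watermark).cast("timestamp")
)

# Append the delta slice to the curated zone in ADLS.
(
    incremental.write.format("delta")
    .mode("append")
    .save("abfss://curated@examplelake.dfs.core.windows.net/fact_sales/")
)

# Return the new high-water mark to the calling pipeline.
new_watermark = incremental.agg(F.max("modified_ts")).first()[0] or last_watermark
dbutils.notebook.exit(str(new_watermark))
```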
Bangalore, Karnataka, India
Summary
Developed and optimized ETL pipelines using Python and SQL to process network telemetry data, enabling real-time monitoring and analytics. Designed and implemented data validation frameworks, improving data quality for downstream analytics and reporting systems.
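As a minimal illustration of the validation approach (a sketch, not the production framework), a rule-based pass over telemetry records; field names and thresholds are assumed.

```python
# Minimal sketch of rule-based validation for telemetry records.
# Field names and thresholds are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]

RULES = [
    Rule("device_id present", lambda r: bool(r.get("device_id"))),
    Rule("cpu_util in range", lambda r: 0.0 <= r.get("cpu_util", -1.0) <= 100.0),
    Rule("timestamp present", lambda r: "timestamp" in r),
]

def validate(record: dict) -> list[str]:
    """Return the names of the rules a record violates."""
    return [rule.name for rule in RULES if not rule.check(record)]

# Records with violations would be routed to a quarantine path for review.
record = {"device_id": "rtr-01", "cpu_util": 180.0, "timestamp": "2024-01-01T00:00:00"}
print(validate(record))  # ['cpu_util in range']
```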
Highlights
Developed robust ETL pipelines utilizing Python and SQL to process network telemetry data from routers and switches, enabling real-time monitoring and advanced analytics for network operations teams.
Designed and implemented data validation frameworks, significantly improving the accuracy of network performance metrics and enhancing data quality for downstream analytics and reporting systems.
Collaborated effectively with the data engineering team to implement efficient data ingestion workflows, centralizing log files and performance data into scalable repositories.
Optimized data storage and retrieval processes, resulting in reduced query response times for critical network health dashboards and operational reporting tools.
Automated data processing workflows with Python scripts, reducing manual data handling time by 30% and significantly enhancing reliability across production environments.
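For illustration, a minimal sketch of the kind of Python automation described in the last two highlights: centralizing per-device log files into a compressed repository. The directory layout and filename convention are assumptions.

```python
# Minimal sketch of a scheduled script that centralizes device log files.
# Directory layout and filename convention are illustrative assumptions.
import gzip
from pathlib import Path

INCOMING = Path("/data/incoming/logs")
ARCHIVE = Path("/data/repository/logs")

def ingest(day: str) -> int:
    """Compress each device's daily log into the central repository."""
    ARCHIVE.mkdir(parents=True, exist_ok=True)
    count = 0
    for log in INCOMING.glob(f"*_{day}.log"):
        target = ARCHIVE / f"{log.stem}.log.gz"
        with log.open("rb") as src, gzip.open(target, "wb") as dst:
            dst.write(src.read())
        count += 1
    return count

if __name__ == "__main__":
    print(f"archived {ingest('2024-01-01')} log files")
```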
Skills
Programming Languages
SQL (Data Transformation, Tuning, Optimization), Python (Data Validation, Scripting), Spark (PySpark), Scala.
Databases & Storage
Microsoft SQL Server, Azure SQL Database, Azure Data Lake Storage (ADLS Gen2).
Cloud & Tools
Azure Data Factory (ETL/ELT, Parameterized Pipelines, Scheduling), Azure Databricks (Delta Lake, Notebooks), Azure Synapse Analytics, Power BI.
Version Control & CI/CD
GitHub (Branching, CI/CD Integration), GitHub Actions.
Methodologies
Agile, Software Development Principles, Fact/Dimension Modeling.