Pavanta M

Data Engineer
HAL 3rd Stage, 560017, Bangalore, IN.

About

Highly analytical Data Engineer with 1.6 years of experience, specializing in developing and optimizing robust ETL pipelines and cloud-based data platforms using Azure Databricks, Azure Data Factory, Python, and SQL. Proven ability to translate complex business requirements into scalable data solutions, enhance data quality, and automate workflows, contributing to improved operational efficiency and real-time analytics. Adept at leveraging CI/CD frameworks and collaborative problem-solving to deliver high-impact data initiatives in dynamic production environments.

Work

ELANCO | Data Engineer

Bangalore, Karnataka, India

Summary

Focused on developing robust ETL solutions, leveraging Azure Databricks, Azure Data Factory, and advanced SQL techniques. Specialized in constructing fact/dimension models, implementing incremental data strategies, and streamlining data workflows to Azure Data Lake Storage (ADLS) while orchestrating CI/CD processes through GitHub.

Highlights

Collaborated with business stakeholders to translate complex requirements into robust data solutions, engineering advanced SQL queries, stored procedures, and views within SSMS to deliver actionable insights.

Architected and implemented scalable data models and ETL frameworks using SQL and PySpark, significantly enhancing data processing and transformation for enterprise-level analytics; ingested SAP data into Azure Data Lake Storage for seamless integration with downstream reporting.

Developed and managed automated data pipelines in Azure Data Factory, streamlining Databricks notebook operations and incremental loading, and created interactive Power BI dashboards to deliver actionable business intelligence.

Maintained robust version control using GitHub repositories, facilitating seamless team collaboration and ensuring operational continuity and data integrity in Databricks environments.

Implemented standardized CI/CD frameworks with GitHub Actions, driving consistent, reliable, and automated deployment methodologies organization-wide.
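The incremental-loading strategy mentioned above typically rests on a watermark pattern: each run pulls only rows modified since the last successful load. The sketch below illustrates that idea in plain Python; in practice this role used Azure Data Factory and Databricks, and the field names here are hypothetical, not the actual schema.

```python
from datetime import datetime, timezone

def incremental_load(source_rows, watermark):
    """Return rows changed since `watermark`, plus the new watermark.

    A minimal standalone sketch of the watermark pattern; real pipelines
    would push `new_rows` to ADLS/Delta and persist the watermark.
    """
    # Keep only rows modified after the last recorded high-water mark.
    new_rows = [r for r in source_rows if r["modified_at"] > watermark]
    # Advance the watermark to the latest change seen in this batch.
    new_watermark = max((r["modified_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# Illustrative rows (hypothetical schema).
rows = [
    {"id": 1, "modified_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "modified_at": datetime(2024, 2, 1, tzinfo=timezone.utc)},
]
changed, wm = incremental_load(rows, datetime(2024, 1, 15, tzinfo=timezone.utc))
```

Only the row modified after the stored watermark is reloaded, and its timestamp becomes the new watermark for the next run.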

Juniper Networks | Data Engineering Intern

Bangalore, Karnataka, India

Summary

Built and tuned ETL pipelines in Python and SQL for network telemetry data, with validation checks to safeguard data quality for real-time monitoring and downstream reporting.

Highlights

Developed robust ETL pipelines utilizing Python and SQL to process network telemetry data from routers and switches, enabling real-time monitoring and advanced analytics for network operations teams.

Designed and implemented data validation frameworks, significantly improving the accuracy of network performance metrics and enhancing data quality for downstream analytics and reporting systems.

Collaborated effectively with the data engineering team to implement efficient data ingestion workflows, centralizing log files and performance data into scalable repositories.

Optimized data storage and retrieval processes, resulting in reduced query response times for critical network health dashboards and operational reporting tools.

Automated data processing workflows with Python scripts, reducing manual data handling time by 30% and significantly enhancing reliability across production environments.
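A data validation framework of the kind described above usually amounts to a set of named rules applied to each record. This is a hedged, minimal sketch of that pattern; the field names and thresholds are illustrative, not the actual telemetry schema.

```python
def validate(record, rules):
    """Return the names of the rules this record violates."""
    return [name for name, check in rules.items() if not check(record)]

# Illustrative rules for a telemetry record (hypothetical fields).
rules = {
    "has_device_id": lambda r: bool(r.get("device_id")),
    "cpu_in_range": lambda r: 0 <= r.get("cpu_pct", -1) <= 100,
    "latency_non_negative": lambda r: r.get("latency_ms", -1) >= 0,
}

good = {"device_id": "rtr-01", "cpu_pct": 42, "latency_ms": 3.1}
bad = {"device_id": "", "cpu_pct": 142, "latency_ms": 3.1}
```

Records that pass every rule flow on to analytics; violations can be quarantined or logged per rule name, which is what makes the metrics downstream trustworthy.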

Education

UVCE
Bangalore, Karnataka, India

B.Tech

Information Science and Engineering

Grade: CGPA 9.04

Courses

Data Structures

Algorithms

Database Solutions

Software Development Principles

Cloud-based Data Integration

Certificates

Databricks Certified Data Engineer Associate

Issued By

Databricks Academy

Skills

Programming Languages

SQL (Data Transformation, Tuning, Optimization), Python (Data Validation, Scripting), Spark (PySpark), Scala.

Databases

Microsoft SQL Server, Azure SQL Database, Azure Data Lake Storage (ADLS Gen2).

Cloud & Tools

Azure Data Factory (ETL/ELT, Parameterized Pipelines, Scheduling), Azure Databricks (Delta Lake, Notebooks), Azure Synapse Analytics, Power BI.

Version Control & CI/CD

GitHub (Branching, CI/CD Integration), GitHub Actions.

Methodologies

Agile, Software Development Principles, Fact/Dimension Modeling.

Projects

Commercial Data Product - ELANCO

Summary

Designed and implemented an end-to-end automated data platform, integrating Salesforce CRM and Azure Data Lake to support critical business operations.

Talent Acquisition Agent - Generative AI

Summary

Led a cross-functional initiative to design an AI-powered Talent Search Agent, leveraging Large Language Models (LLMs) to automate and enhance talent acquisition processes.