Avinash M

Emerging Data Engineer
Chennai, IN.

About

Emerging Data Engineer with 10 months of hands-on experience and an MBA in Business Analytics, specializing in designing and optimizing ETL pipelines, API integrations, and Lakehouse architectures using Python, SQL, PySpark, Databricks, and AWS. Delivers clean, analytics-ready data and builds scalable solutions that improve efficiency and support data-driven decision-making.

Work

Freightify Pvt Ltd

Data Integration Analyst - Intern

Chennai, Tamil Nadu, India

Summary

Orchestrated end-to-end ETL pipelines and designed scalable Lakehouse architectures that supported data-driven decisions and operational efficiency at a logistics tech company.

Highlights

Developed 4+ end-to-end ETL pipelines in Python and PySpark that extract data from Zoho, HubSpot, and Freshdesk, apply business logic and SCD Type 1 updates, and load into Databricks Delta Lake, orchestrated with Databricks Workflows.

Designed and deployed scalable Lakehouse architectures following the Medallion Architecture (Bronze-Silver-Gold layers), ensuring clean, reusable, and business-ready datasets for cross-department decision-making.

Reduced project setup and tracking effort by 30%+ through custom JIRA automation and HubSpot-triggered project provisioning.

Automated data load validation with email alerts and database logging, improving proactive monitoring and debugging efficiency.

Implemented robust error handling, retry mechanisms, and advanced monitoring for API workflows, improving pipeline resilience, reducing debugging time, and minimizing downtime.

Developed a modular, production-ready Python automation framework leveraging OOP principles to integrate multiple APIs (Zoho, Databricks, HubSpot, GCP, JIRA), supporting seamless data exchange and workflow automation.

Trivent system Ltd

Market Researcher - Intern

Chennai, Tamil Nadu, India

Summary

Conducted market research and enhanced lead generation processes by integrating Python and performing SEO analysis for client data.

Highlights

Used the CRM to manage and update client data, improving data accuracy and accessibility for client insights.

Conducted in-depth research on client data to identify market trends, supporting strategic decision-making and business development initiatives.

Integrated Python into the lead generation process, performed SEO website audits, and conducted competitor analysis to improve market intelligence and outreach effectiveness.

Volunteer

Entrepreneurship Cell 'IVY', Department of Management Sciences, Velammal Engineering College

Lead, Technical Team

Summary

Led the technical team for the Entrepreneurship Cell 'IVY', contributing to project development and technical strategy within the department.

GG CONSULTANCY, Department of Management Sciences, Velammal Engineering College

Team Member

Summary

Contributed as a team member to GG CONSULTANCY, engaging in consulting projects and collaborative problem-solving within the department.

Department of Management Sciences, Velammal Engineering College

Participant, Five Days Profit Challenge

Summary

Participated in the Five Days Profit Challenge, gaining practical experience in business strategy and financial analysis.

Education

Velammal Engineering College
Chennai, Tamil Nadu, India

MBA

Business Analytics

Patrician College of Arts and Science
Chennai, Tamil Nadu, India

B.Com

General

Certificates

Business Analysis

Issued By

Microsoft & LinkedIn

Power BI

Issued By

Infosys SpringBoard

MySQL

Issued By

Udemy

Tableau

Issued By

Infosys SpringBoard

Skills

Programming & Scripting

Python, SQL, PySpark.

Data Engineering & Orchestration

Apache Spark, Apache Airflow, Databricks Workflows, ETL Development, Data Ingestion & Transformation, Pipeline Optimization.

Cloud & Data Platforms

Databricks, AWS (S3, Glue, Athena, Lambda), BigQuery.

APIs & Automation

REST APIs, API Integration (Zoho, HubSpot, Databricks, GCP, JIRA), Python Automation, Workflow Automation.

Databases

MySQL, Microsoft SQL Server.

Data Architecture

Lakehouse, Medallion Architecture, Delta Lake.

Collaboration & Version Control

Git, GitHub.

Projects

AWS Data Engineering Project: Weather & Flight Analytics Pipeline

Summary

Built a real-time data pipeline integrating weather (OpenWeather API) and flight tracking (OpenSky API) data within a Medallion Architecture (Bronze-Silver-Gold). Ingested JSON via AWS Lambda into S3 and transformed it with Glue and PySpark, applying the Haversine formula for distance calculations, to deliver analytics-ready datasets. Orchestrated hourly data flows with Airflow and queried partitioned data in Athena, using partition pruning for cost efficiency.

Databricks ETL Pipeline with PySpark, Delta Lake, and REST API Integration

Summary

Engineered an end-to-end ETL pipeline using PySpark to ingest and transform data from multiple REST APIs, including Zoho, HubSpot, and Custify. Structured the data in a Bronze-Silver-Gold Lakehouse architecture on Databricks Delta Lake and created final SQL views to support dashboards and accelerate business insights.

Customer Onboarding Implementation

Summary

Implemented a standardized JIRA customer onboarding framework with custom workflows, automated alerts, and API integration. Developed real-time Databricks dashboards to track KPIs such as task aging, and automated project creation from HubSpot deals, improving project tracking and transparency.