Analyzed complex source data in depth to define business logic, column relationships, and data dependencies, ensuring precise data mapping and robust pipeline development.
Engineered and deployed end-to-end ELT pipelines using AWS Managed Airflow, Glue, S3, and dbt Core, processing over 2 million records daily for efficient data ingestion and cost-effective transformation into Amazon Redshift.
Optimized data loading by implementing incremental logic in AWS Glue based on CDC timestamps, reducing compute costs by 30%, and cut processing time by 66% with incremental dbt models.
Automated data pipeline orchestration with Airflow, designing and implementing scheduling that reduced manual intervention by 80% and streamlined workflows.
Constructed comprehensive metadata tables to monitor pipeline health, facilitate rapid failure recovery, and ensure end-to-end data integrity.
Mentored 40 junior engineers in SQL and AWS cloud fundamentals, enhancing foundational skills through structured training, interactive exercises, and rigorous evaluations.