Developed 4+ end-to-end ETL pipelines in Python and PySpark that extract data from Zoho, HubSpot, and Freshdesk, apply business logic and SCD Type 1 updates, and load the results into Databricks Delta Lake, orchestrated with Databricks Workflows.
Designed and deployed scalable Lakehouse architectures following the Medallion Architecture (Bronze, Silver, and Gold layers), delivering clean, reusable, business-ready datasets for cross-department decision-making.
Reduced project setup and tracking effort by 30%+ through custom JIRA automation and HubSpot-triggered project provisioning.
Automated data load validation with email alerts and database logging, enabling proactive monitoring and faster debugging.
Implemented robust error handling, retry mechanisms, and advanced monitoring for API workflows, improving pipeline resilience, reducing debugging time, and minimizing downtime.
Developed a modular, production-ready Python automation framework leveraging OOP principles to integrate multiple APIs (Zoho, Databricks, HubSpot, GCP, JIRA), supporting seamless data exchange and workflow automation.