Muhammad Zaid

Senior Azure Data Engineer | AWS Certified Cloud Practitioner | AWS Certified Solutions Architect - Associate

Delhi, India

About

Big Data Engineer with 3.3 years of experience designing, implementing, and optimizing ETL processes and cloud-based data platforms. Uses Python, SQL, Apache Spark, and Azure/AWS services to build scalable data pipelines, improve data quality, and raise reporting performance and cost efficiency. Collaborates with cross-functional teams to deliver data solutions that have reduced project costs by up to 20%.

Work

HCLTech

Senior Software Engineer

Noida, Uttar Pradesh, India

Nov 2024 → Present

Summary

Led an enterprise data migration to Azure and optimized scalable data pipelines, improving analytics and reporting for cloud-based solutions.

Highlights

Spearheaded the migration of enterprise data from an on-premises Oracle database to Azure Data Lake Storage Gen2, establishing a secure and scalable foundation for cloud-based analytics.

Engineered and optimized high-performance data pipelines with Apache Spark, achieving a 40% reduction in data latency across large distributed datasets.

Developed and fine-tuned over 30 complex SQL queries and stored procedures, boosting ETL efficiency and reporting performance by 35%.

Implemented comprehensive data validation, schema checks, and unit testing frameworks, ensuring end-to-end pipeline reliability and enhancing data quality assurance.

Integrated diverse Azure services, including Data Factory, Data Lake Storage Gen2, and Databricks, to establish secure, code-driven data movement and transformation workflows.

Wipro

Data Engineer

Greater Noida, Uttar Pradesh, India

Apr 2022 → Nov 2024

Summary

Facilitated data migration POCs to Azure and implemented PySpark optimizations to enhance data processing efficiency and cluster utilization.

Highlights

Facilitated critical proofs of concept (POCs) for Azure data migration, achieving 100% data accuracy across ingestion, transformation, and storage processes.

Drove 100% consistency across a 4-member team during POCs focused on developing robust data workflows within Azure Data Factory.

Implemented PySpark optimizations, including partitioning, broadcast joins, and caching, reducing job runtimes and improving cluster utilization.

Executed comprehensive performance tuning on Spark jobs and SQL queries, resulting in a 30-40% improvement in data processing times during critical POC validation cycles.

Education

Dr Virendra Swaroop Institute of Computer Studies

Jan 2019 → Jan 2022

Bachelor

Computer Applications

Grade: 72%

Languages

English

Certificates

AWS Certified Cloud Practitioner

Issued By

Amazon Web Services (AWS)

AWS Certified Solutions Architect - Associate

Issued By

Amazon Web Services (AWS)

Skills

Cloud Platforms

Microsoft Azure (Azure Data Factory, Azure Databricks, Azure SQL Database, ADLS Gen2), Amazon Web Services (EC2, S3, Lambda, Glue), Delta Lake.

Big Data Technologies

Apache Spark, PySpark, Hadoop Ecosystem, Apache Hive, Apache Kafka.

Programming & Scripting

Python, SQL (Advanced Queries).

Relational Databases

Oracle, PostgreSQL.

Data Visualization & Reporting

Microsoft Excel (Data Analysis, Pivot Tables, Formulas, Charts), Power BI, Tableau.