Shivam Rajput

Data Engineer

Bengaluru, IN

About

Data Engineer with a Bachelor of Technology from IIT (BHU) Varanasi, specializing in building and optimizing scalable data pipelines and distributed data lake architectures. Uses AWS services, Apache technologies, and Python to deliver measurable efficiency gains, including a 30% reduction in data migration time between Amazon S3 buckets. Strong problem-solving record: top 0.5% rank in JEE Advanced 2020 and 2nd place in the Accordion Gen AI Hackathon 2024 for an AI-powered staffing recommendation system.

Work

Accordion

Data Engineer

Bengaluru, Karnataka, India

Jun 2024

→

May 2025

Summary

Engineered and maintained ETL pipelines on AWS covering data ingestion, transformation, and loading.

Highlights

Developed a Python script leveraging boto3 and concurrent.futures to optimize data migration between Amazon S3 buckets, achieving a 30% reduction in transfer time.

Created and optimized stored procedures in Amazon Redshift for complex data operations, significantly enhancing performance and scalability.

Maintained ETL pipelines built on AWS Glue, Amazon Redshift, and Amazon S3.

Physics Wallah

Data Engineer

Bengaluru, Karnataka, India

May 2025

→

Present

Summary

Currently leads the development and optimization of robust data pipelines and data lake architecture for scalable and performant analytics.

Highlights

Built and scheduled data pipelines using Apache Airflow to ingest data from Google Sheets, REST APIs, and MongoDB into Trino tables, ensuring reliable data availability.

Implemented Debezium and Kafka for real-time change data capture from MongoDB collections, centralizing data into the core platform.

Contributed to an in-house data lake architecture built on Apache Iceberg, Amazon S3, and Trino, optimizing query performance for large-scale analytics.

Supported data transformation workflows using Apache Spark for efficient batch processing within a distributed data lake environment.

Education

Indian Institute of Technology (BHU), Varanasi

Varanasi, Uttar Pradesh, India

Nov 2020

→

May 2024

Bachelor of Technology

Technology

Awards

2nd Place, Accordion Gen AI Hackathon 2024

Nov 2024

Awarded By

Accordion

Awarded for developing an innovative AI-powered staffing recommendation system that leveraged NLP and embedding-based similarity search.

Codeforces Expert (Max rating 1727, Global Rank 675)

Mar 2024

Awarded By

Codeforces

Achieved Codeforces Expert status with a max rating of 1727, placed 675th globally in Codeforces Round 927, and solved 350+ data structures and algorithms problems, showcasing advanced problem-solving and algorithmic skills.

Top 0.5% Rank, JEE Advanced 2020

Jan 2020

Awarded By

JEE Advanced

Achieved a top 0.5% ranking among over 1 million candidates in the highly competitive JEE Advanced 2020 examination, demonstrating exceptional aptitude in science and engineering.

Languages

English

Skills

Programming Languages

Python, SQL, C++.

Big Data & Data Engineering

Apache Airflow, Apache Spark, Kafka, Apache Iceberg, Debezium, Trino, ETL, Data Pipelines, Data Lake, Distributed Systems, Data Warehousing.

Cloud Platforms & Databases

AWS Glue, Amazon Redshift, Amazon S3, boto3, MySQL, FAISS.

Web Frameworks & DevOps

Django, Flask, Streamlit, Docker, RabbitMQ.

Artificial Intelligence & Machine Learning

NLP, Embedding-based Similarity Search, AI-powered Recommendation Systems.

Projects

Staffing Assistant (Gen AI Hackathon Project)

Nov 2024

→

Nov 2024

Summary

Developed an AI-powered staffing recommendation system leveraging NLP and embedding-based similarity search, securing 2nd place in the Gen AI Hackathon.

Product Management System

Aug 2024

→

Sep 2024

Summary

Developed a microservices-based system for product management and user interactions, focusing on efficient, decoupled services.