B. Kishore
Data Engineer
About
Results-driven Data Engineer with 1 year and 7 months of hands-on experience building, optimizing, and automating robust data pipelines. Leverages big data frameworks, ETL tools, and cloud technologies, including AWS and Snowflake, to design and implement efficient, reliable data migration and automation strategies. Proven ability to improve data delivery, reduce manual effort, and safeguard data integrity for critical business operations.
Work
Summary
Currently serving as a Data Operations Analyst, responsible for designing, maintaining, and automating large-scale data pipelines and managing ETL workflows to ensure timely, accurate data delivery.
Highlights
Engineered and automated robust, large-scale data pipelines using Spark, PySpark, Informatica, and AWS Glue, ensuring high data throughput and reliability.
Oversaw end-to-end ETL workflows, delivering timely, accurate data to critical business operations.
Used SQL and Python for data extraction, cleansing, and validation, orchestrating complex workflows with Apache Airflow and AWS Step Functions (see the sketch following this list).
Facilitated seamless data migrations, constructing scalable S3-based data lakes and integrating diverse data sources with Snowflake for advanced analytics.
Spearheaded advanced monitoring and automation solutions, reducing manual operational tasks by 40% and improving system efficiency.
Collaborated cross-functionally with stakeholders to deliver actionable data insights and established comprehensive documentation for best practices, improving team efficiency and knowledge transfer.
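A minimal, illustrative sketch of the Airflow-plus-Glue orchestration pattern referenced above. Every identifier here (DAG id, Glue job name, schedule, task names) is a hypothetical placeholder, not a detail taken from the role itself.

```python
# Illustrative Airflow DAG: runs a daily Glue cleansing job, then a
# Python validation step. All names below are hypothetical placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator


def validate_row_counts(**context):
    # Placeholder: a real pipeline would compare source and target row
    # counts (e.g. via SQL queries) and raise on mismatch, letting
    # Airflow's retry and alerting hooks take over.
    pass


with DAG(
    dag_id="daily_telecom_ingest",          # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    cleanse = GlueJobOperator(
        task_id="run_glue_cleanse",
        job_name="telecom-cleanse-job",     # hypothetical Glue job name
    )
    validate = PythonOperator(
        task_id="validate_row_counts",
        python_callable=validate_row_counts,
    )
    cleanse >> validate
```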
Summary
Served as a Data Engineer on a telecom-domain project, developing and optimizing pipelines for high-volume cloud data ingestion, transformation, and loading.
Highlights
Developed and optimized robust data pipelines for efficient ingestion, transformation, and loading of high-volume telecom data into cloud environments and Snowflake.
Architected and implemented high-performance ETL workflows using PySpark and AWS Glue.
Directed critical data migration efforts to AWS S3 and Snowflake.
Automated daily data loads and comprehensive health checks with Apache Airflow, significantly improving the reliability and timeliness of KPI delivery.
Partnered with client teams to define stringent data quality rules and executed thorough data validation in SQL and Python, ensuring data integrity (sketched below).
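A minimal PySpark sketch of the validation pattern described above: enforce simple data quality rules before writing clean telecom records back to the lake. The S3 paths, column names, and 1% reject threshold are illustrative assumptions, not project specifics.

```python
# Illustrative data quality gate for a telecom ingestion pipeline.
# Paths, columns, and the threshold are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("telecom_quality_check").getOrCreate()

raw = spark.read.parquet("s3://example-bucket/raw/telecom/")

# Example rules: subscriber id must be present, usage must be non-negative.
valid = raw.filter(F.col("subscriber_id").isNotNull() & (F.col("usage_mb") >= 0))
rejected = raw.subtract(valid)

# Fail the run loudly if too many rows violate the rules.
reject_rate = rejected.count() / max(raw.count(), 1)
if reject_rate > 0.01:
    raise ValueError(f"Reject rate {reject_rate:.2%} exceeds 1% threshold")

valid.write.mode("overwrite").parquet("s3://example-bucket/clean/telecom/")
```

Failing fast at this step keeps bad records out of Snowflake, so downstream KPI loads only ever see validated data.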
Skills
Programming/Querying
SQL, Python, PySpark.
Big Data
Apache Spark, Hadoop.
ETL Tools
Informatica, AWS Glue.
Orchestration
Apache Airflow, AWS Step Functions.
Cloud Platforms
AWS (Lambda, S3, Glue, Step Functions), Snowflake.
Development Tools
Google Colab, Git, JIRA.
Core Areas
Data Pipelines, Data Migration, Data Processing, Automation.