SARANG DEB SAHA

Data Scientist | Machine Learning Engineer
Bengaluru, India

About

Data Scientist with expertise in deep learning, computer vision, and advanced analytics. Experienced in developing and deploying production-grade machine learning solutions and optimizing data pipelines across financial technology, sports analytics, and road safety. Seeking to apply strong technical skills and leadership experience to complex data challenges and product development.

Work

Bureau Inc.

Analyst - Analytics

Summary

Collaborated with Data Engineering and Data Science teams to design scalable dashboards and automate data pipelines.

Highlights

Collaborated with Data Engineering and Data Science teams to design scalable Apache Superset dashboards and automate data pipelines using Python, SQL, and Airbyte, optimizing task scheduling with DAGs.

Rally Vision

Data Scientist

Summary

Developed and fine-tuned deep learning models for real-time sports ball tracking.

Highlights

Developed and fine-tuned deep learning models (TrackNet) for real-time squash ball tracking using Python, TensorFlow, PyTorch, and OpenCV.

Generated comprehensive datasets and processed match footage with FFmpeg, yielding key insights into player behavior and match dynamics.

Deployed models in Docker environments and collaborated with cross-functional teams to integrate tracking data into broadcasting systems, significantly enhancing viewer engagement.

Bureau Inc.

Data Scientist

Summary

Led the end-to-end development of a flagship production tool for financial technology.

Highlights

Led end-to-end development of 'FinSpector', a flagship production tool for bank statement parsing, transaction categorization, and mule detection, positioning it among a select few platforms in India.

Engineered a robust PDF and OCR pipeline leveraging regex and NLP to achieve 97%+ accuracy in transaction extraction and classification, deploying a rule-based fraud engine to identify suspicious patterns.

Enabled clients to perform real-time creditworthiness assessment and fraud detection directly from uploaded statements, reducing manual underwriting time by 60%.

Positioned the tool for enterprise rollout with SaaS pricing, forecasting significant revenue opportunities through B2B lending partnerships.

CiSTUP, IISc (Indian Institute of Science)

Project Intern

Summary

Conducted a comparative study on road safety datasets and implemented computer vision models for traffic violation detection.

Highlights

Conducted a comparative study of road safety datasets and blackspot definitions across India, the UK, the US, and France, identifying critical feature gaps and inconsistencies in Indian data.

Implemented YOLO-based computer vision models with centroid tracking to detect traffic violations including triple riding, helmet-less riding, and wrong-way driving.

Developed a web scraping tool using BeautifulSoup4 to extract and structure FIR data from Karnataka (post-2016) via bounding box-based parsing.

Performed spatio-temporal analysis of violations across 50 Bengaluru traffic junctions, integrating IUDX, BTP, and meteorological datasets to identify violation trends and build predictive models.

Quidich Innovation Labs

Data Science Intern

Summary

Contributed to the development of a recommendation engine and improved real-time player tracking solutions.

Highlights

Developed a recommendation engine to automate storyline generation for cricket commentators.

Tested and benchmarked the company's real-time player tracker solution (QT and QStat), improving efficiency by 20%.

Contributed to the development of new YOLO-based object detection models for cricket ball tracking, leveraging highlight videos for real-time detection and tracking in QT (Quidich Tracker).

Education

The LNM Institute of Information Technology

B.Tech

Computer Science and Engineering

CGPA: 7.5

Activities

Served as Chairman of the ACM LNMIIT Student Chapter.

Served as the Coordinator of the Gender Sensitization and Equality Council.

Elected Senator of the Students' Gymkhana, LNMIIT, for 2022-23 (the student council body).

Delhi Public School

AISSCE

Physics, Chemistry, Maths, English

Percentage: 96%

Awards

Research Paper Acceptance: Exploring Anomaly Detection Techniques for Crime Detection

Awarded By

ICRTCIS 2024 / Springer Book Series

Paper accepted for presentation at the ICRTCIS 2024 conference and published in the Springer Book Series 'Algorithms of Intelligent Systems'.

Research Paper: Cognizance of the Premier League (Under Peer Review)

Awarded By

IJSMM by Inderscience

Paper on football analytics currently undergoing peer review for publication in a Q2 journal.

Chanakya UG Fellowship

Awarded By

iHUB DivyaSampark, IIT Roorkee

Awarded the fellowship as the only team shortlisted from Jaipur.

TATA Crucible Campus Quiz 2022 Finalist (Rajasthan Zone)

Awarded By

TATA Crucible

Placed in the top 6 of 20,000+ students in the Rajasthan zone finals and 1st in the Cluster wildcard round.

Publications

Exploring Anomaly Detection Techniques for Crime Detection

Published by

Springer Book Series 'Algorithms of Intelligent Systems'

Summary

Research on detecting anomalous events with criminal intent using deep learning, specifically Convolutional Neural Networks.

Cognizance of the Premier League: An In-Depth Exploration of Team Performance, Player Transfers, Referee Dynamics, and Player Position Prediction for Scouting

Published by

IJSMM by Inderscience (under peer review)

Summary

Exploratory data analysis and predictive modeling in football analytics, covering Premier League team performance, player transfers, referee dynamics, and player position prediction for scouting.

Languages

English

Fluent

Hindi

Native

Skills

Programming Languages

Python, C++, C.

Libraries & Frameworks

PyTorch, TensorFlow, OpenCV, Scikit-learn, Matplotlib, Seaborn, spaCy, Apache Superset, Airbyte.

Tools & Platforms

Docker, AWS, GCP, Tableau, CleverTap, FFmpeg, BeautifulSoup4, HTML5, CSS, JavaScript, PHP, Unreal Engine.

Databases

MySQL.

Machine Learning

Deep Learning, Computer Vision, Predictive Modeling, Anomaly Detection, Recommendation Systems, Regression Models, YOLO, TrackNet, CNNs (VGG19, DenseNet121, ResNet50, MobileNetV2).

Data Analysis & Engineering

Exploratory Data Analysis (EDA), Data Pipelines, SQL, Data Scraping, Video Analytics, Spatio-temporal Analysis, Dataset Creation.

Projects

Cognizance of the Premier League

Summary

Exploratory data analysis and predictive modeling in football analytics, focusing on Premier League teams and the transfer market.

VaahanFlow

Summary

Developed a real-time traffic density estimation system using YOLOv8 for urban traffic management.

Evidence vs Eminence

Summary

Explored the IPL auction market using statistical analysis and machine learning models to uncover insights into player valuation and predict auction prices.