Syed Ayaan Ahmed

Data Scientist
Dallas, US

About

Data Scientist with a foundation in fraud detection, risk analytics, and secure ML systems. Boosted BERT classification accuracy by 18% and cut ML pipeline latency by 70%. Experienced in A/B testing, campaign evaluation, and dashboard development that support data-driven policy decisions and operational risk reduction across industries.

Work

VibeSea (AI Hiring Startup)

Data Science Intern

Remote, US

Summary

As a Data Science Intern, I applied ML techniques to improve classification accuracy and optimize data pipelines for an AI hiring startup, improving operational efficiency and data visibility.

Highlights

Boosted BERT classification accuracy by 18% across 50K+ job listings and 30+ categories, enhancing job matching precision.

Cut ML pipeline latency by 70% via DAG refactoring and SQL caching optimizations, significantly improving data processing speed.

Reconciled data mismatches between source data and Tableau dashboards, ensuring key metrics were reported accurately and reliably.

Built self-serve Tableau reports for non-technical users, tracking pipeline and conversion KPIs to empower data-driven decisions.

Muffakham Jah College

Research Assistant – ML Project

Hyderabad, Telangana, India

Summary

As a Research Assistant, I developed a high-accuracy ensemble-based ML model for crop recommendation and co-authored an IEEE paper, contributing to agricultural innovation and farmer support.

Highlights

Developed an ensemble-based crop recommendation model achieving 98.9% accuracy using 300K+ records, enhancing agricultural productivity.

Co-authored an IEEE paper and deployed a Flask-based cloud tool for real-time crop recommendations to farmers, translating research into a practical application.

Socialtek AI (AI Product Team in Healthcare, Finance)

Data Scientist - NLP & GenAI Intern

Bengaluru, Karnataka, India

Summary

As a Data Scientist - NLP & GenAI Intern, I developed and deployed AI-powered solutions, including RAG chatbots and predictive features, to improve financial and healthcare query resolution and user engagement for an AI product team.

Highlights

Deployed a RAG-powered chatbot, resolving 65% of domain-specific finance/health queries and improving user support efficiency.

Conducted A/B testing to optimize preventive campaign reach and reduce user drop-off, enhancing user engagement and retention.

Designed predictive features for fraud and eligibility classification dashboards, improving system accuracy and security.

Collaborated with non-technical stakeholders to translate complex predictive insights into actionable outreach strategies, bridging technical and business objectives.

Education

The University of Texas at Dallas
Dallas, Texas, United States of America

MS

Business Analytics and Artificial Intelligence

Courses

ML

NLP

Deep Learning

Time Series

Big Data Systems

Prescriptive Analytics

Predictive Analytics

Osmania University
Hyderabad, Telangana, India

BS

Electronics and Communication

Certificates

IBM Data Science

Issued By

IBM

IBM Data Engineering

Issued By

IBM

Hugging Face NLP Bootcamp

Issued By

Hugging Face

Skills

Languages

Python, SQL, R, SAS, Bash, C++.

ML & Risk Modeling

scikit-learn, XGBoost, BERT, A/B Testing, Anomaly Detection, Time Series.

Tools & Infra

Tableau, Power BI, MLflow, Airflow, FastAPI, GitHub Actions.

Data & Cloud

BigQuery, Redshift, AWS (S3, Lambda), Azure (ADF, Blob), Hadoop.

NLP & GenAI

Hugging Face, Transformers, RAG, FAISS, LangChain.

Risk Practices

Fraud detection, Conversion funnel optimization, Lifecycle analytics, Compliance dashboards.

Projects

InsightGPT - Private GenAI Business Analyst

Summary

Developed InsightGPT, a privacy-first GenAI agent utilizing TinyLlama, FAISS, and Streamlit to analyze complex business data and generate actionable insights.

ClearCare Data Initiative – Healthcare Transparency

Summary

Led the ClearCare Data Initiative to standardize hospital billing data for CMS compliance and enhance healthcare transparency.

AskUTD Chatbot – GPT-Based University Info Assistant

Summary

Engineered AskUTD Chatbot, a multilingual assistant built on GPT and FAISS, to automate university information queries and improve student support.

H1B Insights - Sponsorship Prediction & Analytics Dashboard

Summary

Developed H1B Insights, a sponsorship prediction and analytics dashboard that provides strategic insights for international students.