Guillaume Rakotonjanahary TSANTANIAINA

Data Scientist / AI Engineer
IVK 82 TER A, 102, Madagascar, MG.

About

Fullstack Data & AI Engineer with over 4 years of experience, specializing in architecting and deploying scalable AI and fullstack solutions on Google Cloud Platform and AWS. Proven expertise in developing complex systems, including web applications, data/ML pipelines, conversational AI, computer vision, and RAG architectures, for international clients in healthcare, fintech, and e-commerce. Eager to apply deep technical skills and a performance-oriented approach to challenging large-scale projects.

Work

ETech Consultant

Data Scientist & AI Fullstack Engineer

Madagascar

Summary

Led end-to-end development of AI engineering projects, from architecture to implementation, including advanced RAG systems, predictive analytics, and computer vision solutions.

Highlights

Designed and implemented a Hybrid RAG Architecture combining vector search and knowledge graphs (Neo4j, Pinecone, Elasticsearch) for a thesis project, including an extraction pipeline for cloud documents and images with adaptive chunking.

Developed an Automated BI Assistant with predictive analysis capabilities, leveraging LangGraph, Vertex AI, and BigQuery ML, with fine-tuning for specific SQL syntax to generate analytics dashboards with automatic visualizations.

Engineered an end-to-end CV Recommendation Chatbot with ML matching, utilizing NER for skills extraction and NLP for semantic analysis, delivering a React interface for recruiters with advanced filters.

Built a No-Code Multimodal E-commerce Chatbot (Image + Text) with configurable vision/text modes, implementing CLIP for image search and optimizing response time to under 3 seconds.

Developed a Computer Vision System for Retail Inventory Counting using YOLOv8 with TensorRT optimization, enabling real-time multi-class detection and counting for supermarkets, exposed through a REST API.

CHISU USAID Program Consultant

Data Engineer & Systems Developer

Madagascar

Summary

Provided expert data engineering and systems development for USAID programs, focusing on public health data infrastructure and analytical tools.

Highlights

Architected and developed an automatic bulletin module, ingesting multi-source data (SQL Server, DHIS2 API) and enabling visualization of health KPIs across various dimensions.

Designed and implemented an NL-to-SQL Chatbot for DHIS2 queries and malaria analysis, leveraging LangChain and fine-tuned GPT-3.5 Turbo for SQL generation, integrated with a REST API and Chart.js for epidemiological bulletins.

Configured and deployed DHIS2 infrastructure for the National Nutrition Office (ONN), establishing dev/staging/prod environments, customizing data collection modules, and optimizing performance with Redis cache for key indicators (WASH, PSERAN, PARN).

eTech CDI

Data Scientist & Fullstack AI Engineer

Madagascar

Summary

Led the architecture, development, and deployment of advanced AI and data solutions, optimizing biometric data analysis and generating actionable insights for clients.

Highlights

Architected and implemented an OLTP-to-OLAP Data Warehouse for biometric enrollment KPIs, migrating from PostgreSQL to a BigQuery star schema, enabling comprehensive KPI visualization with Looker Studio and predictive ML for forecasting and anomaly detection.

Developed and deployed a Real-Time Biometric Analysis Platform using CDC replication via Airbyte, delivering end-to-end data flow from PostgreSQL to BigQuery and Looker Studio for real-time KPI monitoring.

Designed and implemented a lead generation system based on multi-agent orchestration, using Google ADK and LLMs and integrating multi-source data analysis to identify pattern-based investment signals for competitive benchmarking.

Engineered a Text-to-SQL RAG Chatbot for BigQuery queries, utilizing LangGraph and fine-tuned SQLCoder-7B-2, delivering a Python FastAPI backend and React TypeScript frontend for multi-query support with persistent context.

Developed a Multilingual Airport Conversational Assistant with a microservices architecture on GKE, integrating WhatsApp Business API and multilingual NLP (translation, sentiment analysis) to provide operational support for staff and passengers.

PMI Measure Malaria USAID Consultant

Data & Fullstack Engineer

Madagascar

Summary

Delivered critical data engineering and fullstack development for public health initiatives, including geolocation, vaccine management, and data quality control systems.

Highlights

Developed a Healthcare Geolocation and Vaccine Management platform with a React Native mobile application (offline-first) and Python FastAPI REST API, optimizing routing algorithms and enabling real-time vaccine stock tracking and interactive map visualizations.

Implemented a SARGEC Digital Library with an Intelligent Search Engine, developing a document management system with NLP for semantic indexing, automatic categorization, and OCR for physical document digitization.

Architected an Automated BI Assistant for Health Data Analysis, integrating DHIS2 and external sources to develop automated reports and interactive visualizations with a Time Series Forecasting Model (ARIMA) for decision-makers.

Designed and developed a COVID-19 Data Anomaly Automatic Rectification Pipeline within DHIS2, implementing ML algorithms (Isolation Forest, DBSCAN) to automatically detect and rectify duplicates, missing values, and inconsistencies, improving data quality.

Education

IT University & ESTIA France
Bidart, Nouvelle-Aquitaine, France

Master of Science (MSc)

Big Data and Artificial Intelligence

IT University
Antananarivo, Analamanga, Madagascar

Bachelor's Degree

Computer Science, specialization in Development

Sainte Famille Mahamasina
Antananarivo, Analamanga, Madagascar

Baccalaureate

Scientific Baccalaureate, Series C

Awards

3rd place - Hackathon INDABAX Madagascar 2023

Awarded By

INDABAX Madagascar

Awarded 3rd place for developing an ML multi-class classification model (10 classes) with 94% accuracy, featuring a complete NLP pipeline (preprocessing, feature engineering, model selection) using Python, XGBoost, CatBoost, TF-IDF, Word2Vec, spaCy, and scikit-learn.

Languages

English
French
Malagasy

Certificates

Advanced Level C1 EF-SET Certified

Issued By

EF Education First

Python 3 Certification, Coding Game

Issued By

Coding Game

English for Business and Entrepreneurship

Issued By

HP Life

Presenting Data, by HP Life

Issued By

HP Life

ITTI Spoken English Course

Issued By

ITTI

Skills

Cloud Computing

Google Cloud Platform, Amazon Web Services.

Fullstack Development

React TypeScript, React Native Mobile, Python, FastAPI, REST/SOAP API, HTTP Protocol, Unit Testing, pytest, React Testing Library, Postman, Swagger, GoF Design Patterns, MVC Design Pattern, Microservices, Serverless Architecture, Event-Driven Architecture, Pub/Sub Architecture.

Algorithms & Optimization

Graph Algorithms, Dijkstra, A*, Kruskal, BFS, DFS, Constraint Programming, OR-Tools, MiniZinc, Convex Optimization, Gradient Descent, Newton Method, L-BFGS, CVXPY, SciPy.optimize, Heuristic Algorithms, Metaheuristic Algorithms, Genetic Algorithms (GA), Particle Swarm Optimization (PSO), Bin Packing.

Machine Learning & Deep Learning Model Creation

Computer Vision, YOLO, CNN, TensorFlow, OpenCV, CLIP, Regression, Linear Regression, Ridge, Lasso, SVR, scikit-learn, Classification, Random Forest, SVM, XGBoost, Logistic Regression, Fraud Detection, Clustering, Unsupervised Learning, PCA, KMeans, DBSCAN, Agglomerative Clustering, KNN, BigQuery ML, Spark ML, Time Series Analysis, ARIMA, Prophet, LSTM, Holt-Winters, Deep Reinforcement Learning, Q-Learning, PPO, MLOps, Model Deployment, Amazon SageMaker, Google AutoML, Python FastAPI, Deep Learning, MLP, GNN, RNN, GAN, Transformers.

Automation, Agent Development and Model Utilization

RAG (Retrieval-Augmented Generation), Vector Search, Graph Search, Hybrid RAG, Agentic RAG, Multimodal RAG, OCR, NLP, Google Gemini, Amazon Textract, Lambda, MCP, ADK, LangGraph, n8n, A2A, ACP, LangChain, LlamaIndex, Langflow, Google Dialogflow, Model Fine-tuning, Hugging Face, Speech to Text (Whisper), Text to Speech (TTS), LLM/Embedding models, Amazon Bedrock, Google Vertex AI, GPT, Claude, LLaMA.

Data Processing & Data Science

Complex Data Analysis, Data Cleaning, Feature Preparation, Pandas, NumPy, SciPy, Statsmodels.

Data Engineering and Big Data

ETL, ELT pipeline design, Large Data Volumes, Distributed Processing, Apache Spark, Apache Beam, Google Dataflow, AWS Glue, Google Dataproc, Workflow Orchestration, Apache Airflow, AWS MWAA, Google Cloud Composer, Web Scraping, Selenium, Cypress.

Database Management Systems

Relational Databases, NoSQL Databases, Graph Databases, Data Warehouse Databases, PostgreSQL, MongoDB, Neo4j, Google BigQuery, Amazon Redshift, DynamoDB, Amazon Neptune.

DevOps & Deployment

Infrastructure as Code, Terraform, AWS CloudFormation, CI/CD, GitHub Actions, Jenkins, AWS CodePipeline, Containerization, Docker, Orchestration, AWS ECS, Kubernetes, API Deployment, Model Deployment, EC2, ECS, Lambda, SageMaker, Version Control, GitHub, GitLab.

IT Project Management & Collaboration

Agile Methodologies, Scrum, V-cycle, UML, Functional/Technical Backlog Creation, User Story Mapping, Business Rules, Mockups, Figma, Jira, Google Sheets, Excel, Draw.io, Quotes, Estimation, Planning.