Projects
Various personal, academic and professional projects.
A reference implementation and framework for building open-source data warehouses. It combines data ingestion, transformation, modeling, and reporting into a production-ready workflow. Designed for LLM-based agentic workflows, it embeds domain context and structured documentation directly into the platform core.
A local data warehouse and predictive ML demo featuring orchestration, ingestion, transformation, and interactive dashboards.
Implementation and benchmarking of conformal prediction techniques across regression, classification, and survival analysis tasks.
Engineered a predictive pricing model for billboard advertisements using historical contract data. Implements Conformalized Quantile Regression to provide statistical guarantees on price intervals. Built a spatial feature engineering module using KD-Trees and Gaussian kernel smoothing to capture localized geographical trends from regression residuals.
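The calibration step behind Conformalized Quantile Regression can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes lower and upper quantile predictions already exist for a held-out calibration set, and the function name `cqr_intervals` and its parameters are hypothetical.

```python
import math

def cqr_intervals(cal_lo, cal_hi, cal_y, test_lo, test_hi, alpha=0.1):
    """Widen (or shrink) predicted quantile bands using a calibration set."""
    # Conformity score: how far each true value falls outside its predicted
    # band (negative when the value lies inside the band).
    scores = sorted(max(lo - y, y - hi)
                    for lo, hi, y in zip(cal_lo, cal_hi, cal_y))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        raise ValueError("calibration set too small for this alpha")
    q = scores[rank - 1]
    # Apply the same correction to both ends of every test interval.
    return [(lo - q, hi + q) for lo, hi in zip(test_lo, test_hi)]
```

Under the usual exchangeability assumption, the corrected intervals cover the true value with probability at least `1 - alpha`, which is the kind of statistical guarantee referred to above.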
Designed ELT pipelines and dimensional models to warehouse multi-channel marketing data. Developed Power BI datasets for cross-media performance analysis, following dimensional modeling practices (fact and dimension tables, star schema, slowly changing dimensions).
Designed and deployed over 20 interactive Power BI dashboards enabling media experts to optimize ad spend and clients to track their advertising campaigns.
A collection of six case studies demonstrating model interpretability and explainability methods across tabular, vision, NLP, and graph domains.
Formulation and resolution of a stochastic inventory control problem. Benchmarks Model-Based Dynamic Programming (Value Iteration via Bellman optimality) against Model-Free Reinforcement Learning (REINFORCE Policy Gradient) to compute optimal ordering policies.
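For the model-based side of that comparison, Value Iteration on a toy inventory problem might look like the sketch below. Every number here (demand distribution, price, ordering cost, holding cost, capacity, discount factor) is an illustrative assumption, and unmet demand is assumed to be lost; none of it comes from the project itself.

```python
# Hypothetical demand distribution: P(demand = d) for d in {0, 1, 2}.
DEMAND = {0: 0.3, 1: 0.4, 2: 0.3}

def q_value(s, a, V, price=4.0, cost=2.0, hold=0.5, gamma=0.9):
    """Expected return of ordering `a` units with `s` units in stock."""
    total = 0.0
    for d, p in DEMAND.items():
        sold = min(s + a, d)      # unmet demand is lost (assumption)
        nxt = s + a - sold        # stock carried over to the next period
        reward = price * sold - cost * a - hold * nxt
        total += p * (reward + gamma * V[nxt])
    return total

def value_iteration(cap=5, tol=1e-8):
    """Repeat Bellman optimality backups until the values stop changing."""
    V = [0.0] * (cap + 1)
    while True:
        new_V = [max(q_value(s, a, V) for a in range(cap - s + 1))
                 for s in range(cap + 1)]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy: the order quantity that maximizes the converged Q-values.
    policy = [max(range(cap - s + 1), key=lambda a: q_value(s, a, V))
              for s in range(cap + 1)]
    return V, policy
```

Calling `value_iteration()` returns the value of each stock level and the order quantity chosen in each state. A model-free method such as REINFORCE would have to learn a comparable policy from sampled episodes, without access to `DEMAND`.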
Development and evaluation of NLI models to predict semantic relationships between sentence pairs. Explores encoder fine-tuning (CamemBERT via LoRA), decoder fine-tuning (Llama 3 via LoRA), and In-Context Learning techniques (Chain-of-Thought).
A distributed, privacy-preserving Federated Learning system designed for heart disease diagnosis. This project analyzes the impact of client data heterogeneity (Non-IID distributions) and benchmarks the stability and convergence of the FedAvg and FedProx algorithms.
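The difference between the two algorithms is small enough to sketch. In this simplified illustration, model parameters are plain lists of floats and the function names are hypothetical. FedAvg averages client models weighted by local dataset size. FedProx keeps the same aggregation but adds a proximal term, (mu / 2) * ||w - w_global||^2, to each client's local objective.

```python
def fedavg(client_weights, client_sizes):
    """Server step: average client parameters, weighted by sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
            for k in range(dim)]

def fedprox_local_step(w, grad, w_global, lr=0.1, mu=0.01):
    """Client step: gradient descent on loss + (mu / 2) * ||w - w_global||^2.

    The extra term mu * (w - w_global) pulls the local model back toward the
    global one; setting mu = 0 recovers a plain FedAvg local update.
    """
    return [wi - lr * (gi + mu * (wi - wg))
            for wi, gi, wg in zip(w, grad, w_global)]
```

With Non-IID client data, local models tend to drift apart between rounds. The proximal term limits that drift, which is why the two methods can differ in stability.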
Designed and deployed an end-to-end computer vision pipeline to detect structural anomalies, incorrect placements, and degradation on billboards from audit photos. Automates manual physical quality control processes by classifying over 10,000 inspection images annually.
Implementation and performance analysis of the Non-Local Means algorithm for solving image restoration inverse problems.
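The core idea of Non-Local Means is to replace each sample with a weighted average of samples whose surrounding patches look similar. The sketch below applies it to a 1D signal to keep it short. The image version uses 2D patches and search windows, and the parameter names and defaults here are assumptions.

```python
import math

def nlm_denoise_1d(signal, patch_radius=1, search_radius=5, h=0.5):
    """Denoise a 1D signal by averaging samples with similar neighborhoods."""
    n = len(signal)

    def patch(i):
        # Neighborhood around index i, clamping indices at the borders.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    patches = [patch(i) for i in range(n)]
    out = []
    for i in range(n):
        num = den = 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            # Mean squared difference between the two patches.
            dist2 = sum((a - b) ** 2
                        for a, b in zip(patches[i], patches[j])) / len(patches[i])
            w = math.exp(-dist2 / (h * h))  # similar patches get weight near 1
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

Because the weights depend on patch similarity rather than distance alone, sharp edges are largely preserved: samples on the other side of a step contribute almost nothing to the average.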
Developed time-series forecasting models to predict cryptocurrency price movements. Designed and benchmarked classical econometric models (VARMAX, SARIMAX) against machine learning methods (XGBoost) and deep learning architectures (LSTMs, autoencoders).
An empirical statistical study evaluating the causal impact of head coach replacements on sports team performance. Investigates whether structural improvements occur post-dismissal or if observed performance changes are driven by regression to the mean.
A comprehensive analysis and clean-room implementation of binary classification algorithms, including KNN (with Condensed Nearest Neighbor reduction), SVMs, and ensemble methods (AdaBoost, Random Forests). Includes custom optimizations for handling severe class imbalance.
Built a natural language processing classifier to detect spam within French text datasets using TF-IDF feature extraction and optimized Naive Bayes, Logistic Regression, and SVM models.
Implementations and comparative visualizations of manifold learning algorithms (Isomap, LLE, t-SNE) to map high-dimensional geometric datasets into lower-dimensional representations.
Developed RIMO, an open-source tool for high-throughput data anonymization and GDPR compliance within the LINO-PIMO software ecosystem. Engineered a production-ready Go engine from an initial Python prototype to optimize performance, concurrent processing, and strict type safety.
Built an end-to-end voucher targeting system to re-engage inactive users. Leveraged SQL for large-scale data extraction and Propensity Score Matching (PSM) to control for confounding selection bias, establishing the true causal impact and ROI of retention campaigns on downstream sales.
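The matching step of Propensity Score Matching can be illustrated as follows. This is a minimal sketch: it assumes propensity scores have already been estimated (for example with a logistic regression on user features), it uses 1-nearest-neighbor matching with replacement, and the function name is hypothetical.

```python
def att_nearest_neighbor(scores, treated, outcomes):
    """Average treatment effect on the treated via 1-NN propensity matching."""
    controls = [(s, y) for s, t, y in zip(scores, treated, outcomes) if not t]
    if not controls:
        raise ValueError("no control units to match against")
    effects = []
    for s, t, y in zip(scores, treated, outcomes):
        if t:
            # Closest control unit in propensity score (with replacement).
            _, y_match = min(controls, key=lambda c: abs(c[0] - s))
            effects.append(y - y_match)
    if not effects:
        raise ValueError("no treated units")
    return sum(effects) / len(effects)

# Toy data: two users who received a voucher, three who did not.
scores = [0.8, 0.3, 0.75, 0.35, 0.1]
treated = [1, 1, 0, 0, 0]
outcomes = [10.0, 6.0, 7.0, 5.0, 2.0]
print(att_nearest_neighbor(scores, treated, outcomes))  # 2.0
```

Comparing each treated user only with a control user who was similarly likely to be treated is what removes the selection bias that a raw treated-versus-untreated comparison would carry.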
An interactive analytical web application and predictive tool forecasting vehicle fuel efficiency and emission rates based on structural and engine features.