Description

With a strong background as a Senior Data Engineer, I bring several years of expertise in data and software engineering. Throughout my career, I have developed advanced technical skills combined with proven project management and team leadership abilities. I remain highly motivated and ready to leverage my full range of skills to contribute to the success of your organization.

Areas of expertise

Languages

French
Native or bilingual
English
Full professional proficiency
Arabic
Native or bilingual

Workplace preferences

Open to working on site

Paris (up to 50 km)

Seloger
Senior Data Engineer
REAL ESTATE
February 2025 - February 2026 (1 year)
Paris, France
Context: Aviv Group is a leading European digital real estate company, operating property platforms across multiple countries. Within the Data team, I work on the design and optimization of geographical referential data systems that are critical to powering real estate search platforms across Europe. I also contribute to R&D initiatives exploring multimodal AI for innovative property search experiences.
Key Missions & Achievements:
▷ Geographical Data Platform
- Maintenance and optimization of geographical referential data pipelines supporting real estate platforms across multiple European countries.
- Design, development, and operation of large-scale APIs ensuring high availability, performance, and data consistency across geographies.
- Definition and implementation of data system architecture, enabling robust and scalable multi-country geographical data integration.
- Led stack optimization efforts across infrastructure, data pipelines, and API services.
- Mentoring of junior engineers, systematic code reviews, and promotion of engineering excellence through documentation and standardization.
▷ R&D — Multimodal RAG (Text-to-Image Search)
- Designed and built a multimodal RAG proof of concept enabling users to search for properties by describing their desired apartment style in natural language (layout, decoration, ambiance, etc.).
- The system converts textual descriptions into embeddings and performs similarity search against listing images from the platform’s catalogue to find visually matching properties.
- Built an end-to-end pipeline combining LLM, multimodal embeddings, and vector search over the property images database.
- This POC paves the way for a visual, intent-based property search experience that goes beyond traditional filters (price, area, location).
Technical Environment: Python, DBT, REST APIs, Git, CI/CD, Microservices Architecture, LLM, RAG
Python DBT API IaC DevOps
DECATHLON
Senior Data Engineer
RETAIL
September 2022 - February 2025 (2 years and 5 months)
Paris, France
Context: Decathlon, the world’s largest sporting goods retailer, undertook the construction of a self-service Data Factory for ingestion to industrialize and democratize data access at group scale.
Within a cross-functional team (1 PO, 1 Tech Lead, 1 DevOps, 1 Fullstack, 3 Data Engineers), I played a central role in the design, development, and evolution of this ingestion platform handling massive volumes of heterogeneous data.
Key Missions & Achievements:
▷ Self-Service Ingestion Platform
- Designed and developed a self-service data ingestion platform, enabling business teams to orchestrate their own ingestion workflows autonomously.
- Ingestion of over 1,000 heterogeneous data flows.
- Developed data pipelines in Scala and Spark following a medallion architecture on AWS.
- Integrated Databricks Auto Loader for high-volume use cases, ensuring performant and reliable incremental ingestion.
- Developed a feature enabling the integration of Airbyte connectors into the platform, significantly expanding the catalogue of supported data sources.
▷ Cross-Cloud Ingestion Agent
- Designed and developed a cross-cloud ingestion agent enabling data collection from Alibaba, GCP, and Azure into AWS.
- Built the agent with Scala and ZIO (functional programming) for robustness, scalability, and performance: ingestion of multiple GB of data with zero failures.
▷ Observability & Data Lineage
- Established a platform-wide data observability strategy for proactive anomaly detection and trust in ingested data, using Great Expectations for automated data quality validation and OpenLineage for end-to-end data lineage tracking.
Business Impact: Platform used daily by dozens of Decathlon teams worldwide. Drastic reduction in time-to-data for business teams through self-service. Unification of multi-cloud data flows at international scale. Full data observability and lineage enabling confident, governance-compliant data consumption.
Scala Spark ZIO Kubernetes Amazon Web Services
SNCF
MLOps Engineer
TRANSPORTATION
January 2022 - July 2022 (6 months)
Paris, France
The SNCF AIFluence R&D project aims to predict passenger crowd levels in French railway stations to optimize traveler flow management and improve the customer experience. I was tasked with industrializing Machine Learning models developed by Data Scientists and building the MLOps infrastructure required for their production deployment.

Key Missions & Achievements:
- Built a complete MLOps stack for the industrialization of crowd prediction ML models.
- Code refactoring: transformed exploratory notebooks into production-grade, structured, and maintainable projects.
- Established software engineering best practices: modular code organization, Pull Request workflows, unit and integration testing, technical documentation.
- Deployed and configured Kubeflow for model training, versioning, and production monitoring.
- Set up CI/CD pipelines (GitLab CI) for automated deployments and experiment reproducibility.
- Infrastructure management: provisioning and maintenance of environments via Terraform and Kubernetes on AWS.

Business Impact: The R&D project successfully industrialized the crowd prediction models. The MLOps stack transformed the Data Scientists' working environment, moving them from exploratory notebooks to an industrialized, reproducible, and deployable product. Teams now have a structured framework to train, version, and monitor their models autonomously, shortening the cycle from experimentation to production.

Technical Environment:
Python, Terraform, Kubernetes, Kubeflow, GitLab CI, AWS, Docker, CI/CD, MLOps, Infrastructure as Code
MLOps Python Terraform Kubernetes Google Cloud