Tahar Souici

Senior Data/ML engineer

€700/day
Paris, FR
8-15 years

Average response time: 1h

About Tahar

With a strong background as a Senior Data Engineer, I bring several years of expertise in data and software engineering. Throughout my career, I have developed advanced technical skills combined with proven project management and team leadership abilities. I remain highly motivated and ready to leverage my full range of skills to contribute to the success of your organization.
  • French

    Native or bilingual

  • English

    Full professional proficiency

  • Arabic

    Native or bilingual

Open to working on site
Paris (up to 50 km)

Experience

  • Seloger
    Senior Data Engineer
    REAL ESTATE
    February 2025 - February 2026 (1 year)
    Paris, France
    Context: Aviv Group is a leading European digital real estate company, operating property platforms across multiple countries. Within the Data team, I work on the design and optimization of geographical referential data systems that are critical to powering real estate search platforms across Europe. I also contribute to R&D initiatives exploring multimodal AI for innovative property search experiences.
    Key Missions & Achievements:
    ▷ Geographical Data Platform
    - Maintenance and optimization of geographical referential data pipelines supporting real estate platforms across multiple European countries.
    - Design, development, and operation of large-scale APIs ensuring high availability, performance, and data consistency across geographies.
    - Definition and implementation of data system architecture, enabling robust and scalable multi-country geographical data integration.
    - Led stack optimization efforts across infrastructure, data pipelines, and API services.
    - Mentoring of junior engineers, systematic code reviews, and promotion of engineering excellence through documentation and standardization.
    ▷ R&D — Multimodal RAG (Text-to-Image Search)
    - Designed and built a multimodal RAG proof of concept that lets users search for properties by describing their desired apartment style in natural language (layout, decoration, ambiance, etc.).
    - The system converts textual descriptions into embeddings and performs similarity search against listing images from the platform’s catalogue to find visually matching properties.
    - Built an end-to-end pipeline combining LLM, multimodal embeddings, and vector search over the property images database.
    - This POC paves the way for a groundbreaking property search experience, going beyond traditional filters (price, area, location) to offer visual intent-based search.
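The text-to-image retrieval step described above can be illustrated with a minimal sketch. It assumes the query text and the listing images have already been embedded into the same vector space by some multimodal model. The tiny hand-written vectors, the listing IDs, and the function names below are all hypothetical placeholders, not the POC's actual code or data.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k_listings(query_embedding, image_embeddings, k=2):
    """Rank listings by similarity of their image embedding to the query."""
    scored = [
        (listing_id, cosine_similarity(query_embedding, embedding))
        for listing_id, embedding in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical precomputed image embeddings (a real system would obtain
# these from a multimodal embedding model and store them in a vector index).
image_embeddings = {
    "listing-a": [0.9, 0.1, 0.0],
    "listing-b": [0.1, 0.9, 0.2],
    "listing-c": [0.7, 0.3, 0.1],
}

# Hypothetical embedding of the user's textual description.
query_embedding = [1.0, 0.2, 0.0]

results = top_k_listings(query_embedding, image_embeddings, k=2)
for listing_id, score in results:
    print(f"{listing_id}: {score:.3f}")
```

A brute-force scan like this is only practical for small catalogues. At platform scale, the same ranking would typically be delegated to a vector database or an approximate nearest-neighbour index.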
    Technical Environment: Python, DBT, REST APIs, Git, CI/CD, Microservices Architecture, LLM, RAG
    Python DBT API IaC DevOps
  • DECATHLON
    Senior Data Engineer
    RETAIL
    September 2022 - February 2025 (2 years and 5 months)
    Paris, France
    Context: Decathlon, the world’s largest sporting goods retailer, undertook the construction of a self-service Data Factory for ingestion to industrialize and democratize data access at group scale.
    Within a cross-functional team (1 PO, 1 Tech Lead, 1 DevOps, 1 Fullstack, 3 Data Engineers), I played a central role in the design, development, and evolution of this ingestion platform handling massive volumes of heterogeneous data.
    Key Missions & Achievements:
    ▷ Self-Service Ingestion Platform
    - Designed and developed a self-service data ingestion platform, enabling business teams to orchestrate their own ingestion workflows autonomously.
    - Ingestion of over 1,000 heterogeneous data flows.
    - Developed data pipelines in Scala and Spark following a medallion architecture on AWS.
    - Integrated Databricks Autoloader for high-volume use cases, ensuring performant and reliable incremental ingestion.
    - Developed a feature enabling the integration of Airbyte connectors into the platform, significantly expanding the catalogue of supported data sources.
    ▷ Cross-Cloud Ingestion Agent
    - Designed and developed a cross-cloud ingestion agent enabling data collection from Alibaba, GCP, and Azure into AWS.
    - Agent built with Scala and ZIO (functional programming), ensuring robustness, scalability, and performance: ingestion of multiple GB of data with zero failures.
    ▷ Observability & Data Lineage
    - Established a platform-wide data observability strategy for proactive anomaly detection and trust in ingested data: Great Expectations for automated data quality validation, and OpenLineage for end-to-end data lineage tracking.
    Business Impact: Platform used daily by dozens of Decathlon teams worldwide. Drastic reduction in time-to-data for business teams through self-service. Unification of multi-cloud data flows at international scale. Full data observability and lineage enabling confident, governance-compliant data consumption.
    Scala Spark ZIO Kubernetes Amazon Web Services
  • SNCF
    MLOps Engineer
    TRANSPORTATION
    January 2022 - July 2022 (6 months)
    Paris, France
    The SNCF AIFluence R&D project aims to predict passenger crowd levels in French railway stations to optimize traveler flow management and improve the customer experience. I was tasked with industrializing Machine Learning models developed by Data Scientists and building the MLOps infrastructure required for their production deployment.

    Key Missions & Achievements:
    - Built a complete MLOps stack for the industrialization of crowd prediction ML models.
    - Code refactoring: transformed exploratory notebooks into production-grade, structured, and maintainable projects.
    - Established software engineering best practices: modular code organization, Pull Request workflows, unit and integration testing, technical documentation.
    - Deployed and configured Kubeflow for model training, versioning, and production monitoring.
    - Set up CI/CD pipelines (GitLab CI) for automated deployments and experiment reproducibility.
    - Infrastructure management: provisioning and maintenance of environments via Terraform and Kubernetes on AWS.

    Business Impact: R&D project that successfully delivered the industrialization of crowd prediction models. The MLOps stack fundamentally transformed the Data Scientists' working environment: transitioning from exploratory notebooks to an industrialized, reproducible, and deployable product. Teams now have a structured framework to train, version, and monitor their models autonomously, dramatically reducing the cycle from experimentation to production.

    Technical Environment:
    Python, Terraform, Kubernetes, Kubeflow, GitLab CI, AWS, Docker, CI/CD, MLOps, Infrastructure as Code
    MLOps Python Terraform Kubernetes Google cloud

Education

  • Systems Engineer
    École Militaire Polytechnique
    2015
  • Master 2 - Data Science
    Paris 8
    2021

Skills (39)

Categories