Houssem Bziouch

Data Engineer

€700/day
Paris, FR
8-15 years of experience

Average response time: 1h

About Houssem

Senior Data Engineer | High-performance data platform architect

I design scalable, resilient and production-grade data platforms turning raw data into high-value analytics products.

Expertise in:
• Medallion architectures (Bronze → Silver → Gold)
• Spark / PySpark at scale
• Cloud data platforms (AWS, Databricks, Kafka)
• Data quality frameworks & observability

I hold my work to a high engineering standard: clean code, automation, performance tuning and production readiness.

🎯 My mission: build reliable, scalable and future-proof data systems.



Senior Data Engineer | High-performance data platform architect

I design robust, scalable, reliability-focused data platforms that turn massive volumes of raw data into analytics products with high business value.

Specialized in:
• Bronze → Silver → Gold architectures
• Spark / PySpark at scale
• Cloud data platforms (AWS, Databricks, Kafka)
• Advanced data quality, validation & monitoring

My approach: high-level engineering, performance, clean code, automation and production standards.
I commit to every project as an owner, with strong attention to detail and quality.

🎯 Mission: build data systems that are durable, performant and usable by business teams.





  • English

    Native or bilingual

  • French

    Native or bilingual

  • Arabic

    Native or bilingual

  • German

    Limited working proficiency

Open to on-site work
Paris (up to 50 km)

Experience

  • Quantum Signals,
    Senior Data Engineer
    March 2025 - Present (1 year and 3 months)
    California, USA
    • Architected a production-grade Bronze → Silver → Gold platform for high-frequency market data (Databento futures & equities), producing research- and trading-ready datasets from raw ticks.
    • Designed a manifest-driven incremental engine (per symbol/day) guaranteeing idempotence, restart safety and deterministic outputs across replays, backfills and partial-day scenarios.
    • Led the Databricks → self-hosted Spark migration (Hetzner), improving cost control and throughput through shuffle tuning, S3A committer optimization and Parquet layout strategies.
    • Implemented a strict data correctness framework (DuckDB + automated validation): historical parity checks, numeric drift detection and Silver/Gold coverage reconciliation.
    • Solved critical market-data integrity issues: sentinel normalization (9223372036854775807), price scaling (1e5) and timestamp semantics (nanoseconds → UTC and NY trading sessions).
    • Built CI quality gates (GitHub Actions) enforcing schema stability, metric correctness and end-to-end pipeline reliability.
    • Owned architecture, release lifecycle and reliability standards in close collaboration with research and trading teams.
    • Tech: PySpark, DuckDB, Databricks, AWS S3, Parquet, Linux, Bash, GitHub Actions, JSON-driven specs.
    Python · SQL · A/B testing · Cloud · AWS · ETL processes (Extract, Transform, Load)
  • BNP Paribas,
    Data Engineer
    November 2022 - March 2025 (2 years and 4 months)
    Paris, France
    • AML & Supply Chain (QUANTEXA): led Spark pipelines for AML compliance and delivered a daily reporting system surfacing country-level AML KPIs.
    • KYC Integration (BNP DataHub): implemented end-to-end ETL workflows to ingest, monitor and supervise transaction feeds; secured outputs stored in IBM S3.
    • GCARS Decommissioning: migrated legacy Python/Pandas processes to Spark + IBM S3, improving scalability and operational reliability.
    • Phonetic Search (BNP Switzerland): built NLP pipelines using stemming, lemmatization and phonetic hashing to support entity matching analytics.
    • ETL Engineering: designed robust transformations from CSV and private cloud sources into refined datasets and KPIs, orchestrated with Airflow and productionized with CI/CD.
    • Tech: Apache Spark, Apache Airflow, Docker, SQL/NoSQL, Git, Autosys, Jenkins.
  • Bpifrance,
    Data Engineer
    April 2022 - November 2022 (7 months)
    Paris, France
    • Financial Monitoring (CDC): built a detection platform consolidating multi-institution datasets to identify irregular transaction patterns across EU/US accounts.
    • Engineered and optimized Spark-based AWS Glue ETL jobs ingesting heterogeneous sources into raw S3 data lakes.
    • Ran daily data quality investigations in Athena; partnered with BAs/PMs via Jira to deliver prioritized features.
    • Delivered internal data products via APIs (Flask, FastAPI, API Gateway) with automated deployments using CodeDeploy.
    • Tech: AWS Glue, Spark, S3, Athena, MongoDB, Flask/FastAPI, API Gateway, CodeDeploy, Jira.
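
As an illustration of the market-data integrity work listed in the Quantum Signals entry (sentinel normalization, price scaling, nanosecond timestamps), here is a minimal Python sketch. It is not the actual pipeline code. The function names are hypothetical, the 1e5 divisor is one assumed reading of "price scaling (1e5)", and using the New York calendar date as a session key is a simplification.

```python
from datetime import date, datetime, timezone
from typing import Optional
from zoneinfo import ZoneInfo

INT64_MAX = 9223372036854775807  # sentinel value quoted in the profile
PRICE_SCALE = 100_000            # assumption: raw integer prices are scaled by 1e5
NEW_YORK = ZoneInfo("America/New_York")


def normalize_price(raw: int) -> Optional[float]:
    """Map the sentinel to None, otherwise rescale the fixed-point integer."""
    if raw == INT64_MAX:
        return None
    return raw / PRICE_SCALE


def ns_to_utc(ts_ns: int) -> datetime:
    """Convert nanoseconds since the Unix epoch to a timezone-aware UTC datetime.

    datetime stores microseconds only, so sub-microsecond digits are dropped.
    """
    seconds, remainder_ns = divmod(ts_ns, 1_000_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(
        microsecond=remainder_ns // 1_000
    )


def ny_session_date(ts_ns: int) -> date:
    """Calendar date of the event in New York local time (hypothetical session key)."""
    return ns_to_utc(ts_ns).astimezone(NEW_YORK).date()
```

Under these assumptions, a raw price equal to the sentinel becomes a missing value instead of an absurdly large number, and an event stamped shortly after midnight UTC is attributed to the previous New York calendar day.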

Education

  • Engineering Degree in Computer Science
    École Polytechnique de Sousse
    2016
