About Gaëtan
French
Native or bilingual
English
Full professional proficiency
Italian
Limited professional proficiency
German
Basic knowledge
Experience
- Ministère des Armées
  Data Engineer
  Public sector & local government
  March 2023 - Present (3 years and 3 months)
  Paris, France
  Classified
- DataDome
  Data Engineer
  Telecommunications
  August 2021 - April 2022 (8 months)
  Paris, France
  DataDome is a real-time bot detection solution that relies on a Flink engine maintained by my team. I helped build the Scala and Python components that fed and enabled the rule sets used to decide whether a fingerprint belonged to a bot or a real human.
  Because the protection covered multiple time zones, the engine had to keep latency low even during peak activity or DDoS attacks. The key challenges were to assess how consistent performance was and to avoid degrading it when adding features.
- Adot
  Data Engineer
  Telecommunications
  September 2016 - August 2021 (4 years and 11 months)
  Île-de-France, France
  Following a major business deal, I integrated a new data source that was initially difficult to work with because it was scattered and decentralized. The data came from various CRM tools across different parts of the organization, which made it hard for data analysts to extract meaningful insights.
  To address this, I designed and implemented a data processing system using Spark and Airflow. The system aggregated, cleansed, and assembled the data into Parquet files stored in an S3 repository. It ingested several terabytes of data daily on a cluster of approximately 300 cores.
  The processed data was exposed through a Presto SQL server running on a dozen nodes, which provided a high-performance database. This database was the foundation for more than 100 daily analyses for both clients and internal projects. Data scientists also used it to create and train their machine learning models, while the production pipeline extracted subsets of the data to run ad campaigns.
  I helped maintain and expand the streaming data pipeline, which processed more than 100 terabytes of data per day. The pipeline consumed data from various topics at up to 200,000 messages per second and ran more than 30 jobs handling different data formats, including SparkSQL batches and Kafka events processed with Akka and Spark Streaming.
  I also provided technical support to data scientists. One notable project involved improving a legacy ML component that predicted tags on ad events from the production database. Working with the data scientists, we addressed two problems: increasing throughput and an aging training dataset. Together, we updated the model using TensorFlow neural networks, and I built a Scala service with Akka Streams to handle input streams and distribute the workload across a pool of TensorFlow servers. With the same number of nodes, the component could then handle the entire input stream, processing up to 10,000 events per minute, and it produced better tag predictions and confidence scores. We also ensured the training dataset was updated regularly.
  In addition, I mentored and onboarded junior team members, helping them acquire essential skills and best practices in the field.
Education
- Master's student in computer science & business intelligence (Informatique décisionnelle)
  Polytech'Nantes
  2016
- Preparatory bachelor's degree for engineering school (Licence préparatoire ingénieur), Mathematics and computer science
  Université de Rennes I
  2013