I am a PhD machine learning scientist and a graduate of the École Normale Supérieure in Paris, with 7 years of experience applying AI to academic research and real-world problems. I am used to working with difficult data (e.g. terabyte-sized, extremely imbalanced, or dirty datasets) spanning a variety of industries, including financial time series; highly sensitive, anonymized health records; and noisy, low-resolution video. My task experience covers fraud detection, predictive maintenance, rare disease diagnosis, video processing, and customer sentiment analysis.
I do most of my ML work in Python, SQL, and Julia, and implement models using TensorFlow, scikit-learn, and a handful of common libraries such as XGBoost, LightGBM, and CatBoost. I am adamant about following software engineering best practices: all production code must be modular, unit-tested, documented, human-readable, and compliant with Google's style guides.
Having spent most of my career doubling as a data engineer, I am also well-versed in setting up, expanding, and maintaining data mining, ETL, feature engineering, and warehousing pipelines using PySpark and Dask. I also have strong experience with real-time data streaming pipelines built on Kafka, Faust, and ClickHouse.
Thanks to my past work as a technical consultant for major management consulting firms, I am a clear and concise communicator and welcome client-facing activities. I particularly enjoy working in diverse, small-to-medium-sized teams alongside colleagues from different backgrounds and fields of expertise.