Description

AI Reviewer available immediately for RLHF & AI evaluation projects — Remote worldwide

Hi, I’m Stéphane — an AI evaluation specialist with 3.5 years of experience in content moderation and policy enforcement for a major social media platform.

I specialize in high-precision decision-making, safety analysis, and complex evaluation frameworks — now applied to AI training and LLM evaluation.

🔍 What I bring to your project:

• Experience working on high-scale moderation systems with strict accuracy and SLA requirements

• Expertise in policy-based evaluation and nuanced decision-making

• Proven ability to maintain high accuracy (95%+) in high-volume environments

• Strong analytical skills for detecting edge cases, inconsistencies, and hidden risks

• Deep understanding of content safety, compliance, and contextual classification

🤖 AI & LLM experience:

• AI response evaluation (quality, safety, factuality, relevance)

• RLHF tasks (ranking, comparison, prompt evaluation)

• Data annotation & labeling (text, image, video, audio)

• Multimodal analysis (image-text coherence, contextual alignment)

🎯 Focus areas:

I’m particularly interested in advanced AI projects involving:

• LLM testing & evaluation

• Safety and alignment

• Complex annotation workflows

💬 Why work with me:

• Reliable, detail-oriented, and a fast learner

• Strong consistency in long-term projects

• Clear communication and professional delivery

💰 Rate: ~$200/day (~$27/hour), flexible depending on project scope.

Available for freelance assignments; flexible depending on project needs.

Languages

Portuguese
Native or bilingual
French
Native or bilingual
English
Full professional proficiency
Spanish
Basic knowledge

Workplace preferences

Remote only

Works mostly remotely

Outlier
AI Trainer / RLHF Evaluator
HIGH TECH
March 2026 - Present (3 months)
Contributed to the evaluation and optimization of large language models (LLMs) through high-level Reinforcement Learning from Human Feedback (RLHF) workflows, supporting the development of reliable and production-ready AI systems.

🔍 Core Contributions:

• Performed in-depth evaluation of AI-generated outputs, assessing accuracy, coherence, safety, and contextual relevance.
• Ranked and compared model responses to improve alignment, quality, and user-facing performance.
• Executed advanced data annotation across text and multimodal datasets, including image-text coherence validation.
• Conducted prompt evaluation and stress-testing to identify edge cases, inconsistencies, and potential failure modes.

⚙️ Methodology & Expertise:

• Applied rigorous evaluation frameworks and complex guidelines to ensure consistency and scalability.
• Demonstrated strong analytical judgment in identifying subtle errors, risks, and nuanced contextual issues.
• Contributed to high-quality datasets used for training and refining AI systems.

🎯 Specialization:
• AI Quality Evaluation • RLHF • LLM Testing • Multimodal Validation • AI Safety

💼 Focused on delivering high-precision AI evaluation for advanced and production-level AI systems.
RLHF Trainer (Reinforcement Learning from Human Feedback) • AI Trainer - Data Annotator • Multimodal Evaluator • Multilingual Evaluator • Fact-Checking
Accenture
CONTENT MODERATOR
SOCIAL MEDIA
November 2022 - Present (3 years and 7 months)
Lisbon, Portugal
Ensured the quality, safety, and compliance of user-generated content for a major social media platform, operating in high-volume and high-stakes environments.

🔍 Key Contributions:

• Reviewed and evaluated large volumes of content (text, image, video, audio) to enforce platform policies and safety standards.
• Applied advanced contextual judgment to classify complex and ambiguous cases, including sensitive and high-risk content.
• Identified, flagged, and escalated critical issues, ensuring rapid and accurate decision-making.

⚙️ Performance & Impact:

• Achieved 95% evaluation accuracy (target: 90%), demonstrating strong analytical precision and consistency.
• Recognized as Top Performer (French Market) for two consecutive months (2025), based on quality and reliability metrics.

🎯 Core Strengths:

• Policy-based evaluation & complex decision frameworks.
• Risk detection, edge case analysis & content safety.
• High attention to detail in fast-paced environments.

💡 This experience directly supports AI evaluation tasks such as RLHF, LLM response assessment, and multimodal data analysis.
Multimodal Evaluator • Portuguese • French • Safety Evaluation • Quality Auditing