Derek Linz

Sr. Staff Escalation & Reliability Engineer

€1,200/day
Rotterdam, NL
8-15 years of experience

Average response time: 1h

About Derek

I specialize in the hard problems — the ones that survive first-line support and require deep investigation across the hardware/software boundary. With 10+ years in datacenter-scale environments, I've spent my career at Nutanix diagnosing critical failures where compute, storage, networking, and hypervisor layers intersect.
My focus areas include GPU-accelerated infrastructure (NVIDIA driver debugging, VFIO/mdev passthrough, ECC analysis), Linux internals (kernel crash analysis, kdump, driver fault tracing), KVM/AHV virtualization, and NUMA/performance regression work on clustered systems. I've built diagnostic tooling used thousands of times across global customer fleets and worked directly with NVIDIA engineering to trace and resolve driver-level regressions.
I'm available for short or long-term engagements involving escalation support, infrastructure reliability investigations, diagnostic tooling, or performance analysis on complex Linux/GPU environments.
Certifications: RHCSA · Nutanix Certified Master (NCM MCI) · VCP6-DCV/NV
  • English

    Bilingual or native

Open to working on-site
Rotterdam (up to 50 km)

Experience

  • Nutanix
    Sr. Staff Escalation & Systems Reliability Engineer
    HIGH TECH
    January 2019 - July 2025 (6 years and 6 months)
    Amsterdam, Netherlands
    • Performed kernel-level crash-dump analysis on production clusters, isolating failures within NVIDIA's closed-source GPU driver modules; identified a driver regression that triggered firmware-level timeout conditions under specific workloads, and collaborated with Nutanix GPU Engineering and NVIDIA to validate fixes using pre-release driver builds.
    • Troubleshot GPU passthrough (VFIO) and mediated-device (mdev) issues on AHV/KVM, including driver-binding problems, incomplete GPU reset behavior (FLR-related), and mdev provisioning failures due to host/guest driver mismatches.
    • Designed and maintained automated live-boot ISO images embedding diagnostic payloads for GPU and storage nodes; scripts captured telemetry, logs, and performance signatures automatically on boot, reducing triage time from hours to minutes across global customer fleets.
    • Developed SQL correlation queries on large telemetry datasets to detect configuration-dependent failure signatures across customer environments; integrated results into automated workflows that surfaced high-impact issues for engineering and support.
    • Investigated NUMA locality challenges in Nutanix AHV clusters where Controller VMs are pinned to host cores; validated BIOS configurations to maximize local I/O performance and avoid remote-node latency penalties.
    • Benchmarked Nutanix hardware platforms and guest performance with Cinebench and Phoronix Test Suite across hypervisors (AHV/ESXi), kernel versions, and CPU architectures to identify configuration-dependent regressions.
    • Created Python and Bash automation used to orchestrate log capture, correlate kernel events, and produce actionable diagnostic summaries for critical customer escalations.
    VMware · KVM · Linux · Data visualization · Python
  • Nutanix
    Systems Reliability Engineer
    HIGH TECH
    January 2016 - January 2019 (3 years)
    Amsterdam, Netherlands
    • Troubleshot cross-layer failures across compute, storage, and networking paths in distributed Nutanix clusters.
    • Provisioned and managed server hardware in the Support Lab; performed node bring-up, imaging, racking, and platform validation.
    • Supported engineering teams by validating experimental configurations and identifying systemic reliability issues.
    • Directly supported production infrastructure operating at global datacenter scale.
    System administration · VMware · Linux · KVM · Nutanix
  • VCE
    vPlatform Support Engineer
    HIGH TECH
    June 2015 - March 2016 (9 months)
    Durham, United States
    Provided escalation support for VCE Vblock converged infrastructure, resolving complex customer issues across compute, networking, and storage layers. Authored and presented RCA documents for critical incidents, proactively identified at-risk customers based on problem trends, and mentored junior support engineers. Worked closely with critical accounts and emerging product lines.
    VMware · Cisco

Training

  • NVIDIA AI Enterprise Admin
  • RHCSA
