Staff / Principal ML Ops Engineer
, checkpointing workflows). Lead the implementation of observability for ML systems (monitor drift, performance, throughput...
practices: evaluation harnesses, monitoring and drift detection, data pipelines, and human-in-the-loop processes. Familiarity...
, and alignment drift. Build platform components and APIs that allow product teams to integrate evaluation seamlessly into training...
regressions, calibration drift, and performance anomalies in close partnership with Ads Infra teams. Drive innovation by staying...