Learning Engineer to redefine how we operate our global services. You won't just be building dashboards; you will be building... ecosystem where Multi-Agent Systems and Reinforcement Learning (RL) loops work in tandem with Large Language Models (LLMs...
backend services and APIs that connect environment authoring tools, data collection systems, and RL training infrastructure... reinforcement learning, pioneering fundamental RL research for large language models, and building scalable training methodologies...
, RL, and aligning large language and multimodal models, in addition to keeping up to date with the latest progress... with different teams across AMD. KEY RESPONSIBILITIES: Train, fine-tune, and RL for LLMs/LMMs. Improve on the state-of-the-art...
control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals..., and our async RL trainer. We enable researchers, startups and enterprises to run end-to-end reinforcement learning at frontier scale...
, IL, RL, and other variables. Must have excellent written and verbal skills. Proficient soldering skills (J-STD-001... leave laws, business travel services; employee discounts; and an employee assistance program that includes company-paid...
compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes..., verifiable evals, and our async RL trainer. We enable researchers, startups and enterprises to run end-to-end reinforcement...
Location:
USA | 30/01/2026 01:01:56 AM | Salary: S/. Not specified | Company:
Protocol Labs, and high-performance training infrastructure for RL, SFT, and more. We enable researchers, startups and enterprises to run end... intersection of cutting-edge RL/post-training methods and applied agent systems. You'll have a direct impact on shaping...
compute into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes..., verifiable evals, and our async RL trainer. We enable researchers, startups and enterprises to run end-to-end reinforcement...
GTM motion that brings compute, RL infrastructure, and post-training services to AI labs, research orgs, and high-growth... into a single control plane and pair it with the full RL post-training stack: environments, secure sandboxes, verifiable evals...
/tools Experience with RL/bandits, preference optimization, or human feedback loops for personalization. Experience... businesses, municipalities and non-profits. You'll support the delivery of award-winning tools and services that cover...