Models (LLMs) and AI applications. This role combines deep technical expertise in cloud platforms, container orchestration... Design and implement backup, recovery, and business continuity plans for ML platforms Technical Leadership & Mentoring...
businesses. About the role As a Member of Technical Staff specializing in Research Engineer - LLM Systems & Performance... and optimize GPU kernels using tools like CUDA or Triton, and leverage techniques such as FlashAttention and Tensor Cores...
technical, hands-on role: you'll design and build systems that power real-time predictions across millions of requests... per second, tackling challenges in reliability, efficiency, and cost-aware scaling. Success in this role requires both technical...
about advancing the frontiers of AI. You thrive in open-source environments, enjoy tackling complex technical challenges..., Verl, Ray, Deep Graph Library, FlashInfer, Triton Inference Server, Coding Agent, and other trending open-source projects...
standards. Independently evaluates, selects and applies a variety of technical techniques, procedures, and criteria using... generation and processing codes, e.g., APOLLO, CASMO, HELIOS, TRITON (NEWT), Polaris, GenPMAXS Experience with 3D neutronics...
broader technology stack. 4. Cross-Functional Leadership & AI Mentoring Serve as a key technical advisor for C-level... (Neo4j, Amazon Neptune, Apache Jena). Real-time Inference Infrastructure: NVIDIA Triton Inference Server, Ray Serve...
breakthroughs in machine learning and data science. A Solutions Architect is the first line of technical expertise between NVIDIA...) and generative AI workloads. Enhance performance tuning using TensorRT/TensorRT-LLM, vLLM, Dynamo, and Triton Inference Server...
Location:
Redmond, WA | 17/01/2026 20:01:40 PM | Salary: $124000 - 195500 per year | Company:
Nvidia and contribute to innovations that enhance model performance and deployment. Discover/solve impactful technical problems, advance... through an inclusive environment. Bachelor's Degree in Computer Science or related technical field AND 6+ years technical...
, and ensure execution. KEY RESPONSIBILITIES: Cluster Bring-up & Optimization: Oversee the technical onboarding of massive GPU...) and proprietary models for our specific hardware topology, utilizing tools like vLLM and TensorRT-LLM. Executive Technical...
if: An exceptional track record of high-quality technical output, and a bias for shipping a prototype now and iterating later in the... for clean extensibility. Experience writing Triton, CUDA, or similar, and an understanding of the resulting mapping of tensor...