NCCL, RDMA, and high-performance networking. Implement custom operators and fused kernels in PyTorch, JAX, or Triton... Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks. Familiarity with TensorRT, FasterTransformer, or vLLM...
, FlashInfer etc., would be well suited. - Familiar with syntax and tile-level semantics similar to Triton. - Experience...
Location: Seattle, WA | 27/06/2026 00:06:59 AM | Salary: S/. Not specified | Company: Amazon
NCCL, RDMA, and high-performance networking. Implement custom operators and fused kernels in PyTorch, JAX, or Triton... Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks. Familiarity with TensorRT, FasterTransformer, or vLLM...
(vLLM, Triton) Drive model risk management: explainability, audit trails, evaluation pipelines Create architecture...
, and speculative decoding for LLM serving. Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working..., TensorRT-LLM, DeepSpeed, or similar projects. Familiarity with custom kernel authoring in Triton or CUTLASS. Experience...
delivery of operational flood forecasting capabilities in a remote work environment. This role ensures that TRITON and related...
(NGC) Development Triton Lead Track and report detailed test metrics to NGC leadership and ensure unified ITT metrics...
. This position ensures that the TRITON model delivers scientifically accurate, operationally relevant, and computationally efficient...