Senior Inference Engineer - AI Infrastructure
, or a comparable language. Strong understanding of GPU software stacks (CUDA, Triton, NCCL) and Kubernetes orchestration. Practical...
/HIP, Triton, or similar). Hands-on with SFT, LoRA and RL-based training at scale. Strong PyTorch experience (torch...
across training and inference workloads. Configure and manage NVIDIA Triton Inference Server for multi-model serving, dynamic... planning, and failure recovery. Practical MLOps experience: model serving infrastructure (Triton or equivalent), experiment...
servers such as Triton Inference Server and vLLM for high-throughput, low-latency serving at scale. Oversees production... in large-scale environments typical of major tech firms. Hands-on experience building LLM inference engines using Triton...
AI workloads using NVIDIA technologies including CUDA-X libraries, TensorRT-LLM, Triton Inference Server, NVIDIA NeMo, NIM... such as TensorRT-LLM, Triton Inference Server, CUDA, RAPIDS, or similar GPU acceleration technologies. Experience building or scaling...
efficiency. Leads deployment and optimization using Model Inference servers such as Triton Inference Server and vLLM for high... environments typical of major tech firms. Hands-on experience building LLM inference engines using Triton Inference Server...
) and inference serving frameworks (e.g., vLLM, Triton Inference Server, TensorRT-LLM, ONNX Runtime, Ray Serve, DeepSpeed-MII). 3...+ years of experience in GPU programming and optimization, with expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS...
performance for AI operations, leveraging tools like Compute Kernel (CK), CUTLASS, and Triton for multi-GPU and multi-platform...
. Foundational understanding of NVIDIA GPU Infrastructure software (e.g., NVIDIA DCGM, BCM, Triton Inference), Kubernetes and Cloud...
with inference optimization using vLLM, TensorRT-LLM, Triton Inference Server, or similar. DevOps & Platform Skills: Advanced...