-serving systems such as vLLM, Triton, TGI, SageMaker, Vertex AI, or custom inference services to improve batching, concurrency... of smaller models where appropriate. Experience working with model-serving systems such as vLLM, Triton, TGI, SageMaker, Vertex...
such as PyTorch, SGLang, vLLM and Triton Inference Server. Design robust, production-ready systems for distributed inference... (such as CUDA/ROCm). Kubernetes. Triton Inference Server. SGLang, vLLM or similar LLM serving frameworks...
Location:
Cork | 29/05/2026 00:05:43 AM | Salary: S/. Not specified | Company: Qualcomm
... TGI (Text Generation Inference), and NVIDIA Triton, ensuring high performance at scale. Handle deployment optimizations... (e.g., CUDA, Triton kernels). Background in MLOps or SRE roles focused on high-performance AI endpoints and reliability...
Location:
Dublin | 08/05/2026 00:05:51 AM | Salary: S/. Not specified | Company: F5
... have: A track record of working on model training or model inference at scale, or on low-level GPU coding (e.g. CUDA, Triton... CUDA or Triton kernels. Comfortable working in production environments at meaningful scale (traffic, data...
Location:
Dublin | 18/04/2026 19:04:00 PM | Salary: S/. Not specified | Company: Intercom
... Triton, ensuring high performance at scale. Handle deployment optimizations to deliver low-latency AI serving solutions... inference libraries or hardware-level kernel development (e.g., CUDA, Triton kernels). Background in MLOps or SRE roles...
Location:
Dublin | 17/04/2026 21:04:07 PM | Salary: S/. Not specified | Company: F5