AI Performance Engineer

... and speculative decoding for LLM serving. Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working... TensorRT-LLM, DeepSpeed, or similar projects. Familiarity with custom kernel authoring in Triton or CUTLASS. Experience...

Location: Monroe Township, NJ | 26/06/2026 01:06:48 AM | Salary: $100,000 - $150,000 per year | Company: Bright Vision Technologies

AI Reliability Engineer (SRE) for Gen AI

... infrastructure and AI workloads (e.g., Triton Inference Server monitoring). Preferred Qualifications: Background in implementing... Prometheus, OpenTelemetry) applied to both standard infrastructure and AI workloads (e.g., Triton Inference Server monitoring...

Location: Tampa, FL | 26/06/2026 00:06:55 AM | Salary: Not specified | Company: Artech Information Systems

SR Principal Software Engineer - LLM Engineering

... servers such as Triton Inference Server and vLLM for high-throughput, low-latency serving at scale. Oversees production... in large-scale environments typical of major tech firms. Hands-on experience building LLM inference engines using Triton...

Location: Palo Alto, CA | 25/06/2026 17:06:05 PM | Salary: Not specified | Company: JPMorgan Chase