GPU Software Engineer (CUDA)

NCCL, RDMA, and high-performance networking. Implement custom operators and fused kernels in PyTorch, JAX, or Triton... Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks. Familiarity with TensorRT, FasterTransformer, or vLLM...

Location: Plainsboro, NJ | 27/06/2026 00:06:49 | Salary: $100,000 - $150,000 per year | Company: Bright Vision Technologies

GPU Software Engineer (CUDA)

NCCL, RDMA, and high-performance networking. Implement custom operators and fused kernels in PyTorch, JAX, or Triton... Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks. Familiarity with TensorRT, FasterTransformer, or vLLM...

Location: McKinney, TX | 27/06/2026 00:06:37 | Salary: $100,000 - $150,000 per year | Company: Bright Vision Technologies

GPU Systems Engineer (CUDA)

NCCL, RDMA, and high-performance networking. Implement custom operators and fused kernels in PyTorch, JAX, or Triton... Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks. Familiarity with TensorRT, FasterTransformer, or vLLM...

Location: Norcross, GA | 26/06/2026 20:06:05 | Salary: $100,000 - $150,000 per year | Company: Bright Vision Technologies

AI Performance Engineer

... and speculative decoding for LLM serving. Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working... TensorRT-LLM, DeepSpeed, or similar projects. Familiarity with custom kernel authoring in Triton or CUTLASS. Experience...

Location: McKinney, TX | 26/06/2026 20:06:03 | Salary: $100,000 - $150,000 per year | Company: Bright Vision Technologies

AI Performance Engineer

... and speculative decoding for LLM serving. Drive compiler-level optimizations using Triton, XLA, TorchInductor, or TVM, working... TensorRT-LLM, DeepSpeed, or similar projects. Familiarity with custom kernel authoring in Triton or CUTLASS. Experience...

Location: Norcross, GA | 26/06/2026 18:06:59 | Salary: $100,000 - $150,000 per year | Company: Bright Vision Technologies