Identity Security Architect
: Hardened experience protecting live AI inference pipelines built on TensorRT-LLM and Triton Inference Server. Cloud-Native...
systems Experience with one or more inference serving frameworks, including NVIDIA Triton/Dynamo, TorchServe, or similar...
NCCL, RDMA, and high-performance networking. Implement custom operators and fused kernels in PyTorch, JAX, or Triton... Experience with Triton, CUTLASS, or other GPU kernel authoring frameworks. Familiarity with TensorRT, FasterTransformer, or vLLM...
optimization, continuous batching, and speculative decoding for LLM serving. Drive compiler-level optimizations using Triton, XLA... in Triton or CUTLASS. Experience with FinOps for AI workloads. Publications or talks on AI systems performance...
-enabled solutions company. Adjacent experience with companies such as GCX, Ergotron, Triton, Capsa Healthcare, Hatchmed, Lily...
, capabilities, and skills Foundational understanding of NVIDIA GPU infrastructure software (e.g., DCGM, BCM, Triton Inference...