architecture search, and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning... development with CUDA and Triton. This role offers a unique opportunity to work at the intersection of research and engineering...
. Our Infrastructure powers a wide gamut of services at Apple including Apple Search, Apple Music, Apple TV, App Store, iMessage, Photos.... Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Server, etc. Pay & Benefits At Apple, base pay...
architecture search, and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning..., including custom kernel development with CUDA and Triton. This role offers a unique opportunity to work at the intersection...
, autoscaling, service mesh, GPU operators) and LLM serving engines (e.g., vLLM, TensorRT-LLM, Triton, KServe/Seldon, Ray Serve...). Experience with vector databases and RAG components (e.g., Azure AI Search, Pinecone, Weaviate, Milvus), and feature stores (e.g...