Senior Manager, Site Reliability Engineering
& Observability Proficient with modern monitoring, logging, and telemetry tools including: New Relic, Splunk, ELK, Datadog...
& Infrastructure CI/CD experience (Jenkins/GitHub Actions), Docker, Kubernetes Leverage Datadog or similar tools for application...
, and private environment design Experience implementing LLMOps and observability tooling such as LangSmith, OpenTelemetry, Datadog...
, Datadog, Elastic Cloud alternatives For applications and inquiries, contact: [email protected]...
, Prometheus, Datadog, or Splunk. Site Reliability Engineering (SRE) experience. Experience supporting large-scale enterprise...
with observability platforms such as Prometheus, Grafana, ELK, Splunk, or Datadog Outcome / Business Objective Deliver a production...
for model updates with automated quality gates. Observability: logging, tracing, and metrics for AI services (Datadog...
. Strong troubleshooting skills in routing/switching environments, especially with VRF and VXLAN. Experience with tools such as Datadog...
Kubernetes and Docker CI/CD pipelines Infrastructure as Code (Terraform or similar) Observability tools (Datadog, Prometheus...
required for this role? 1. k8s admin 2. AWS fundamentals 3. Splunk / Datadog Skill Metrics AWS EC2 Splunk Jenkins Linkerd... GitOps Datadog AWS VPC Dynatrace AWS IAM Idaas Kubernetes Agile DevOps Kubernetes Administrator (GKE) AWS EKS...