Senior Site Reliability Engineer (SRE)
, Application Insights, Datadog, Splunk): 4 years. Experience operating 24x7 production environments with on-call rotations: 4...
, including automated deployment of Datadog monitors, dashboards, and alerting through Terraform. Collaborate with development... GitHub Actions or comparable tools. 1+ years of experience deploying and managing observability systems, including Datadog...
SLOs and Datadog observability. You'll also improve CI/CD pipeline standards, accelerate test-failure triage.../GCP (EKS/ECS/Cloud Run/GKE). Implement observability-by-default using Datadog (instrumentation, dashboards, alerting...
development; Setting up monitoring and alerts using Geneos Dashboards, AWS CloudWatch, and Datadog; Maintaining documentation...
with observability tools (Prometheus, Grafana, Datadog, Splunk, etc.). Experience supporting 24x7 production environments and on-call...
platforms (AWS, Azure, GCP). Experience with logging/monitoring tools (CloudHub logs, Splunk, Datadog, AppDynamics, etc.)....
using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc. Advanced knowledge of software applications...
using Prometheus, Datadog, Splunk and CoLead incident management, troubleshooting, root cause analysis (RCA), and performance... or MySQL. Experience implementing database monitoring and observability tools (Prometheus, Datadog, Splunk, CloudWatch). Strong understanding...
/OpenSearch, Datadog). Experience implementing deployment strategies (blue/green, canary, rolling updates). Strong troubleshooting...