Site Reliability Engineer
Preferred Familiarity with observability tools (Prometheus, Grafana, Application Insights, Datadog, Splunk) 4 Preferred...
release cycles. Set up monitoring, performance dashboards, and alerts using CloudWatch, Performance Insights, Datadog, New...). Strong scripting skills (Python, Bash) and experience with DevOps/automation. Familiarity with monitoring and logging tools (Datadog...
, Prometheus, Grafana, Datadog, or ExtraHop Collaborate with cloud, platform, and operations teams to support workloads running... with enterprise monitoring/observability tools (e.g., Dynatrace, Splunk, ExtraHop, Prometheus, Grafana, Datadog) 3+ years...
, Grafana, Application Insights, Datadog, Splunk) 4 Preferred Experience operating 24x7 production environments with on-call...
and observability tools (New Relic, Datadog) AI Integration Experience (Bonus) Experience building UI for AI-powered features or LLM...
, Jenkins, Terraform, Ansible, OpenShift, and Kubernetes. Experience with monitoring platforms including GCP, Datadog, Thousand...
that is customer focused and quality driven. Proficiency with log monitoring & analytics tools - Datadog, Dynatrace, PagerDuty, ELK...
observability (metrics, logs, traces, dashboards, alerting) using tools such as Datadog, Prometheus, Grafana, CloudWatch...
and Kafka Monitoring: Datadog, PagerDuty, Sentry Version Control: GitHub, PagerDuty Projects we're working on: At Headway...
/OpenSearch, Datadog) Experience implementing deployment strategies (blue/green, canary, rolling updates) Strong troubleshooting...