Principal Site Reliability Engineer
ability to maintain scalability and resiliency in complex environments. Familiarity with modern monitoring tools (e.g., Datadog...
tools (Prometheus, Grafana, Datadog, etc.). Experience in distributed systems and scaling. Knowledge and hands...
, Grafana, Datadog, etc.). Experience in distributed systems and scaling. Knowledge and hands-on experience with multiple...
/Lambda/SQS/SNS/S3), databases (AWS Redshift/Dynamo/Aurora), logging tools (AWS CloudWatch, Datadog), Python, and AWS CDK...
scripting skills: PowerShell (preferred), Bash, or Python. Hands-on experience with monitoring solutions such as Datadog...
with BI tools (SQL, Looker, Snowflake, Datadog, Excel/Sheets) and excellent presentation skills. Prior experience in a fast...
. Experience operating one or more of Kafka, Redis, Atlassian Suite, Elastic, Datadog, ActiveBatch, or Control-M. Preferred...
CI/CD pipelines, observability with Datadog, and adopt SRE practices to keep us running smoothly. This isn’t just..., Helm, and Datadog. Proficiency in programming languages such as Golang, Python, or JavaScript. Excellent communication...
/ AI would be an advantage. Experience administering multiple monitoring systems such as Datadog, New Relic, Kubernetes, Grafana, and Elastic Cloud...