reliability of systems and services to meet the needs of the business. This is achieved through collaboration with the development... as they arise, deploy and maintain observability systems and pipelines, enhance operations and support for services and platforms...
efficiency and enable better decision-making. Booz Allen is the leading provider of AI services to the nation—we’re..., Grafana, or Prometheus Experience leading technical initiatives and influencing organizational change Experience...
, Percona, Postgres, Prometheus, Pulsar, Puppet, Python, Rancher, React.js, Redis, RESTful Web Services, SaltStack, Scala..., Recommender Services, Relational Databases, Role Based Access Control (RBAC), Scripting, Service Oriented Architecture (SOA...
and monitoring tools (Airflow, Datadog, Databand, DataFold, Monte Carlo, Prometheus, Grafana) Strong incident management and root... cause analysis capabilities Customer-oriented mindset with focus on delivering high-quality support services and attention...
Location:
Atlanta, GA | 14/02/2026 23:02:34 | Salary: $116,000 - $152,250 per year | Company:
FanDuel+) / Spring Boot applications and backend services Design reusable components, API frameworks, and microservices aligned... reliability, and observability tooling (Grafana, Prometheus, Dynatrace, etc.) Team Leadership & People Management Manage...
) along with monitoring tools (e.g., Prometheus and Grafana); monitor message queues and services to ensure optimal operations and identify... in a DoD, federal, or large‑scale enterprise environment. Experience with SIEM/SOAR platforms (Splunk, Elastic, Azure Sentinel...
with Flink Experience with SQL, Kafka Streams, or other stream processing frameworks. Exposure to monitoring tools (Prometheus..., Vision, Dental, 401K, and EAP (Employee Assistance Program) services. Nesco Resource provides equal employment...
Overview: AMERICAN SYSTEMS is an employee-owned federal government contractor supporting national priority programs..., CIS Benchmarks). Manage system services, networking, access controls, logging, and system monitoring on Linux platforms...
, security, governance, and model risk management across ML services. Lead design and implementation of models across classical... evaluation frameworks. Observability: Prometheus/Grafana, OpenTelemetry; SLO-driven operations and incident management. Model...
foundation for the customer's AI capabilities, focusing on inference services while supporting the broader ecosystem... inference at scale. Support the development and maintenance of production AI services and applications, including retrieval...