Data Engineering Lead
maintenance, backups, and disaster recovery. Stay current with industry trends, exploring and prototyping new technologies...
technologies. Experience using measures such as DORA metrics, release frequency, incident response, and system recovery to drive...
& Reporting – Conduct regular progress reviews with EPC contractors, focusing on critical path recovery and dashboard-based status...
tolerance, redundancy, and disaster recovery. Security & Compliance: Enforce secure coding practices, vulnerability management...
with the Agentic Framework Architect. Fault Isolation, Rollback & Recovery Engineering: Design deterministic rollback... and checkpointing mechanisms to restore stable orchestration states after failure and enable automatic recovery paths for misaligned...
. Validate disaster recovery: backups, restores, and recovery tests. Optimize cloud cost and performance; implement autoscaling...
based on severity and impact. Lead incident response actions, including containment, eradication, and recovery. Coordinate...
InfoSec team on vulnerability management. Performing system backups and recovery. Documenting and automating daily tasks...
conditions within agent workflows. Fault Isolation, Rollback & Recovery: Architect deterministic rollback, checkpointing..., and recovery mechanisms for multi-agent systems. Design fault-isolation boundaries to prevent local failures from cascading...
, and failure recovery mechanisms. Solid understanding of distributed systems and production reliability. Desired Qualifications...