Principal Software Engineer
technologies. Experience using measures such as DORA metrics, release frequency, incident response, and system recovery to drive...
& Reporting – Conduct regular progress reviews with EPC contractors, focusing on critical path recovery and dashboard-based status...
with the Agentic Framework Architect. Fault Isolation, Rollback & Recovery Engineering: Design deterministic rollback... and checkpointing mechanisms to restore stable orchestration states after failure and enable automatic recovery paths for misaligned...
tolerance, redundancy, and disaster recovery. Security & Compliance: Enforce secure coding practices, vulnerability management...
. Validate disaster recovery: backups, restores, and recovery tests. Optimize cloud cost and performance; implement autoscaling...
based on severity and impact. Lead incident response actions, including containment, eradication, and recovery. Coordinate...
InfoSec team on the vulnerability management. Performing system backups and recovery. Documenting and automating daily tasks...
, you will: Positively contribute to the team responsible for delivering a high-quality recovery collections service. Lead and coordinate... digitalization and automation. iii) Provide data-driven insights to optimize recovery strategies and operational efficiency...
conditions within agent workflows. Fault Isolation, Rollback & Recovery: Architect deterministic rollback, checkpointing..., and recovery mechanisms for multi-agent systems. Design fault-isolation boundaries to prevent local failures from cascading...
, and failure recovery mechanisms. Solid understanding of distributed systems and production reliability. Desired Qualifications...