Head – DR Orchestration & Testing at Equity Bank Kenya

  • Job Type Full Time
  • Qualification BA/BSc/HND, MBA/MSc/MA
  • Experience
  • Location Nairobi
  • Job Field ICT / Computer

A senior leadership role accountable for an enterprise-wide DR orchestration and testing program spanning data centers, cloud, networks, applications, data platforms, and third-party services. The role builds automated runbooks, governs recovery scenarios, executes end-to-end exercises (tabletop to full failover), and drives remediation. It tightly integrates with Change/Release, Backup & Recovery, Cybersecurity, SRE/Operations, and Business Units to assure recoverability for core banking, payments, digital channels, and shared services across all subsidiaries.

Key Accountabilities

  • Strategy, Policy & Governance
  • Define and maintain the Group DR Orchestration & Testing Policy, Standards, and Playbooks aligned to ITIL v4, ISO 22301/27031, and NIST SP 800-34.
  • Institutionalize governance anchored by the IT Steering Committee and the Service Continuity/DR Working Group as the mechanisms for cadence, accountability, and reporting.
  • Establish decision rights, RACI, and acceptance criteria for “go-live” recoverability (RTO/RPO, data integrity, service dependencies).
  • Embed DR impact assessment in Change, Release, and Architecture review gates.
  • Orchestration & Automation
  • Design and implement automated recovery runbooks (e.g., infra, platform, DB, app, network/DNS, identity) leveraging workflow/orchestration tools, Infrastructure-as-Code, and CI/CD.
  • Engineer repeatable failover/failback patterns (active–active, active–standby, zonal/region/site) for on‑prem, hybrid, and cloud workloads.
  • Integrate observability (APM, logs, synthetics) to validate service health during exercises and real events.
  • Testing Program Management
  • Own the Group DR Test Calendar (annual/quarterly/monthly) covering tabletop, technical component tests, integrated service tests, and full-scale exercises.
  • Define test scenarios based on BIAs, risk scenarios (e.g., ransomware, DC outage, carrier failure, major release rollback), and regulatory expectations.
  • Measure and certify recoverability per service; track defects, action owners, and closure SLAs.
  • Data, Backup & Cyber Recovery Assurance
  • Align backup/restore testing with application-level recovery (including immutable/air-gapped copies, vaulting, and key management).
  • Validate data integrity, transaction reconciliation, and journal consistency post-recovery (e.g., core banking, card switch, channels).
  • Coordinate with Cybersecurity on ransomware readiness, clean‑room recovery, and malware‑free restore procedures.
  • Third-Party & Cloud Resilience
  • Assess and test DR commitments of critical vendors/fintech partners; verify evidence of recoverability and exit/failover options.
  • Govern SaaS and cloud region/zone strategies, data residency constraints, and cross‑border implications for subsidiaries.
  • Service Mapping & Readiness
  • Maintain service dependency maps (CMDB) linking business services to applications, platforms, data stores, integrations, and infrastructure.
  • Define minimal viable service (MVS) configurations for recovery and ensure runbooks reflect current state.
  • Metrics, Reporting & Continuous Improvement
  • Define and report KPIs/KRIs: test coverage %, pass rate, RTO/RPO adherence, MTTR (exercises/incidents), % automated runbooks, restore success rate, findings aging, and resilience confidence score.
  • Produce executive dashboards and Monthly/Quarterly Resilience Reports to Group CIO, CFO, Risk, and Executive Committees.
  • Run post-exercise/post-incident reviews and drive structural fixes (automation, design changes, capacity).
  • Subsidiary Coordination & Incident Readiness
  • Coordinate DR readiness across Banking, Insurance, Fintech, Health, and Foundation; tailor scenarios to local contexts while enforcing Group standards.
  • Lead or support technical recovery command during major incidents and planned DR events.
  • Financial Planning & Value Optimization
  • Quantify cost‑to‑recover vs. risk; recommend right‑sized patterns (active–active vs. warm/cold) by criticality.
  • Support budgeting for resilience tooling, testing, and automation; demonstrate ROI through reduced downtime and faster recovery.

Key Deliverables

  • Group DR Orchestration & Testing Policy, Standards, and Runbook Library.
  • Annual DR Test Calendar with scenario catalog and success criteria.
  • Service-level Recovery Certificates (per critical service) and remediation tracker.
  • Enterprise Resilience Dashboard (RTO/RPO, coverage, pass rate, MTTR, confidence score).
  • Quarterly Executive Resilience Reports and Board-ready summaries.
  • Post-Exercise/Incident Review reports with prioritized corrective actions.
  • Up-to-date Service Dependency Maps and MVS definitions.

Required Qualifications & Experience

Education

  • Bachelor’s in Computer Science, Engineering, Information Systems, or related field.
  • Master’s in IT Management, Business Continuity/Resilience, or Operations is an advantage.
  • Certifications (Preferred)

Method of Application

Interested and qualified? Go to Equity Bank Kenya on equitybank.taleo.net to apply