High availability through distributed system design

Description

Simpl-Open shall enable high availability through a distributed system architecture that incorporates redundancy, failover mechanisms and deployment across multiple zones or regions. The system shall be designed to avoid single points of failure and maintain continuous service availability under fault conditions. This applies in particular to all critical platform components whose failure would impact service continuity or core business processes.
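As an illustration of what "avoiding single points of failure" could mean in practice, the following minimal sketch flags components whose replicas do not span at least two distinct failure domains. The deployment map, the component and zone names, and the function `single_points_of_failure` are all hypothetical and are not part of Simpl-Open; the sketch only assumes that each component can be described by the list of zones its replicas run in.

```python
# Hypothetical deployment map: component name -> zone of each running replica.
# Names are illustrative only and do not describe an actual Simpl-Open deployment.
DEPLOYMENT = {
    "orchestration": ["zone-a", "zone-b"],
    "security-gateway": ["zone-a", "zone-a"],  # two replicas, but one zone
    "identity": ["zone-a", "zone-b", "zone-c"],
    "data-access": ["zone-b"],
}


def single_points_of_failure(deployment, min_domains=2):
    """Return the components whose replicas span fewer than
    `min_domains` distinct failure domains, sorted by name."""
    return sorted(
        name
        for name, zones in deployment.items()
        if len(set(zones)) < min_domains
    )


if __name__ == "__main__":
    print(single_points_of_failure(DEPLOYMENT))
    # prints ['data-access', 'security-gateway']
```

Note that "security-gateway" is flagged even though it has two replicas: redundancy inside a single zone does not protect against the loss of that zone.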

SMART Breakdown

Specific: Simpl-Open shall be designed to support distributed deployments across failure domains (e.g. zones, regions), enabling redundancy and continuity in case of failure. High availability shall be explicitly required for all critical platform components, such as core orchestration services, security gateways, identity services, and data access layers.
Measurable: High availability shall be measured against defined architecture-level targets: expected uptime (e.g. ≥ 99.9%), the presence of a redundancy strategy (e.g. active-active or active-passive deployment), and documentation of how services are distributed across failure domains.
Achievable: The architecture shall support load balancing, health checks, and automated failover mechanisms to maintain operational continuity without manual intervention.
Realistic: Distributed system design is a widely adopted practice in large-scale and federated platforms. It enables predictable recovery from failures and reduces system downtime.
Timely: This capability should be incorporated during the design phase and continuously validated through operational testing and performance reviews.
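The measurable and achievable items above can be made concrete with a short sketch. It shows how an uptime target translates into a downtime budget, how redundancy changes the expected availability, and how a health check can drive failover without manual intervention. Everything here is an assumption for illustration: the function names are hypothetical, the redundancy formula assumes replicas fail independently (which real zones only approximate), and the failover logic is a simple priority-ordered active-passive selection rather than a description of any Simpl-Open mechanism.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def allowed_downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Downtime budget implied by an availability target over a period."""
    return (1.0 - availability) * period_hours


def combined_availability(replica_availability, replicas):
    """Availability of a service that stays up while at least one replica is up.

    Assumes replicas fail independently of each other.
    """
    return 1.0 - (1.0 - replica_availability) ** replicas


def select_active(replicas, is_healthy):
    """Active-passive failover: return the first healthy replica in
    priority order, or None if no replica passes its health check."""
    for replica in replicas:
        if is_healthy(replica):
            return replica
    return None


if __name__ == "__main__":
    # A 99.9% target leaves roughly 8.76 hours of downtime per year.
    print(round(allowed_downtime_hours(0.999), 2))

    # Two independent replicas at 99% each give roughly 99.99% combined.
    print(round(combined_availability(0.99, 2), 6))

    # Hypothetical health state: the primary in zone-a is down.
    health = {"primary@zone-a": False, "standby@zone-b": True}
    print(select_active(list(health), lambda name: health[name]))
```

In this example the request would be routed to "standby@zone-b", since the primary fails its health check. A production setup would typically delegate this decision to a load balancer or orchestrator rather than to application code.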

Detailed

Non-Functional Requirement

Issue ID: SIMPL-9940

Status: Proposed
