Recommendations on Failure Domains

  • To give the Parallels Cloud Storage allocator and rebalancing mechanisms enough flexibility, always configure at least 5 failure domains (hosts, racks, etc.) in a production setup. Reserve enough free disk space in each failure domain so that if one domain fails, its data can be recovered to the healthy ones.
  • When MDS services are created, the topology and failure domains must be taken into account manually. That is, in multi-rack setups, metadata servers should be created in different racks (5 MDSes in total).
  • At least 3 replicas are recommended for multi-rack setups.
  • Huge failure domains are more sensitive to total disk space imbalance. For example, if the failure domains are 5 racks with 10 TB, 20 TB, 30 TB, 100 TB, and 100 TB of total disk space, it will not be possible to allocate (10+20+30+100+100)/3 ≈ 86.7 TB of data in 3 replicas. Instead, only 60 TB will be allocatable: the low-capacity racks will be exhausted sooner, after which fewer than 3 domains will be available for data allocation, even though the largest racks (the 100 TB ones) will still have free space.
  • If a huge domain fails and goes offline, Parallels Cloud Storage will not perform data recovery by default, because replicating a huge amount of data may take longer than repairing the domain. This behavior is managed by the global parameter mds.wd.max_offline_cs_hosts (configured with pstorage-config), which controls how many failed hosts are still considered a normal disaster worth recovering from in automatic mode.
  • Failure domains should be similar in terms of I/O performance to avoid imbalance. For example, avoid setups in which failure-domain is set to rack, all racks but one have 10 Nodes each, and one rack has only 1 Node. Parallels Cloud Storage will have to repeatedly save a replica to this single Node, reducing overall performance.
  • Depending on the global parameter mds.alloc.strict_failure_domain (configured with pstorage-config), the domain policy can be strict (default) or advisory. Changing this parameter is strongly discouraged unless you are absolutely sure of what you are doing.
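
The disk space imbalance example above (5 racks, 3 replicas, 60 TB allocatable instead of about 86.7 TB) can be reproduced with a short calculation. The sketch below is an idealized estimate, not the actual Parallels Cloud Storage allocator: it assumes that each replica of a chunk must land in a different failure domain and that data can otherwise be spread freely. The function name usable_capacity and its parameters are hypothetical and chosen for illustration only.

```python
def usable_capacity(domain_capacities, replicas):
    """Idealized upper bound on the amount of logical data that fits when
    every chunk keeps `replicas` copies, each in a different failure domain.

    Hypothetical helper for capacity planning; it does not model the real
    allocator, reserved space, or rebalancing.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    caps = sorted(domain_capacities, reverse=True)
    if len(caps) < replicas:
        return 0.0  # not enough failure domains to place all replicas
    for k in range(replicas):
        # Assume the k largest domains each hold one replica of every chunk.
        # The remaining (replicas - k) copies share the other domains.
        limit = sum(caps[k:]) / (replicas - k)
        if caps[k] <= limit:
            # The next-largest domain can be filled completely, so the
            # remaining domains are the only constraint.
            return float(limit)
    return 0.0  # not reached for non-negative capacities


racks_tb = [10, 20, 30, 100, 100]
naive = sum(racks_tb) / 3
usable = usable_capacity(racks_tb, 3)
print(f"naive: {naive:.1f} TB, usable: {usable:.1f} TB")
# prints "naive: 86.7 TB, usable: 60.0 TB"
```

With evenly sized domains the two numbers coincide, which illustrates why failure domains of similar capacity waste less space.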