Erasure Coding Scheme for Data Protection and Recovery

Describes the erasure coding (EC) schemes for data protection and recovery.

Erasure coding (EC) is a data protection method in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations or storage media.

EC ensures that if data becomes corrupted, it can be reconstructed using information about the data that is present elsewhere.

The time required to reconstruct data depends on the number of data fragments in the chosen EC scheme, and the number of failures that have occurred. For example, reconstruction of EC scheme 10+2 takes longer compared to the reconstruction of EC scheme 3+2, as a larger number of data blocks must be read.

Considerations When Selecting an EC scheme

As an administrator, consider the following points when selecting an EC scheme:

  • How many nodes can you afford to have?
  • How long are you willing to wait for a node to be rebuilt?
  • How many failures might occur? Do you anticipate a single failure, or multiple failures?

EC Schemes

In an erasure coded volume, an erasure coding scheme has the stripe layout m+n. The stripe is an array of m data fragments and n parity fragments with a fragment size of 4 MB.

Each stripe is created by the same number of data fragments from all containers in the group of EC containers. Each container is placed on a different physical node.

  • m is the number of data fragments.
  • n is the number of redundant fragments (referred to as parity fragments).
  • The parity is calculated using data from all data fragments.
  • m/(m+n) is the encoding rate.
  • m+n is the number of encoded fragments.
  • You need to read a minimum of m blocks to recover data.
  • You can recover data from a maximum of n failures.

For example, assume m=4, n=2, and stripe depth=4 MB.

  • The number of data fragments is four (4), and while the number of parity fragments is two (2).
  • The number of encoded fragments is six (6).
  • The stripe size is 16 MB (4x4 MB) of user data, and 8 MB (2x4 MB) of parity fragments.
  • The system can handle two (2) failures, and any chunk can be recovered from four (4) other chunks.
Requirements for using an erasure coding scheme
  • The number of data fragments must be 3 to 10.
  • The number of parity fragments must be greater than 1 and less than the number of data fragments.
  • The number of data fragments must be greater than the number of parity fragments.
  • The number of nodes must be greater than or equal to the sum of data and parity fragments.

Select from the following schemes for erasure-coded volumes:

EC Scheme Number of Data Nodes Number of Parity Nodes Total Number of Nodes Needed Number of Failures Recoverable Number of Nodes to Read to Recover Data
3+2 3, 2 3 2 5 2 3
4+2 4, 2 4 2 6 2 4
5+2 5, 2 5 2 7 2 5
6+3 6, 3 6 3 9 3 6
10+<x> where x is a value from 1 to 9 N.A 10 x 10+x x 10

Although you can create a volume without the required number of nodes for a specific scheme, volume offload fails if the required number of nodes are not present.

When choosing the scheme, note that more nodes leads to longer recovery time, resulting in degraded performance, network expense, and lengthy time to rebuild.
Important: The recommended number of nodes required for erasure coding is M+2N (rather than M+N) to ensure MapR self-healing and proper operation after N failures. N failures with only M+N nodes allows you to continue reading the data, but manual intervention is required to protect the data from further failures. Currently, data will not be erasure coded if only M nodes are available. With M+2N nodes, N failures will self-heal with no operator intervention.