Architecture
Cassandra builds on so many components to realise its complex architectural theory that it is really difficult to go through all the bits and pieces without missing anything. The terms discussed here are those that provide an insight into the inner workings of this database. Cassandra's architecture is geared towards avoiding a single point of failure in the cluster, so that as much data as possible remains accessible if any part of the cluster fails. It uses techniques that resemble peer-to-peer networking to achieve a failure-tolerant data distribution model. Hence, no node in a Cassandra cluster can be termed a master of the others, and coordination among the nodes is achieved with the help of the Gossip protocol, which is used for failure detection, that is, to determine the availability of each node in the cluster. Gossip is managed by the Gossiper present on each node, which periodically initiates ‘Gossip’ communications with random nodes to check their availability. As a result, each node performs the same functions as the others, and there are no designated roles for a particular function.
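To make the idea of masterless failure detection more concrete, here is a minimal Python sketch of a gossip round. It is only an illustration, not Cassandra's Gossiper: the names `Node`, `gossip_round` and `suspects` are hypothetical, and a fixed round-count timeout is assumed in place of whatever detection logic the real implementation uses.

```python
import random


class Node:
    """Hypothetical cluster member that tracks heartbeats heard via gossip."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.round = 0
        # node name -> (highest heartbeat seen, local round when it last advanced)
        self.view = {name: (0, 0)}

    def tick(self):
        # Advance the local clock and bump this node's own heartbeat.
        self.round += 1
        heartbeat, _ = self.view[self.name]
        self.view[self.name] = (heartbeat + 1, self.round)

    def merge(self, other_view):
        # Keep whichever heartbeat is newer for every node mentioned by the peer.
        for name, (heartbeat, _) in other_view.items():
            known, _ = self.view.get(name, (-1, 0))
            if heartbeat > known:
                self.view[name] = (heartbeat, self.round)

    def suspects(self, timeout):
        # Assumption: a node is suspected once its heartbeat has not advanced
        # for more than `timeout` rounds (a simplification for this sketch).
        return sorted(
            name
            for name, (_, seen) in self.view.items()
            if name != self.name and self.round - seen > timeout
        )


def gossip_round(nodes, rng):
    live = [n for n in nodes if n.alive]
    for node in live:
        node.tick()
    for node in live:
        # Every node does the same job: pick a random peer and swap state.
        peer = rng.choice([n for n in nodes if n is not node])
        if peer.alive:  # a dead peer simply never answers
            snapshot = dict(node.view)
            node.merge(peer.view)
            peer.merge(snapshot)


rng = random.Random(42)
nodes = [Node(name) for name in "abcd"]
for _ in range(20):
    gossip_round(nodes, rng)

nodes[3].alive = False  # node "d" fails silently
for _ in range(60):
    gossip_round(nodes, rng)

for node in nodes[:3]:
    print(node.name, "suspects", node.suspects(timeout=5))
```

No node in the sketch has a special role. Each one learns about the failure of "d" only because its heartbeat stops advancing in the state exchanged with random peers.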
Each node in Cassandra is part of a ring, which is how the topology of a Cassandra cluster is represented. Each node in the cluster is assigned a token, and with it the part of the data for which it is responsible. The data assigned to each node is determined by the partitioner, which maps row keys to tokens and so orders them according to the partitioning strategy chosen. The default strategy is random partitioning (RandomPartitioner in older releases, Murmur3Partitioner from Cassandra 1.2 onwards), which works on the basis of consistent hashing to distribute row keys. Another partitioning strategy available is the Byte-Ordered Partitioner, which orders row keys according to their raw bytes.
AntiEntropy is then used to synchronise the replicas of the data to the newest version by periodically comparing checksums. Merkle trees are used in Cassandra to implement AntiEntropy, just as in Dynamo, but in a slightly different way. For more details, you could read the respective documentation. ‘Hinted handoff’ is used to restore consistency when a node that was unavailable comes back online.
The complete flow for a write operation involves components like the commit logs, memtables, SSTables, etc. The commit logs are the first failover protection mechanism: the operation is written there first, so that the written data can be recovered in case of a failure. The memtables then act as an in-memory database, where all the data is updated until it is flushed to disk in the form of SSTables. Compaction is then performed periodically to merge the data from multiple SSTables into a single file.
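The write flow described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's storage engine: the class `TinyStore`, its methods and the `flush_threshold` parameter are hypothetical, everything lives in memory, and clearing the commit log on flush is a simplification.

```python
class TinyStore:
    """Toy model of the write path: commit log -> memtable -> SSTables."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # append-only record of unflushed writes
        self.memtable = {}     # in-memory view of recent writes
        self.sstables = []     # immutable sorted tables, oldest first
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # Log first, so the write can be replayed after a crash.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memtable as a sorted, immutable table.
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # assumption: flushed writes need no replay

    def recover(self):
        # Rebuild the memtable from the commit log after a simulated crash.
        self.memtable = dict(self.commit_log)

    def compact(self):
        # Merge all tables into one; newer tables overwrite older values.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest table wins
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


store = TinyStore(flush_threshold=2)
store.write("user:1", "alice")
store.write("user:2", "bob")      # memtable is full, first SSTable written
store.write("user:1", "alicia")
store.write("user:3", "carol")    # second SSTable written
store.write("user:4", "dave")     # still only in the commit log and memtable

store.memtable = {}               # simulate losing memory in a crash
store.recover()                   # replay the commit log

store.compact()
print(len(store.sstables))        # 1
print(store.read("user:1"))       # alicia
print(store.read("user:4"))       # dave
```

The usage at the end shows why each stage exists: the write that never reached an SSTable survives the simulated crash through the commit log, and compaction leaves a single table holding only the latest value for each key.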