OpenSource For You

RAMCloud: The Future of Storage Systems

In RAMCloud, data is stored in the DRAM of thousands of computers in a data centre. RAMCloud offers quick and reliable recovery even though terabytes of data may be stored in the system.

- By: Sakshi Bansal. The author is in her fourth year of the Computer Science and Engineering bachelor’s degree at Amrita Vishwa Vidyapeetham, Amritapuri. She is a FOSS enthusiast and an active member of the Amrita FOSS club, having made contributions to var

Today, the amount of data generated on the Internet is enormous. An application like Facebook needs to deal with terabytes or petabytes of data without compromising its performance. RAMCloud is a next-generation storage system that can deliver high performance with the help of just commodity hardware, even at current levels of storage complexity. It stores data entirely in DRAM (main memory), and the disk takes the role of backup or archival storage. Since the data always resides in main memory, RAMCloud avoids the access latency that is usually incurred in a disk-based storage system and, hence, provides high throughput, which is the key to better performance.

The current scenario

For the past four decades, we have seen rapid growth in computer hardware technologies, which has helped improve the efficiency of storage systems. The processor, memory and disk play an important role in the performance of a storage system. Currently, an imbalance in the performance of any one of these components can impact the whole system. Though there has been tremendous improvement in the performance of both memory and processor, the disk has not been able to keep pace. Disk capacity has increased by more than a thousand times, but the transfer rate for large blocks has improved only fifty-fold, while seek time and rotational latency have only improved two-fold.
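To see why this imbalance matters, the growth figures quoted above can be turned into a rough calculation. The sketch below is only illustrative: the function names are made up for this example, and the second function assumes that the block size stays fixed, so that the number of small blocks grows in step with capacity.

```python
# Growth factors quoted in the paragraph above (approximate).
CAPACITY_GROWTH = 1000   # disk capacity: "more than a thousand times"
TRANSFER_GROWTH = 50     # large-block transfer rate: "fifty-fold"
SEEK_GROWTH = 2          # seek time and rotational latency: "two-fold"


def full_scan_slowdown(capacity_growth: float, transfer_growth: float) -> float:
    """How many times longer it now takes to read an entire disk sequentially."""
    return capacity_growth / transfer_growth


def random_scan_slowdown(capacity_growth: float, seek_growth: float) -> float:
    """How many times longer it takes to touch every small block at random.

    Assumption: block size is fixed, so the number of blocks grows with
    capacity and each access is dominated by seek and rotational latency.
    """
    return capacity_growth / seek_growth


sequential = full_scan_slowdown(CAPACITY_GROWTH, TRANSFER_GROWTH)   # 20.0
scattered = random_scan_slowdown(CAPACITY_GROWTH, SEEK_GROWTH)      # 500.0
```

Under these assumptions, reading a whole disk sequentially takes about 20 times longer than it used to, and visiting every small block at random takes about 500 times longer, which is why small random accesses are the real bottleneck.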

Large applications like Facebook and Amazon must make multiple accesses to storage servers to generate a single page. At such high access rates, disk latency reduces the performance of these applications. Applications use caches to overcome disk latency, but a cache must have an exceptionally high hit rate to provide a significant performance improvement. Even a 1 per cent cache miss rate can severely affect the system’s performance, which is not acceptable for some applications.
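The effect of a small miss rate is easy to underestimate, so here is a minimal sketch of the arithmetic. The two latency values are assumptions chosen only for illustration (a 10 microsecond cache hit and a 10 millisecond disk access); they do not come from measurements of any particular system.

```python
def effective_latency_us(hit_rate: float,
                         cache_latency_us: float,
                         disk_latency_us: float) -> float:
    """Average access time for a cache placed in front of a disk."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache_latency_us + miss_rate * disk_latency_us


CACHE_US = 10.0       # assumed latency of a cache hit, in microseconds
DISK_US = 10_000.0    # assumed latency of a disk access (10 ms), in microseconds

average_us = effective_latency_us(0.99, CACHE_US, DISK_US)
slowdown = average_us / CACHE_US   # how much slower than a perfect cache
```

With these assumed numbers, a 99 per cent hit rate gives an average access time of roughly 110 microseconds, about 11 times slower than if every access had hit the cache. The 1 per cent of requests that go to disk account for almost all of the average.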

Flash memory is another storage technology that offers lower latency than disk. But Flash devices are I/O devices, so apart from the access latency, they incur the additional latencies of device drivers and interrupt handlers. These shortcomings of current storage systems call for a new, improved storage approach.

An overview of RAMCloud

RAMCloud is a storage system that stores data in the DRAM of thousands of servers within a data centre, as shown in Figure 1.

Since the information is kept in DRAM at all times, the access latency is very low: RAMCloud provides 100-1000x lower latency than disk-based systems and 100-1000x greater throughput. Most Web applications grow over a period of time and will require more servers to store their data. RAMCloud will scale automatically to support the growing number of servers added to the system.

RAMCloud uses DRAM, which is volatile memory, i.e., the data is lost when the power is removed. However, applications require storage systems to provide a high level of data durability and availability. RAMCloud uses a technique called buffered logging to maintain durability. In this approach, a single copy of each data object is stored in the DRAM of a primary server, and copies are stored on the disks of two or more backup servers; each server acts as both a primary and a backup. When a write operation is performed, the primary server updates its DRAM and forwards log entries to the backup servers, where they are stored temporarily in the DRAM of the backup server. The backup server collects log entries into batches that can be written efficiently to a log on disk. Once log entries have been written to disk, they can be removed from the backup’s DRAM.

Buffered logging allows both reads and writes to proceed at DRAM speeds while still providing durability. Power failures can be handled by committing each write operation to stable storage.
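The buffered logging flow described above can be sketched in a few lines. This is a toy, single-process model and not RAMCloud’s actual implementation: the class names `Primary` and `Backup`, the `batch_size` parameter and the use of a Python list as a stand-in for the on-disk log are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Backup server: buffers log entries in DRAM and writes them to disk in batches."""
    batch_size: int = 3                              # hypothetical batch size
    buffer: list = field(default_factory=list)       # entries held in the backup's DRAM
    disk_log: list = field(default_factory=list)     # stand-in for the log on disk

    def append(self, entry: tuple) -> None:
        self.buffer.append(entry)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # One large sequential write, after which the DRAM buffer can be emptied.
        self.disk_log.extend(self.buffer)
        self.buffer.clear()


@dataclass
class Primary:
    """Primary server: keeps the single DRAM copy and forwards log entries."""
    backups: list
    dram: dict = field(default_factory=dict)

    def write(self, key: str, value: int) -> None:
        self.dram[key] = value                # update the primary's DRAM
        for backup in self.backups:
            backup.append((key, value))       # forward the log entry to each backup

    def read(self, key: str) -> int:
        return self.dram[key]                 # reads never touch a disk


backups = [Backup(), Backup()]
primary = Primary(backups)
for i in range(4):
    primary.write(f"k{i}", i)
```

After the four writes above, each backup has flushed one batch of three entries to its log and still holds the fourth entry in its buffer. That last entry is exactly the data that would be at risk in a power failure unless it is committed to stable storage.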

The data model

The data model for a storage system governs how data is collected, stored, integrated and put to use. There are three main factors that we need to decide on, prior to selecting the type of data model:

1. The nature of the basic objects stored in the system.

2. How basic objects are organised into higher-level structures; for example, we can either just have key-value pairs or some sort of aggregation.

3. The methods for naming and indexing objects when retrieving or modifying them.

The two common types of data models are the highly structured relational data model and the unstructured data model. RAMCloud prefers an intermediate approach, where servers do not impose structure on data but do support aggregation and indexing. It supports any number of tables; each table stores multiple objects, and these objects are stored as simple key-value pairs. It also provides a simple set of operations for creating and deleting tables, and for reading, writing and deleting objects within a table.
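To make the table and key-value model concrete, here is a minimal in-memory sketch of the operations listed above. The class `TableStore` and its method names are hypothetical and chosen for this example only; they are not RAMCloud’s real client API, and the sketch ignores distribution, durability and indexing.

```python
class TableStore:
    """Toy model: any number of named tables, each holding key-value objects."""

    def __init__(self) -> None:
        self._tables: dict[str, dict[str, bytes]] = {}

    def create_table(self, name: str) -> None:
        if name in self._tables:
            raise ValueError(f"table {name!r} already exists")
        self._tables[name] = {}

    def delete_table(self, name: str) -> None:
        del self._tables[name]          # raises KeyError if the table is missing

    def write(self, table: str, key: str, value: bytes) -> None:
        # The store does not interpret the value; it is an opaque blob.
        self._tables[table][key] = value

    def read(self, table: str, key: str) -> bytes:
        return self._tables[table][key]

    def delete(self, table: str, key: str) -> None:
        del self._tables[table][key]


store = TableStore()
store.create_table("users")
store.write("users", "alice", b'{"city": "Kollam"}')
profile = store.read("users", "alice")
store.delete("users", "alice")
```

Note that the value is stored as uninterpreted bytes. This mirrors the intermediate approach described above, in which the servers do not impose structure on the data.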

Research challenges

Numerous challenges need to be addressed for RAMCloud to be implemented successfully. Given below is a short description of the challenges that researchers are trying to solve.

Consider applications that use the TCP/IP protocol: they incur high round-trip times for remote procedure calls and high latency in network switches. Also, the flow-oriented nature of TCP is of little use to RAMCloud, since individual requests will be relatively small. To improve overall latency, we can either modify TCP or replace it with UDP.

An increasing number of applications are using virtualisation for greater portability. This increases the overheads, since an incoming packet will now need to pass through the virtual machine monitor and a guest operating system before reaching the application, thereby increasing the overall latency. So we need techniques such as passing packets directly from the virtual machine monitor to the application to reduce this overhead.

RAMCloud is implemented using a large number of servers, but the applications that use them must see a single storage system, i.e., the application must be oblivious to the distribution of the storage system. The primary issue in the distribution and scaling of the system is data placement. An object may need to be moved to another server to improve performance. This data movement needs to happen automatically and in real time.

A single RAMCloud system can be used to support multiple applications of varying sizes. It should provide a security mechanism to support mutually hostile applications. Also, one application with a very high workload must not degrade the performance of other applications.

Finally, RAMClouds must manage themselves automatically. With thousands of servers, each communicating with hundreds of peers, the overall system is too complicated to be managed by humans.

Why use RAMCloud?

We believe there are two main motivations for using RAMCloud.

Application scalability

Most Web applications use relational databases to store their data. As the application grows, it becomes difficult to store all of the data in a single relational database. Applications then use other techniques to manage their data. A popular ad hoc technique is to partition the data among multiple databases. As the application grows larger, maintaining consistency among multiple databases becomes increasingly difficult and requires ever more complex techniques. Another storage technique is Bigtable, which is built on top of the Google File System. Because of the distributed nature of a Bigtable database, performing certain database operations, like a join between two tables, would be terribly inefficient. On the other hand, RAMCloud will automatically scale to support the increasing number of storage servers used by an application.

The technology

The disk is used as the primary storage system for Web applications. Accessing large blocks at a time from a disk can be efficient compared to accessing small blocks. However, most forms of online data, such as images and songs, do not comprise large blocks, so the latency for accessing the smaller, more frequently requested blocks is high. Large Web applications need to make multiple internal requests to generate a single HTML page, so we need to consider the cumulative latency of all the requests when considering the overall response time to users. One of the major advantages of RAMCloud over the disk-oriented approach is that it can dramatically reduce the access latency of a request and, thereby, reduce the overall response time. RAMCloud also supports a new class of data-intensive applications, which process data in large volumes, typically in terabytes.
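A back-of-the-envelope sketch shows how per-request latency adds up for a single page. All the numbers here are assumptions for illustration: 100 internal requests per page, 10 ms per disk access, and a DRAM-based access that is 1000x faster (the upper end of the 100-1000x range quoted earlier). The sketch also assumes that the requests are issued one after another rather than in parallel.

```python
def page_latency_ms(num_requests: int, per_request_latency_ms: float) -> float:
    """Total storage latency for one page when requests are issued sequentially."""
    return num_requests * per_request_latency_ms


REQUESTS_PER_PAGE = 100   # assumed number of internal requests per HTML page
DISK_MS = 10.0            # assumed latency of one disk-based access
DRAM_MS = DISK_MS / 1000  # assumed: 1000x lower latency for a DRAM-based access

disk_total_ms = page_latency_ms(REQUESTS_PER_PAGE, DISK_MS)
dram_total_ms = page_latency_ms(REQUESTS_PER_PAGE, DRAM_MS)
```

Under these assumptions, the disk-based page spends about a full second waiting on storage, while the DRAM-based page spends about a millisecond. Real applications overlap some requests, so the gap in practice depends on how many of them must run one after another.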

The pros and cons

Here are some of the pros and cons of using RAMCloud.

The pros

1. Since all the information is stored in DRAM, RAMCloud provides high throughput.

2. RAMCloud automatically scales to support a large number of storage servers and eliminates the scalability issues in applications.

3. It provides a high level of data durability and availability.

4. The cost of storing data on DRAM today is the same as storing data on a disk ten years ago.

5. RAMCloud uses a log-structured approach, similar to a log-structured file system, for all its data in DRAM as well as on disk. This provides fast crash recovery.

6. RAMClouds are 100-1000x more efficient than disk-based systems in terms of cost per operation or energy per operation.

The cons

1. It involves a higher cost per bit and a higher energy use per bit, so RAMCloud storage will be 50-100x more expensive than a pure disk-based system.

2. Maintaining consistency for applications that require replication across data centres is very difficult.

With the growth of large-scale Web applications, there has been a need for alternatives to disk-based storage technologies. Both Google and Yahoo store their search indices entirely in DRAM. The Bigtable storage system allows entire column families to be loaded into memory, where they can be read without any disk accesses.

We believe that RAMCloud is a long-term solution for the storage needs of Web applications. RAMCloud provides durability and very low latency. Hence, it enables richer query models and is attractive for technologies like cloud computing. It is able to aggregate the resources of a large number of commodity servers. However, a lot of research needs to be done and numerous challenges must be overcome in order to use this technology.

References

[1] https://ramcloud.stanford.edu

[2] http://www.stanford.edu/~ouster/cgi-bin/papers/ramcloud.pdf

Acknowledgements

I would like to thank Dr John Ousterhout, professor of Computer Science at Stanford University. He is the lead at the RAMCloud project at Stanford University. I would also like to thank my mentor and all the people who helped me to review this article.

Figure 1: RAMCloud architecture

Figure 2: Buffered logging
