PC Pro

Processing Big Data by the petabyte

How do you cope with the astonishing amount of data generated by IoT devices and our data-rich lifestyles? Fasthosts explains all


In an increasingly interconnected world, where smartphones talk to doorbells and fridges have their own cameras, devices are collecting data like never before. But where is it all stored, and how is it processed?

New smart technology and Internet of Things (IoT) devices are constantly being developed. Whether it’s automated thermostats or wearable tech, the collection, processing and analysis of large amounts of data is what lets application and program developers provide IoT services to consumers. This data comes in previously unimaginable quantities, hence the – perhaps understated – name “Big Data”.

For example, a smart thermostat application could be regularly collecting petabytes (15 zeroes, or a million gigabytes) of data on temperature, humidity, time of day and so on. A smartwatch could collect similar amounts of data on things such as heart rate, location and distance travelled. A smart fridge takes photos of its contents and sends them over the internet, to be checked by a user wondering if they’ve run out of milk.
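To make the thermostat example concrete, here is a minimal sketch of the kind of telemetry such a device might upload: a single reading sent as JSON over HTTPS. The device ID, field names and collection endpoint are invented for illustration; real products use their own formats, and often a protocol such as MQTT rather than plain HTTP.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Instant;

public class ThermostatTelemetry {
    public static void main(String[] args) throws Exception {
        // One reading; a real device sends these around the clock, and a
        // fleet of millions of devices is how the petabytes accumulate.
        String reading = String.format(
            "{\"deviceId\":\"thermo-42\",\"timestamp\":\"%s\"," +
            "\"temperatureC\":21.5,\"humidityPct\":48}",
            Instant.now());

        // collect.example.com is a hypothetical collection endpoint
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://collect.example.com/readings"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(reading))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println("Collector responded: " + response.statusCode());
    }
}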

All of these IoT devices depend on data – and lots of it. But storing masses of data, and performing calculations and analysis on it, puts a serious drain on server I/O and demands huge amounts of processing resources.

The most common solution to the problem of storing and processing data in bulk is Apache Hadoop. Hadoop combines Google’s MapReduce programming model with its own Hadoop Distributed File System (HDFS). By splitting data up into blocks that are distributed over multiple nodes in a cluster of machines, the processing of Big Data is made more efficient: instead of processing and analysing one large dataset in a single pass, Hadoop processes multiple smaller blocks in parallel, significantly speeding up processing times.
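To see what this looks like in code, below is a lightly commented version of the classic word-count job from the Apache Hadoop MapReduce tutorial – a minimal sketch rather than production code. The map step runs in parallel against each HDFS block, emitting (word, 1) pairs; the reduce step sums those partial counts into a total per word.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: runs in parallel on each HDFS block, emitting (word, 1)
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts for each word across all mappers
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class); // pre-aggregate per node
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}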

Bare metal servers

Big Data processing has a unique set of server requirements. Heavy, resource-intensive processing needs the performance capabilities of physical hardware, but choosing that route means you miss out on the scalability and flexibility offered by virtual servers.

With Big Data processing, the need for resources is short-term and sporadic. There might be long periods without any need for computational or analytical processing, but as soon as the processing begins, the server needs to be able to handle a huge demand on resources. Because high-performance resources matter so much, the traditional choice for Big Data processing has been to run Hadoop on dedicated hardware. But this often results in over-provisioning – and, inevitably, overpayment – for resources.

Bare metal servers offer a perfect middle-ground solution to the Big Data processing conundrum. By combining the best bits of both dedicated hardware and virtualised machines, bare metal servers provide a flexible and powerful Big Data processing solution.

Bare metal servers are designed to deal with significant but short-term processing needs. Data can be stored, processed or analysed on a server for as long as necessary, and then the server can be spun back down again. This way, resources are never over-provisioned, and there’s no need to keep paying for a server that’s no longer in use.
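That spin-up, process, spin-down lifecycle is typically driven through a provisioning API. The sketch below shows the general shape of such a workflow; the endpoint, JSON fields and server ID are all invented for illustration, and are not Fasthosts’ actual interface.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class EphemeralServer {
    private static final HttpClient client = HttpClient.newHttpClient();
    // api.example.com is a hypothetical provisioning endpoint
    private static final String API = "https://api.example.com/v1/servers";

    public static void main(String[] args) throws Exception {
        // 1. Spin up a bare metal server only when a processing run begins
        HttpResponse<String> created = client.send(
            HttpRequest.newBuilder()
                .uri(URI.create(API))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                    "{\"type\":\"bare-metal\",\"cores\":32,\"ramGb\":128}"))
                .build(),
            HttpResponse.BodyHandlers.ofString());
        System.out.println("Provisioned: " + created.body());

        // 2. ...run the Hadoop job while the hardware exists...

        // 3. Spin the server back down so billing stops with the workload
        client.send(
            HttpRequest.newBuilder()
                .uri(URI.create(API + "/srv-123")) // hypothetical server ID
                .DELETE()
                .build(),
            HttpResponse.BodyHandlers.ofString());
    }
}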

In a cloud server infrastructure, there could be dozens of virtual machines running on the same physical server, each with its own processing requirements and each contending for the same resources. Bare metal servers, however, are single-tenant: the resources on each server are dedicated to one user. There’s no resource contention or “noisy neighbour syndrome” to worry about, and performance will never be degraded by other VMs running on the same hardware.

Bare metal servers can also be used as part of a network of virtual machines, allowing full flexibility of server architecture. On a virtual machine, the hypervisor is itself a drain on resources and often degrades performance. Bare metal servers have no hypervisor layer, so that virtualisation overhead is eliminated and no performance is sacrificed.

A bare metal server is fundamentally a dedicated server, insofar as it offers high-performance resources that are dedicated to one user, but it comes with the advantage of flexible, pay-as-you-use billing with no contracts.

The combination of powerful resources and flexible, no-contract billing makes bare metal servers the go-to solution for Big Data processing, where resource demands are intensive but temporary.

At Fasthosts, we offer a range of bare metal servers to suit all types of processing demands.


