OpenSource For You

Open Source Databases that Work Best for IoT

The Internet of Things (IoT) generates vast amounts of data, including streaming data, time series data, RFID data, sensory data, etc. The efficient management of this data demands the use of a database. The very nature of IoT data requires a different ty


The Internet of Things (IoT) can be regarded as a network in which various things are connected to each other through a common platform. Just visualise a scenario in which every device at home and in the workplace is connected: the air conditioning in a room automatically lowers its temperature when the outside temperature rises, the number of people at any public gathering is easily known, and one’s health parameters can be monitored daily. This is the potential impact of the Internet of Things.

The current state of the Internet of Things is very fragmented. Different companies and organisations are building their own platforms, either for their customers or for their own needs. A common platform on which all devices, irrespective of manufacturer, can connect to each other through a user-friendly interface is still missing.

IoT devices are estimated to number in the trillions in the coming five years.

Is a database necessary for IoT?

The Internet of Things poses many difficult challenges for database management systems, such as integrating large volumes of data in real time, processing events as they stream in, and securing the data. For instance, IoT-based traffic sensors deployed in smart cities would produce huge amounts of traffic data in real time.

Databases have a very important role to play in handling IoT data adequately. Therefore, along with a proper platform, the right database is equally important. As IoT operates across diverse environments around the world, choosing an adequate database is challenging.

The factors that should be considered before choosing a database for IoT applications are:

1) Size, scale and indexing
2) Effectiveness while handling a huge amount of data
3) User-friendly schema
4) Portability
5) Query languages
6) Process modelling and transactions
7) Heterogeneity and integration
8) Time series aggregation
9) Archiving
10) Security and cost

The types of data in the Internet of Things are:

1) RFID: Radio frequency identification
2) Addresses/unique identifiers
3) Descriptive data for processes, systems and objects
4) Pervasive environmental data and positional data
5) Sensor data: Multi-dimensional time series data
6) Historical data
7) Physics models: Models that are templates for reality
8) State of actuators and command data for control

Databases suited for the Internet of Things

InfluxDB: First released in 2013, InfluxDB is one of the more recent databases. It is written in the Go programming language, and its early versions were built on LevelDB, a key-value store. InfluxDB is a time series database, optimised for storing and querying time series data. Time series databases are not new (Kdb dates back to around 2000), but InfluxDB became popular with the rise of the Internet of Things, which gave momentum to NoSQL and NewSQL databases and to ever-growing volumes of data.

The advantages of using InfluxDB for IoT data include:

1) Allows indexing of series
2) It has an SQL-like query language
3) It also provides built-in linear interpolation for missing data
4) It supports automatic data downsampling
5) Supports continuous queries to compute aggregates
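
To make points 3 and 4 concrete, here is a minimal sketch in plain Python of what downsampling and linear interpolation do to a series. It is not InfluxDB's API or query language; the function names and the sample readings are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


def interpolate(points, step):
    """Fill gaps in a regularly spaced series by linear interpolation."""
    points = sorted(points)
    filled = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        filled.append((t0, v0))
        t = t0 + step
        while t < t1:
            # Weight the two neighbouring values by distance in time.
            filled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
            t += step
    if points:
        filled.append(points[-1])
    return filled


# Hypothetical temperature readings: (seconds, degrees Celsius).
# The readings at 30 s and 40 s are missing.
readings = [(0, 20.0), (10, 21.0), (20, 22.0), (50, 25.0)]

print(downsample(readings, 30))
print(interpolate(readings, 10))
```

A time series database performs this kind of work inside the storage engine, typically on a schedule, so that applications can query compact aggregates instead of raw points.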

CrateDB: CrateDB is a distributed SQL database management system. It is open source and written in Java, and includes components from Facebook Presto, Apache Lucene, Elasticsearch and Netty; it is designed for high scalability. CrateDB was made for putting IoT data to work, and it targets new IoT solutions ranging from the industrial Internet and connected cars to wearables.

The advantages of using CrateDB for IoT data include:

1) Millions of data points per second: Fast, linearly scalable data ingestion
2) Real-time queries: Columnar indices and field caches provide in-memory SQL performance
3) Dynamic schema: Add and query new sensor data structures on the fly
4) IoT analytics: Fast, robust time series, AI, geospatial, text search, joins, aggregations
5) Always on: Built-in data replication and cluster rebalancing ensure non-stop performance
6) ANSI SQL: No lock-in, and easy for any developer to use and integrate
7) Built-in MQTT broker: Direct device-to-database integration
8) IoT ecosystem: Works with Kafka, Grafana, Node-RED, and other popular IoT stack software
9) Runs anywhere for efficient processing at the edge or in the cloud

MongoDB: MongoDB is a free and open source, cross-platform, document-oriented database program, categorised as a NoSQL database. It stores JSON-like documents with optional schemas. Organisations prefer it for IoT because it lets them store data from any context, analyse it in real time, and change the schema as they go along.

The advantages of using MongoDB for IoT data include:

1) A powerful database
2) Document-oriented
3) Suitable for general-purpose use
4) Being a NoSQL database, it uses JSON-like documents with optional schemas
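
The idea of changing the schema as you go along can be sketched with plain Python dictionaries standing in for JSON-like documents. This does not use MongoDB or its drivers; the `find` helper, the device names and the field names are all hypothetical.

```python
import json


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Hypothetical readings from two device types; the fields differ per document.
readings = [
    {"device": "thermostat-1", "type": "temperature", "celsius": 21.5},
    {"device": "door-7", "type": "contact", "open": True},
    # A newer firmware adds a nested field without any schema migration.
    {"device": "thermostat-1", "type": "temperature", "celsius": 22.0,
     "battery": {"percent": 87}},
]

temperature_docs = find(readings, type="temperature")
print(json.dumps(temperature_docs[-1], indent=2))
```

Documents with different shapes sit side by side in the same collection, and a query simply ignores the fields it does not ask about. That is the property that suits fleets of devices whose payloads evolve over time.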

RethinkDB: RethinkDB features prominently among open source databases. It is a scalable JSON database built from the ground up for the real-time Web. RethinkDB inverts the traditional database architecture with a new access model: when the developer asks it to, it continuously pushes updated query results to applications in real time, instead of waiting to be polled. The developers call this feature changefeeds. Changefeeds allow RethinkDB to serve as a database, a real-time repository of system state and a message broker. Its real-time push architecture dramatically reduces the time and effort necessary to build scalable real-time apps.

The advantages of using RethinkDB for IoT sensor data include:

1) Has a flexible query language and monitoring APIs, and is very easy to set up and learn.
2) If a primary server fails, commands are automatically shifted to a new server.
3) Nodes can be added in plug-and-play fashion in real time, without any downtime.
4) Offers asynchronous queries via EventMachine in Ruby and Tornado in Python, giving applications an asynchronous programming interface.
5) Offers SSL access, so that RethinkDB can be reached securely over the public Internet.
6) Offers mathematical operators such as floor, ceil and round.
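
The push model behind changefeeds can be illustrated with a tiny in-memory table that notifies subscribers on every write. This is a sketch of the idea only, not the RethinkDB driver; the `Table` class and its methods are hypothetical, although the `old_val` and `new_val` keys follow the shape of the change documents RethinkDB emits.

```python
class Table:
    """A tiny in-memory table that pushes every change to subscribers."""

    def __init__(self):
        self._rows = {}
        self._subscribers = []

    def changes(self, callback):
        """Register a callback that receives one change document per write."""
        self._subscribers.append(callback)

    def upsert(self, key, row):
        """Insert or replace a row, then push the change to all subscribers."""
        old = self._rows.get(key)
        self._rows[key] = row
        for notify in self._subscribers:
            notify({"old_val": old, "new_val": row})


received = []
sensors = Table()
sensors.changes(received.append)  # the application subscribes once

sensors.upsert("s1", {"id": "s1", "temp": 20})
sensors.upsert("s1", {"id": "s1", "temp": 23})

for change in received:
    print(change)
```

The application never polls: it registers interest once and is handed each change as it happens, which is why a changefeed can double as a lightweight message broker.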

SQLite: The SQLite database engine is an in-process library that provides a self-contained, serverless, transactional SQL database engine. It has had a major impact on game and mobile application development due to its portability and small footprint.

SQLite works well on devices that run without human support, as the database requires no administration. It is a good fit for cell phones, set-top boxes, televisions, game consoles, cameras, watches, kitchen appliances, thermostats, automobiles, machine tools, aeroplanes, remote sensors, drones, medical devices and robots, as well as for IoT.

Client/server database engines are designed to live inside a data centre at the core of the network. SQLite works there too, but SQLite also thrives at the edge of the network, fending for itself while providing fast and reliable data services to applications that would otherwise have dodgy connectivity.

The advantages of using SQLite for IoT data include:

1) Offers a small memory footprint

2) It is reliable

3) No setting up required prior to use

4) Has no dependencies
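
Because SQLite ships inside Python's standard library as the `sqlite3` module, the "no setting up" claim is easy to try out. The sketch below stores a few readings and averages them per sensor; the table name, column names and values are assumptions chosen for illustration.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, ts INTEGER, value REAL)")

# Hypothetical sensor readings: (sensor name, Unix timestamp, value).
rows = [
    ("temp-1", 1700000000, 20.5),
    ("temp-1", 1700000060, 21.5),
    ("temp-2", 1700000000, 18.0),
]
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

averages = conn.execute(
    "SELECT sensor, AVG(value) FROM readings GROUP BY sensor ORDER BY sensor"
).fetchall()
print(averages)
```

No server process is started and nothing is installed: the whole database lives inside the application, which is what makes SQLite practical on an unattended edge device.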

Apache Cassandra: Apache Cassandra is a free and open source distributed NoSQL database management system, initially released in 2008. It is designed to handle huge amounts of data across many commodity servers, providing high availability with no single point of failure.

In IoT, data is generated, tracked and shared across a variety of networks on an immense scale, due to the massive number of connected devices. Cassandra excels at handling large volumes of time series data coming directly from devices, users, sensors and similar sources spread across diverse geographic locations.

The advantages of using Apache Cassandra for IoT data include:

1) Fault tolerant
2) Demonstrates high performance
3) Decentralised: Every node in the cluster is identical
4) Scalable
5) Durable
6) Ensures you’re in control: Each update has a choice of synchronous and asynchronous replication
7) Elastic: Read and write throughput both increase as new machines are added, with no downtime for any application
8) Professionally supported: Support contracts and services are available from third parties
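
One common way to lay out time series data for a store like Cassandra is to bucket readings by sensor and by day, so that no single partition grows without bound. The article does not prescribe this layout; the sketch below only illustrates the idea in plain Python, and the function names and readings are hypothetical. It does not use a Cassandra driver.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(sensor_id, ts):
    """Bucket a reading by sensor and UTC day to keep partitions bounded."""
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    return (sensor_id, day)


def group_by_partition(readings):
    """Group (sensor_id, unix_ts, value) readings, time-ordered per partition."""
    partitions = defaultdict(list)
    for sensor_id, ts, value in readings:
        partitions[partition_key(sensor_id, ts)].append((ts, value))
    return {key: sorted(rows) for key, rows in partitions.items()}


# Hypothetical readings that straddle midnight UTC.
readings = [
    ("s1", 1700007200, 21.0),  # 2023-11-15 00:13:20 UTC
    ("s1", 1700000000, 20.0),  # 2023-11-14 22:13:20 UTC
    ("s1", 1700007260, 21.5),  # 2023-11-15 00:14:20 UTC
]

for key, rows in sorted(group_by_partition(readings).items()):
    print(key, rows)
```

In a real cluster each such partition would be placed on a node according to a hash of its key, which spreads the write load from many sensors across identical nodes.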
