Using MonetDB for High-Performance Applications

MonetDB is a column-store database that offers faster data access and higher performance for applications.

Open Source For You

In this era of Big Data, it is difficult to manage, store and manipulate data. To address this, organisations use different paradigms to store data, such as structured, unstructured and semi-structured stores, and row based and column based databases. In this article, we are going to explore a popular, free and open source column based database called MonetDB. But before we do that, let’s look at the difference between row based (traditional) and column based databases.

Almost all traditional databases store data in the row format, which means that in every table, the records are stored one after another as tuples. In column based databases, data is stored column-wise, so all the values of one column are kept together (Figure 1).
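The difference between the two layouts can be sketched in a few lines of Python. This is only an in-memory illustration, not how MonetDB stores data on disk; the sample table and all names in it (id, city, amount) are hypothetical.

```python
# Hypothetical sample table with three columns: id, city, amount.
rows = [
    (1, "Pune", 120.0),
    (2, "Delhi", 80.5),
    (3, "Pune", 42.0),
]

# Row based layout: each record is kept as one tuple, one after another.
row_store = list(rows)

# Column based layout: one contiguous sequence per column.
column_names = ("id", "city", "amount")
column_store = {
    name: [row[i] for row in rows] for i, name in enumerate(column_names)
}

# Reading one column from the row store has to visit every tuple ...
amounts_from_rows = [row[2] for row in row_store]
# ... while the column store already holds it as a single sequence.
amounts_from_columns = column_store["amount"]

print(column_store["city"])
print(amounts_from_rows == amounts_from_columns)
```

Both layouts hold the same values; only the grouping differs, which is what decides how much data has to be touched for a given query.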

Column based vs row based databases: The advantages and disadvantages

Let’s consider the advantages of column based databases over row based databases. The former are faster when we need to access all the data of an entire column, because the values of a column are stored together. Another advantage is that less hard disk access is required, since more values of the column can be stored in a block. Column based databases work faster while performing statistical operations like aggregation, summation, etc. This is because the data is stored together, so access time and computation time decrease compared to row based storage. When a specific value needs to be changed across an entire column, a column based database is also faster. This is because the data of the column is stored in the same blocks, making it easier to fetch and update it. A column based database is also better in terms of compression, because similar types of values are stored in the same block; in row based databases, the entire tuple, which has different types of values, is stored in one block.
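The compression point can be illustrated with run-length encoding, a simple scheme chosen here only for demonstration; it is an assumption made for this sketch, not a statement about the compression MonetDB itself applies. The two columns and their values are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical table with two columns: country and status.
countries = ["IN", "IN", "IN", "IN", "US", "US"]
statuses = ["ok", "ok", "ok", "ok", "ok", "ok"]

# Column layout: each column is encoded on its own, so repeats sit side by side.
column_runs = run_length_encode(countries) + run_length_encode(statuses)

# Row layout: values of different columns are interleaved, which breaks up the runs.
interleaved = [value for row in zip(countries, statuses) for value in row]
row_runs = run_length_encode(interleaved)

print(len(column_runs), len(row_runs))
```

With this sample data, the column layout needs 3 runs while the interleaved row layout needs 12, because neighbouring values in a row rarely repeat.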

Row based databases are useful when we mostly read or insert entire records, because all the values of a record are stored together.

So, the choice of the database will depend upon the features that are required. Column based databases are preferred for statistical operations, where we have a huge amount of data to manage and manipulate.

MonetDB

MonetDB is a free, open source database management system developed by the database research group at CWI (Centrum Wiskunde & Informatica). The main use case of this database is data warehousing, where there is a huge amount of data to manage. It uses an engine based on vertical fragmentation and columnar execution, and is mainly used for Big Data science. The integration of MonetDB with the R language is popular among data scientists.

Installation of MonetDB in Python

There are different ways to install MonetDB. We can do a standalone installation or install the thin clients provided for different languages like Python and R. Here, we will explore how to install MonetDB using Python, just for demo purposes.

1. Install the MonetDBLite package for Python (monetdblite); we are using the pip utility.

2. Run the command given below in a CLI prompt:

pip install monetdblite

3. We can also install standalone MonetDB in Windows. Download the .msi file from https://www.monetdb.org/downloads/Windows/Jul2017-SP3/.

4. Figure 3 gives a snapshot of the Windows installation.

5. To connect to a database, use the code given below in Python:

# create a new database or connect to an existing database in /tmp/db
import monetdblite
conn = monetdblite.connect('/tmp/db')

To create a cursor to access tables and databases, use the following code:

# create a new cursor
c = conn.cursor()

# query the database
c.execute('SELECT * FROM tables')
# fetch the results and print them
print(c.fetchall())
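The same cursor pattern can be used for the kind of aggregation that column based databases are good at. The sketch below is written under two assumptions: that the MonetDBLite cursor behaves like a standard Python DB-API 2.0 cursor, and that a hypothetical table named sales exists. So that it runs without a MonetDB installation, it uses the standard library's sqlite3 module purely as a stand-in connection; sqlite3 is a row based database and is not part of MonetDB.

```python
import sqlite3  # stand-in only: a stdlib DB-API driver, so the sketch runs without MonetDB


def total_by_city(conn):
    """Run an aggregate query through a DB-API cursor and return all rows."""
    c = conn.cursor()
    c.execute("SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city")
    return c.fetchall()


# Assumption: with MonetDBLite, conn would be the connection opened in the step above.
conn = sqlite3.connect(":memory:")
c = conn.cursor()

# Hypothetical table and data, used only for this illustration.
c.execute("CREATE TABLE sales (city VARCHAR(20), amount DOUBLE)")
# The '?' placeholder style is sqlite3's; other DB-API drivers may use a different one.
c.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Pune", 120.0), ("Delhi", 80.5), ("Pune", 42.0)],
)
conn.commit()

result = total_by_city(conn)
print(result)
```

Keeping the query in a function that only needs a connection object means the same code can be pointed at a different DB-API connection without changes, provided the driver accepts the same SQL.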

Likewise, we can explore more commands from the MonetDB Command Reference Guide.

By: Maulik Parekh

The author works at Cisco as a consulting engineer and has an M. Tech degree in cloud computing from VIT University, Chennai. He constantly strives to learn, grow and innovate. He can be reached at maulik[email protected] Website: https://www.linkedin.com/in/maulikparekh2

Figure 2: MonetDB installation with Python

Figure 3: MonetDB Windows installation
