Linux Format

MongoDB: Using native drivers

Mihalis Tsoukalos jumps into the popular NoSQL database MongoDB, with a guide to getting started using the Ruby and Python drivers.


NoSQL databases are designed for the web and don’t support joins, complex transactions and other features of the SQL language. MongoDB is an open source NoSQL database, written in C++ by Dwight Merriman and Eliot Horowitz, which has native drivers for many programming languages, including C, C++, Erlang, Haskell, Perl, PHP, Python, Ruby and Scala. In this tutorial, we’ll cover the MongoDB drivers for Python and Ruby.

The MongoDB document format is based on JSON: documents consist of key and value pairs and can nest arbitrarily deep. If you’re not already familiar with JSON, you can think of JSON documents as the dictionaries or hash maps that most programming languages support.
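To make the dictionary comparison concrete, here is a minimal Python sketch that uses only the standard json module. The document contents (username, tags, address) are invented for illustration and are not taken from the LXF database.

```python
import json

# A hypothetical document: key/value pairs, with nesting two levels deep.
document = {
    "username": "LinuxFormat",
    "tags": ["nosql", "mongodb"],
    "address": {"city": "Bath", "geo": {"lat": 51.38, "lon": -2.36}},
}

# Serialise the dictionary to JSON text and parse it back again.
as_json = json.dumps(document, sort_keys=True)
round_trip = json.loads(as_json)

print(as_json)
print(round_trip["address"]["geo"]["lat"])
```

The parsed result is an ordinary dictionary again, so nested values are reached with plain key lookups.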

The following instructions will help you install MongoDB on an Ubuntu Linux system:

    $ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
    $ echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
    $ sudo apt-get update
    $ sudo apt-get install -y mongodb-org

The last command installs the latest MongoDB version which, at the time of writing, is 3.0.7. On an Ubuntu Linux system, provided that Ruby is already installed, you can install the Ruby interface to MongoDB as follows:

    $ sudo gem install mongo

Please make sure that you use gem to install the Ruby MongoDB driver, because your Linux distribution (distro) might package an older driver version that uses different functions for connecting to the database.

You can install the Python driver by executing sudo apt-get install python-pymongo. If you’re using Python 3, run sudo apt-get install python3-pymongo instead. Alternatively, you can install the Python driver with sudo pip install pymongo, provided that the pip utility is already installed.

You might need to execute the following JavaScript code from the MongoDB shell to insert sample data into your MongoDB database, so that you have something to experiment with while working through this tutorial:

    > use LXF
    switched to db LXF
    > for (var i=0; i<10000; i++) { db.sampleData.insert({x:i, y:i/2}); }
    WriteResult({ "nInserted" : 1 })
    > db.sampleData.count();
    10000

The JavaScript code selects the LXF database (if LXF doesn’t already exist, it will be created automatically) and inserts 10,000 documents into the sampleData collection of that database. You are free to change the name of the database, which is defined in the use LXF command, and of the collection, which is defined in the db.sampleData.insert() command; however, all the Ruby and Python code presented here uses the LXF database.
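The same sample documents can also be built from Python. The sketch below only constructs the documents; make_sample_docs is a hypothetical helper name, and the commented-out call assumes a PyMongo 3.0 or later collection object called sampleData, which is not part of the code on the LXFDVD.

```python
def make_sample_docs(count):
    """Build the documents the shell loop inserts: x = i and y = i / 2."""
    # Dividing by 2.0 keeps y a float, matching JavaScript's i/2.
    return [{"x": i, "y": i / 2.0} for i in range(count)]


docs = make_sample_docs(10000)
print(len(docs))
print(docs[3])

# Assumption: with PyMongo 3.0+ the list could be written in one call:
# sampleData.insert_many(docs)
```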

The db.sampleData.count() command verifies that the sampleData collection does indeed contain 10,000 documents. Should you wish to delete the entire sampleData collection, execute the following command:

    > db.sampleData.drop();
    true
    > db.sampleData.count();
    0

All the Python and Ruby examples presented are self-contained and will work without any changes, assuming, of course, that the appropriate collections and databases exist in your MongoDB installation.

The Ruby driver

The Ruby MongoDB driver is written in Ruby and is officially supported by MongoDB. Although it can be used on its own, it is also used by object mapping libraries, such as Mongoid. The driver supports all MongoDB versions, including versions 3.0.x and 2.6. You can find the source code of the driver at https://github.com/mongodb/mongo-ruby-driver.

The following Ruby code (connect.rb, see the LXFDVD) checks whether you can connect to a MongoDB server and prints the version of the Ruby driver:

    require 'rubygems'
    require 'mongo'
    include Mongo

    # Connect to the LXF database on the local MongoDB server.
    $client = Mongo::Client.new(['127.0.0.1:27017'], :database => 'LXF')
    # Only log errors from this point on.
    Mongo::Logger.logger.level = ::Logger::ERROR
    $collection = $client[:someData]
    puts 'Connected with version:'
    puts Mongo::VERSION

If you can successfully execute the code above, then you are ready to continue with the rest of the tutorial. Otherwise, try to correct the errors before continuing. The output generated by connect.rb is shown below, and means that you are using version 2.1.2 of the Ruby MongoDB driver:

    $ ruby connect.rb
    D, [2015-11-19T10:28:57.085526 #2542] DEBUG -- : MONGODB | Adding 127.0.0.1:27017 to the cluster.
    Connected with version:
    2.1.2

The Mongo::Client.new() function specifies the IP address of the machine that runs MongoDB as well as the port number that MongoDB listens on – you can use a hostname instead of the IP address. The last parameter (:database) defines the name of the database you want to connect to. There are a number of other useful supported parameters, such as :user, :password, :connect_timeout and :replica_set.

We’ve supplied a similar program (connect.py) on the LXFDVD written in Python (pictured bottom, p84), which uses the official Python MongoDB driver. The program connects to a MongoDB database, reads a single document from the sampleData collection of the LXF database and prints the _id and x fields of that document. As you’ll see from the code supplied, both drivers work in an analogous way.

Both connect.rb and connect.py will be used again and again in this tutorial because, without a proper connection to MongoDB, you won’t be able to perform any other operation. Make sure that you understand them well, especially their various parameters and variables, before going any further.
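Both drivers also accept a standard MongoDB connection string in place of separate host, port and database values. As a rough sketch, the hypothetical helper below (mongodb_uri is not part of either driver) assembles such a string from the same pieces used in connect.rb; the user name and password shown are placeholders.

```python
from urllib.parse import quote_plus


def mongodb_uri(host="127.0.0.1", port=27017, database="LXF",
                user=None, password=None):
    """Assemble a mongodb:// connection string from its parts."""
    credentials = ""
    if user is not None and password is not None:
        # Percent-encode credentials so characters such as '/' or '@' are safe.
        credentials = "{}:{}@".format(quote_plus(user), quote_plus(password))
    return "mongodb://{}{}:{}/{}".format(credentials, host, port, database)


print(mongodb_uri())
# mongodb://127.0.0.1:27017/LXF
print(mongodb_uri(user="linuxFormat", password="a/Pass"))
# mongodb://linuxFormat:a%2FPass@127.0.0.1:27017/LXF
```

The resulting string can be passed to MongoClient() in Python or to Mongo::Client.new in Ruby.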

Insert, Update and Select Operations

The following code presents a complete example in Ruby (without the code required for connecting to the database) that inserts multiple documents into the someData collection of a MongoDB database:

    $collection = $client[:someData]
    500.times do |n|
      doc = {
        :username => "LinuxFormat",
        :code => rand(4), # random value between 0 and 3, inclusive
        :time => Time.now.utc,
        :n => n*n
      }
      $collection.insert_one(doc)
    end

The loop inserts 500 documents, using n as the iterator. The insert_one() function writes a JSON document, which is generated using Ruby code, to the desired collection of the selected database. As you can see, you can insert multiple documents one by one. The following version of find() uses both the $gt and $lt operators to query the someData collection of the LXF database and select documents that match certain criteria:

    $collection.find({"n" => {"$gt" => 205000}, "code" => {"$lt" => 1}}).each do |doc|
      puts doc
    end

As you can see, the find() function might return multiple documents, so you have to use .each to access the returned documents one by one. The find_one() function returns just one JSON document, which means that you should not use an iterator to read its result.

The code above finds all documents where the n key has a value greater than 205000 and the code key has a value less than 1, and iterates through all the documents that match the conditions given to find().
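To see what such a query means, it can help to imitate it on ordinary Python dictionaries. The sketch below is not how the server or either driver evaluates queries; it is a simplified stand-in that only understands $gt and $lt on top-level fields, and the three test documents are invented. It does show the key point: all conditions in one query document must hold at the same time.

```python
import operator

# Map the two comparison operators used in the query onto Python functions.
COMPARISONS = {"$gt": operator.gt, "$lt": operator.lt}


def matches(document, query):
    """Return True if every condition in the query holds for the document."""
    for field, conditions in query.items():
        if field not in document:
            return False
        for op, operand in conditions.items():
            if not COMPARISONS[op](document[field], operand):
                return False
    return True


query = {"n": {"$gt": 205000}, "code": {"$lt": 1}}
documents = [
    {"n": 499 * 499, "code": 0},  # n = 249001: satisfies both conditions
    {"n": 499 * 499, "code": 3},  # code is too large
    {"n": 10 * 10, "code": 0},    # n is too small
]

selected = [doc for doc in documents if matches(doc, query)]
print(selected)
```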

Two functions allow you to perform updates on a MongoDB database: update_one() and update_many(). The first updates just one document – if many documents match the criteria, only the first one gets updated – whereas the second updates all documents that match the given criteria. The following Ruby example shows how to update documents using update_many():

    result = $collection.find({"n" => {"$gt" => 205000}}).update_many({"$inc" => {:code => 10}})
    # The next command returns the number of documents that were updated.
    puts result.n

As you can see, you first need to use find() to select the documents you want to update and then use update_many() to actually update them.
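For comparison, a Python counterpart might look like the sketch below. It assumes PyMongo 3.0 or later, where collections provide update_one() and update_many() and the returned result exposes modified_count; the 2.6.3 driver mentioned later in this tutorial predates those methods. The function name bump_codes and its parameters are hypothetical.

```python
def bump_codes(collection, threshold, amount, only_first=False):
    """Increment 'code' on documents whose 'n' is greater than threshold.

    Assumes a PyMongo 3.0+ collection object. Returns the number of
    documents the server reports as modified.
    """
    query = {"n": {"$gt": threshold}}
    change = {"$inc": {"code": amount}}
    if only_first:
        # Only the first matching document is updated.
        result = collection.update_one(query, change)
    else:
        # Every matching document is updated.
        result = collection.update_many(query, change)
    return result.modified_count


# Possible usage with a live connection (not executed here):
# print(bump_codes(db.someData, 205000, 10))
```

Unlike the Ruby driver, PyMongo takes the selection criteria and the change in a single call, so there is no separate find() step.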

Listing indexes

The following Ruby code lists all indexes of the someData collection that belongs to the LXF database:

    $collection.indexes.each do |index|
      p index
    end

As a collection can have multiple indexes, you must use iteration to get all results from the .indexes function. The full Ruby code can be found inside the indexes.rb file on the LXFDVD. If the someData collection doesn’t exist, you will get an error message similar to the following:

    D, [2015-11-21T18:06:32.078495 #18166] DEBUG -- : MONGODB | 127.0.0.1:27017 | LXF.listIndexes | FAILED | no collection (26) | 0.000233s

Please note that the index for the _id field is generated automatically for each collection in a MongoDB database as soon as you insert some data into the collection.
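In Python, a similar listing can be produced with PyMongo's index_information() method, which returns a dictionary keyed by index name. The following is a small sketch; describe_indexes is a hypothetical helper, and it expects to be handed a collection object such as db.someData.

```python
def describe_indexes(collection):
    """Return one 'name: keys' line per index, sorted by index name."""
    info = collection.index_information()
    return ["{}: {}".format(name, details["key"])
            for name, details in sorted(info.items())]


# Possible usage with a live connection (not executed here):
# for line in describe_indexes(db.someData):
#     print(line)
```

On a collection that has only the automatic index, the single entry returned is the one named _id_.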

You can also create new users using the Ruby MongoDB driver. The following code shows the technique:

    $client.database.users.create(
      'linuxFormat',
      password: 'aPass',
      roles: [Mongo::Auth::Roles::READ_WRITE])

The complete Ruby code can be found in newUser.rb. You’ll note that the output of newUser.rb changes when it is executed twice in a row (pictured, below). The first time, the user is created without any errors; the second time, the process fails because the user already exists. Note that there is a special database on every MongoDB instance, called admin, that keeps user-related data. Also, bear in mind that in order to view the information of another user, you must have the viewUser action on the database of the other user.

The Python driver

The Python driver works with the latest as well as older versions of MongoDB. Its source code can be found at https://github.com/mongodb/mongo-python-driver. You can find the exact version of your Python MongoDB driver by executing the following command:

    $ python -c "import pymongo; print(pymongo.version)"
    2.6.3

It’s now time to go back to the Python program (pictured in the screenshot, p84) and explain it a little more. You first call the MongoClient() function with two parameters: the first specifies the IP address of the desired MongoDB server and the second specifies the desired port number. You then define the database you want to work with using db = client.LXF. Last, you select the desired collection with sampleData = db.sampleData. After that, you are free to use the sampleData variable to interact with the MongoDB server. More or less, the names of the Python methods you need to call are the same as the Ruby names. The find_one() function returns a single JSON document from the currently selected collection: the first one it finds, rather than a random one.

It is time to see something more practical. Should you wish to find all JSON documents of the someData collection, you would use the following version of find() along with a for loop:

    # Choose a collection
    someData = db.someData
    for document in someData.find():
        print(document)

As you’ve already learnt, the main difference between find() and find_one() is that the former returns a cursor that you have to iterate over in order to get all the returned documents, whereas find_one() returns a single JSON document (the first match) that you just need to store in a variable.

The following version of find() uses $gt to query the someData collection of the LXF database and sorts the results by the n field using the sort() method:

    for doc in someData.find({"code": {"$gt": 30}}).sort("n"):
        print(doc)

Other useful MongoDB operators include $ne (not equal), $lte (less than or equal), $gte (greater than or equal) etc.
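Several of these operators can be combined on one field by placing them in the same inner dictionary. As a minimal sketch, the hypothetical helper below builds an inclusive range query and can optionally exclude a single value; the result is a plain dictionary of the kind you would pass to find().

```python
def between(field, low, high, exclude=None):
    """Build a query matching low <= field <= high, optionally skipping one value."""
    conditions = {"$gte": low, "$lte": high}
    if exclude is not None:
        conditions["$ne"] = exclude
    return {field: conditions}


print(between("code", 0, 3))
print(between("n", 100, 400, exclude=225))

# Possible usage with a live connection (not executed here):
# for doc in someData.find(between("n", 100, 400, exclude=225)):
#     print(doc)
```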

The next Python code (update.py) shows how to update an existing document:

    # Choose a document.
    print(someData.find_one({"n": 1}))
    # Update it
    someData.update({"n": 1}, {'$set': {'newField': 10}})
    # Print it again.
    print(someData.find_one({'newField': 10}))

The following Python code (drop.py) drops the moreData collection from the LXF database and should be used with great care:

    # Choose a collection
    moreData = db.moreData
    # Drop the entire collection!
    print(moreData.drop())

Please note that the following two calls are equivalent:

    >>> db.aCollection.drop()
    >>> db.drop_collection("aCollection")

Creating MongoDB Indexes

In order to create a single-key ascending index for the n key, you should use the following Python code:

    someData.create_index("n")

It is considered good practice to check the correctness of your programs before using them in production. (There’s an example of this pictured on p85, where, using the MongoDB shell, the getIndexes() function verifies that the new index was successfully created.) The index.py file on the LXFDVD contains the full Python code. Similarly, you can drop an existing index using Python as follows:

    # Choose a collection
    someData = db.someData
    # Drop an index
    someData.drop_index("n_1")

The dropIndex.py file contains the Python code. If the index you specified does not exist, you’ll get an error message similar to the following:

    pymongo.errors.OperationFailure: command SON([('dropIndexes', u'someData'), ('index', 'n_1')]) failed: index not found with name [n_1]
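The name n_1 is not arbitrary: when no explicit name is given, an index is named by joining each key and its direction with underscores. The sketch below reproduces that convention in plain Python; default_index_name is a hypothetical helper written for illustration, not a driver function.

```python
def default_index_name(keys):
    """Join each field name and direction with underscores, e.g. n_1."""
    return "_".join("{}_{}".format(field, direction)
                    for field, direction in keys)


print(default_index_name([("n", 1)]))
# n_1
print(default_index_name([("username", 1), ("n", -1)]))
# username_1_n_-1
```

This is why an ascending index on n has to be dropped as n_1 rather than simply n.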

Not everything should be done with Ruby or Python, as administrative tasks are better done using the Mongo shell and the JavaScript programming language. Therefore, in another tutorial we’ll teach you how to administer MongoDB and create replica sets from the MongoDB shell.

However, knowing how to use MongoDB with your favourite programming language can be very handy and enjoyable, so in a third tutorial we will teach you how to create a blog site using MongoDB, the Python driver, the knowledge from this tutorial and the Bottle framework. MongoDB is a great and modern database, so stay tuned for more!

This Python script uses MongoClient() to specify the desired machine and the port number of the server.
When used correctly, indexes can greatly improve the performance of your applications.
This is the output of the newUser.rb Ruby script when executed two times in a row. The second time the script fails as the user already exists.
GridFS stores its files in two collections named fs.files and fs.chunks. (More details: see the GridFS and Ruby box, top, p85.)
Both storeGridFS.rb and retrieveGridFS.rb in action, and mongofiles verifies that both Ruby scripts are working.
