MongoDB: Using native drivers
Mihalis Tsoukalos jumps into the popular NoSQL database, MongoDB, with a guide to getting started using the Ruby and Python drivers.
NoSQL databases are designed for the web and don’t support joins, complex transactions and other features of the SQL language. MongoDB is an open source NoSQL database written in C++ by Dwight Merriman and Eliot Horowitz which has native drivers for many programming languages, including C, C++, Erlang, Haskell, Perl, PHP, Python, Ruby and Scala. In this tutorial, we’ll cover the MongoDB drivers for Python and Ruby.
The MongoDB document format is based on JSON. JSON structures consist of key and value pairs and can nest arbitrarily deep. If you’re not already familiar with JSON, you can think of JSON documents as the dictionaries or hash maps that most programming languages support.
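To make the dictionary analogy concrete, here is a minimal sketch using only Python's standard json module. The field names and values are invented for illustration and do not come from any collection used in this tutorial.

```python
import json

# A MongoDB-style document expressed as a plain Python dictionary.
# All field names and values here are hypothetical.
document = {
    "username": "LinuxFormat",
    "code": 3,
    "tags": ["nosql", "mongodb"],
    "address": {"city": "Somewhere", "geo": {"lat": 51.38, "lon": -2.36}},
}

# Serialise to JSON text and back again; the nesting survives the round trip.
as_json = json.dumps(document, sort_keys=True)
restored = json.loads(as_json)

print(as_json)
print(restored["address"]["geo"]["lat"])
```

Keys map to values, values can themselves be lists or further dictionaries, and that is essentially all the structure a document needs.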
The following instructions will help you install MongoDB on an Ubuntu Linux system:
$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
$ echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
$ sudo apt-get update
$ sudo apt-get install -y mongodb-org
The last command installs the latest MongoDB version which, at the time of writing this tutorial, is 3.0.7. On an Ubuntu Linux system, you can install the Ruby interface to the MongoDB database in the following way (provided that Ruby is already installed):
$ sudo gem install mongo
Please make sure that you use gem to install the Ruby MongoDB driver because your Linux distribution (distro) might have an older driver version that uses different functions for connecting to the database.
You can install the Python driver by executing sudo apt-get install python-pymongo. If you’re using Python 3, you should run sudo apt-get install python3-pymongo instead. Alternatively, you can install the Python driver with the sudo pip install pymongo command, provided that the pip utility is already installed.
You might need to execute the following JavaScript code from the MongoDB shell in order to insert sample data in your MongoDB database to experiment more while working your way through this tutorial:
> use LXF
switched to db LXF
> for (var i=0; i<10000; i++) { db.sampleData.insert({x:i, y:i/2}); }
WriteResult({ "nInserted" : 1 })
> db.sampleData.count();
10000
What the JavaScript code does is select the LXF database – if LXF doesn’t already exist, it will be automatically created – and insert 10,000 documents in the sampleData collection of the LXF database. You are free to change the name of the database, which is defined in the use LXF command, and the collection, which is defined in the db.sampleData.insert() command; however, all presented Ruby and Python code uses the LXF database.
The db.sampleData.count() command verifies that the sampleData collection does indeed contain 10,000 documents. Should you wish to delete the entire sampleData collection, execute the following command:
> db.sampleData.drop();
true
> db.sampleData.count();
0
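If you would rather load the sample data from Python than from the shell, a sketch along the following lines should do the same job. It assumes PyMongo 3.0 or later, where insert_many() is available; the function names sample_documents, load_sample_data and main are our own and are not part of the driver or of the files on the LXFDVD.

```python
def sample_documents(count=10000):
    """Build the same documents as the shell loop: {x: i, y: i/2}."""
    # Dividing by 2.0 gives 1.5 for i=3 on both Python 2 and Python 3,
    # which matches what JavaScript's i/2 produces.
    return [{"x": i, "y": i / 2.0} for i in range(count)]


def load_sample_data(collection, count=10000):
    """Insert the documents in one batch and return how many were written."""
    result = collection.insert_many(sample_documents(count))
    return len(result.inserted_ids)


def main():
    # Imported here so the helpers above can be used without the driver.
    from pymongo import MongoClient

    client = MongoClient("127.0.0.1", 27017)
    print(load_sample_data(client.LXF.sampleData))
```

Sending the documents in one batch saves 10,000 separate round trips to the server, which is the main reason to prefer it over a loop of single inserts.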
All presented Python and Ruby examples are self-contained and will work without any changes, assuming, of course, that the appropriate collections and databases exist in your MongoDB installation.
The Ruby driver
The Ruby MongoDB driver is written in Ruby and is officially supported by MongoDB. Although it can be used on its own, it is also used by object mapping libraries, such as Mongoid. The driver supports all MongoDB versions, including versions 3.0.x and 2.6. You can find the source code of the driver at https://github.com/mongodb/mongo-ruby-driver.
The following Ruby code (connect.rb, see the LXFDVD) checks whether you can connect to a MongoDB server and prints the version of the Ruby driver:
require 'rubygems'
require 'mongo'
include Mongo

$client = Mongo::Client.new(['127.0.0.1:27017'], :database => 'LXF')
Mongo::Logger.logger.level = ::Logger::ERROR
$collection = $client[:someData]
puts 'Connected with version:'
puts Mongo::VERSION
If you can successfully execute the code (above), then you are ready to continue with the rest of the tutorial. Otherwise, try to correct the errors before continuing. The generated output from connect.rb is the following, which means that you are using the 2.1.2 version of the Ruby MongoDB driver:
$ ruby connect.rb
D, [2015-11-19T10:28:57.085526 #2542] DEBUG -- : MONGODB | Adding 127.0.0.1:27017 to the cluster.
Connected with version:
2.1.2
The Mongo::Client.new() function specifies the IP address of the machine that runs MongoDB as well as the port number that MongoDB listens on; you can use a hostname instead of the IP address. The last parameter (:database) defines the name of the database you want to connect to. There are a number of other useful supported parameters, such as :user, :password, :connect_timeout and :replica_set. We’ve supplied a similar program (connect.py) on the LXFDVD written in Python (pictured bottom, p84), which uses the official Python MongoDB driver. The program connects to a MongoDB database, reads a single document from the sampleData collection of the LXF database and prints the _id and x fields of the document. As you’ll see from the code supplied, both drivers work in an analogous way.
Both connect.rb and connect.py will be used again and again in this tutorial because, without a proper connection to MongoDB, you won’t be able to perform any other operation. Make sure that you understand them well, especially their various parameters and variables, before going any further.
Insert, Update and Select Operations
The following code presents a complete example in Ruby (without the required code for connecting to the database), where you can insert multiple documents into the someData collection of a MongoDB database:
$collection = $client[:someData]
500.times do |n|
  doc = {
    :username => "LinuxFormat",
    :code => rand(4), # random value between 0 and 3, inclusive
    :time => Time.now.utc,
    :n => n*n
  }
  $collection.insert_one(doc)
end
The loop inserts 500 documents, using n as the loop counter. The insert_one() function writes a JSON document, which is generated using Ruby code, to the desired collection of the selected database. As you can see, you can insert multiple documents one by one. The following version of find() uses both the $gt and $lt operators to query the someData collection of the LXF database and select documents that match certain criteria:
$collection.find({"n" => {"$gt" => 205000}, "code" => {"$lt" => 1}}).each do |doc|
  puts doc
end
As you can see, the find() function might return multiple documents. Therefore, you will have to use .each to access the returned documents one by one. The find_one() function returns just one JSON document which means that you should not use an iterator to read its results.
The find() code (above) selects all documents where the n key has a value greater than 205000 and the code key has a value less than 1, and then iterates through all the documents that match both conditions.
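To see which documents such a query can select, here is a toy emulation of the two operators in plain Python. It is not how the driver or the server evaluates queries, and the matches() helper is our own invention; it only mirrors the behaviour of $gt and $lt on documents shaped like the ones inserted earlier, with code fixed at 0 instead of a random value so that the result is repeatable.

```python
import operator

# Only the two comparison operators used in the query are emulated here.
OPERATORS = {"$gt": operator.gt, "$lt": operator.lt}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, conditions in query.items():
        if field not in document:
            return False
        for name, operand in conditions.items():
            if not OPERATORS[name](document[field], operand):
                return False
    return True


query = {"n": {"$gt": 205000}, "code": {"$lt": 1}}

# Same shape as the inserted documents, but with code fixed at 0.
documents = [{"n": n * n, "code": 0} for n in range(500)]
selected = [doc for doc in documents if matches(doc, query)]

print(len(selected))     # 47, because 453 * 453 is the first square above 205000
print(selected[0]["n"])  # 205209
```

With the real data, code is random, so only those of the 47 candidate documents that happened to get a code of 0 will be returned.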
There exist two functions that allow you to perform updates on a MongoDB database: update_one() and update_many(). The first one updates just one document (if many documents match the criteria, only the first one gets updated), whereas the second one updates all documents that match the given criteria. The Ruby example (below) shows how to update documents using update_many():
result = $collection.find({"n" => {"$gt" => 205000}}).update_many({"$inc" => {:code => 10}})
# The next command returns the number of documents that were updated.
puts result.n
As you can see, you will first need to use find() to select the documents you want to update and then use update_many() to actually update the documents.
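For comparison, the same pair of operations might look like this with the Python driver. This is a sketch that assumes PyMongo 3.0 or later, where update_one() and update_many() are collection methods that take the filter and the update document directly; bump_first and bump_all are names we made up.

```python
# The filter and update documents are plain dictionaries.
FILTER = {"n": {"$gt": 205000}}
UPDATE = {"$inc": {"code": 10}}


def bump_first(collection):
    """Add 10 to code in the first matching document only."""
    return collection.update_one(FILTER, UPDATE).modified_count


def bump_all(collection):
    """Add 10 to code in every matching document."""
    return collection.update_many(FILTER, UPDATE).modified_count
```

Unlike the Ruby driver, there is no separate find() step here: the selection criteria are simply the first argument of the update call.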
Listing indexes
The following Ruby code lists all indexes of the someData collection, which belongs to the LXF database:
$collection.indexes.each do |index|
  p index
end
As a collection can have multiple indexes, you must use iteration to get all results from the .indexes function. The full Ruby code can be found inside the indexes.rb file on the LXFDVD. If the someData collection doesn’t exist, you will get an error message similar to the following:
D, [2015-11-21T18:06:32.078495 #18166] DEBUG -- : MONGODB | 127.0.0.1:27017 | LXF.listIndexes | FAILED | no collection (26) | 0.000233s
Please note that the index for the _id field is automatically generated for each collection in a MongoDB database as soon as you insert some data into the collection.
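The Python driver offers a similar facility through index_information(), which returns a dictionary keyed by index name. The following sketch turns that dictionary into printable lines; index_summary is a hypothetical helper of ours, not a driver function.

```python
def index_summary(collection):
    """Return one 'name: field (direction)' line per index, sorted by name."""
    info = collection.index_information()
    lines = []
    for name in sorted(info):
        # Each entry has a "key" list of (field, direction) pairs.
        keys = ", ".join("{0} ({1})".format(field, direction)
                         for field, direction in info[name]["key"])
        lines.append("{0}: {1}".format(name, keys))
    return lines
```

Called as index_summary(db.someData), this should include a line for the automatically created _id index, which MongoDB names _id_.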
You can also create new users using the Ruby MongoDB driver. The following code shows the technique:
$client.database.users.create(
  'linuxFormat',
  password: 'aPass',
  roles: [Mongo::Auth::Roles::READ_WRITE])
The complete Ruby code can be found in newUser.rb. You’ll note that the output of newUser.rb changes when executed two times in a row (pictured, below). The first time the user is successfully generated without any errors; however, the second time the process fails because the user already exists. Note that there is a special database on every MongoDB instance, called admin, that keeps user-related data. Also, bear in mind that in order to view the information of another user, you must have the viewUser action on the database of the other user.
The Python driver
The Python driver works with the latest as well as older versions of MongoDB. Its source code can be found at https://github.com/mongodb/mongo-python-driver. You can find the exact version of your Python MongoDB driver by executing the following command:
$ python -c "import pymongo; print(pymongo.version)"
2.6.3
It’s now time to go back to the Python program (pictured in the screenshot, p84) and explain it a little more. You first call the MongoClient() function with two parameters. The first parameter specifies the IP of the desired MongoDB server and the second specifies the desired port number. You then define the database you want to work with using db = client.LXF. Last, you select the desired collection with sampleData = db.sampleData. After that, you are free to use the sampleData variable to interact with the MongoDB server. More or less, the names of the Python methods you need to call are the same as the Ruby names. The find_one() function returns a single JSON document from the currently selected collection: the first match the server finds, rather than a truly random one.
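Since connect.py is only pictured, here is our own reconstruction of the steps just described. It is a sketch, not the file from the LXFDVD; the describe() helper and the main() wrapper are additions of ours, and the handling of an empty collection is an assumption.

```python
def describe(document):
    """Format the _id and x fields of a document for printing."""
    return "_id: {0}, x: {1}".format(document["_id"], document["x"])


def main():
    # Imported here so describe() can be used without the driver installed.
    from pymongo import MongoClient

    client = MongoClient("127.0.0.1", 27017)  # server IP and port
    db = client.LXF                           # choose the database
    sampleData = db.sampleData                # choose the collection

    document = sampleData.find_one()          # None if the collection is empty
    if document is None:
        print("sampleData is empty")
    else:
        print(describe(document))
```

Call main() with a MongoDB server running on the local machine to try it out.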
It is time to see something more practical. Should you wish to find all JSON documents of the someData collection, you would use the following version of find() along with a for loop:
# Choose a collection
someData = db.someData
for document in someData.find():
    print(document)
As you’ve already learnt, the main difference between find() and find_one() is that the former returns a cursor that you will have to iterate in order to get all returned documents, whereas find_one() returns a single JSON document (the first match, not a random one) that you just need to store in a variable.
The following version of find() uses $gt to query the someData collection of the LXF database and sorts the results by the n field using the sort() method:
for doc in someData.find({"code": {"$gt": 30}}).sort("n"):
    print(doc)
Other useful MongoDB operators include $ne (not equal), $lte (less than or equal), $gte (greater than or equal) etc.
The next Python code (update.py) shows how to update an existing document:
# Choose a document.
print(someData.find_one({"n": 1}))
# Update it
someData.update({"n": 1}, {'$set': {'newField': 10}})
# Print it again.
print(someData.find_one({'newField': 10}))
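A note on versions: from PyMongo 3.0 onwards, update() is deprecated in favour of update_one() and update_many(). If your driver is newer than the 2.6.3 release shown earlier, a sketch like the following may be a better fit; add_field is a hypothetical helper name of ours.

```python
def add_field(collection, n, value):
    """Set newField on the first document whose n equals the given value."""
    result = collection.update_one({"n": n}, {"$set": {"newField": value}})
    # matched_count is 0 when no document has that n, so nothing was changed.
    return result.matched_count == 1
```

Calling add_field(someData, 1, 10) would then perform the same change as the update() call in update.py.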
The following Python code (drop.py) drops the moreData collection from the LXF database and should be used with great care:
# Choose a collection
moreData = db.moreData
# Drop the entire collection!
print(moreData.drop())
Please note that the following two calls are equivalent:
>>> db.aCollection.drop()
>>> db.drop_collection("aCollection")
Creating MongoDB Indexes
In order to create a single key ascending index for the n key, you should use the following Python code:
someData.create_index("n")
It is considered a good practice to check the correctness of your programs before using them in production. (There’s an example of this pictured on p85, where, using the MongoDB shell, the getIndexes() function verifies that the new index was successfully created.) The index.py file on the LXFDVD contains the full Python code. Similarly, you can drop an existing index using Python as follows:
# Choose a collection
someData = db.someData
# Drop an index
someData.drop_index("n_1")
The dropIndex.py file contains the Python code. If the index you specified does not exist, you’ll get an error message similar to the following:
pymongo.errors.OperationFailure: command SON([('dropIndexes', u'someData'), ('index', 'n_1')]) failed: index not found with name [n_1]
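You may wonder where the name n_1 comes from. When you don’t supply a name, MongoDB builds one by joining each field name and its direction with underscores. The sketch below reproduces that convention in plain Python; default_index_name is our own helper, not a driver function, and the two constants simply mirror the values of pymongo.ASCENDING and pymongo.DESCENDING.

```python
# Same values as pymongo.ASCENDING and pymongo.DESCENDING.
ASCENDING, DESCENDING = 1, -1


def default_index_name(keys):
    """Join every field and direction with underscores."""
    return "_".join("{0}_{1}".format(field, direction)
                    for field, direction in keys)


print(default_index_name([("n", ASCENDING)]))                         # n_1
print(default_index_name([("code", ASCENDING), ("n", DESCENDING)]))   # code_1_n_-1
```

Knowing the convention lets you work out the name to pass to drop_index() without having to list the indexes first.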
Not everything should be done with Ruby or Python as administrative tasks are better done using the Mongo shell and the JavaScript programming language. Therefore, in another tutorial we’ll teach you how to administer MongoDB and create Replica sets from the MongoDB shell.
However, knowing how to use MongoDB with your favourite programming language can be very handy and enjoyable, so in a third tutorial we will teach you how to create a blog site using MongoDB, the Python driver, the knowledge from this tutorial and the Bottle framework. MongoDB is a great and modern database, so stay tuned for more!