In part 1 of my adventures in MQTT, MongoDB and REST, I described the project I'm working on in my free time, and how I want to use it to explore these technologies. In part 2 the focus will be on MongoDB, since that was really the first piece of the puzzle I needed to get working. Once I have the database in place, I can start feeding it with data and creating the API to access it.

Why MongoDB?

Seeing as I wanted to explore technology I'm not familiar with, picking a NoSQL database like MongoDB is almost an answer in its own right. While that's true, it's not the whole story: there are some good technical reasons which make the choice a sensible one for this project.

The lack of a defined schema lets me spend more time developing the pointy end of the project rather than playing DBA. I can define a sensible document structure now, and if I decide to add or change something later, it's much quicker and easier than having to update a schema and migrate the existing data to it. Essentially it lets me be more flexible in my approach, and given the limited time I spend on the project, that's a sensible strategy for both the short and medium term.

The JSON document format is also a good reason for using MongoDB. Not only is it something else I want to be more hands-on with, it's easy to work with in the various languages I'm using for the project (currently Perl, PHP and Java/Android). All of these have utility classes for handling JSON, saving me from having to man-handle data from various tables into a manageable structured format. It's also a little less verbose than XML, which might shave some bytes off the payload sent to the mobile clients – every little helps.
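For example, on the Perl side the standard JSON module makes round-tripping between hash references and JSON strings a one-liner apiece. A trivial sketch with made-up data:

use JSON;

# A Perl hashref maps naturally onto a MongoDB/JSON document
my $match = { matchid => 12345, home => 'Team A', away => 'Team B' };

my $json = encode_json($match);   # e.g. '{"matchid":12345,...}' (key order not guaranteed)
my $data = decode_json($json);    # back to a hashref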

With an eye to the future, I was also interested in how MongoDB scales. I'm not expecting to need this in the near term, but if I do, sharding looks like a useful way to get the best out of the back-end servers, rather than just duplicating the entire database – particularly important for a low-budget production.

Finally, why MongoDB specifically, rather than one of the other NoSQL databases out there? Mainly because it's the one I'm most familiar with through articles and general exposure. I could have spent time weighing up the alternatives, but this project is more about the learning and experience than nailing down a specific set of technical requirements and determining the absolute best fit for the job.


RockMongo

Getting MongoDB up and running is pretty straightforward, so I won’t dwell too much on it. The host system is running CentOS, and installing was just a case of following the relevant instructions and starting the service. I also needed drivers for Perl and PHP, and again, it was simply a case of following the instructions.
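As a quick sanity check that the Perl driver was working, something like the following will list the databases on the server. This is a minimal sketch, assuming the server is running locally on the default port with authentication not yet enabled:

use MongoDB;

# Connect to a local, not-yet-secured server and list its databases
my $client = MongoDB::MongoClient->new(host => 'localhost', port => 27017);
print "$_\n" for $client->database_names;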

This gave me a working system, but it lacked a little something – a decent UI for simple management tasks. Being used to phpMyAdmin for MySQL, I wanted something similar for MongoDB. The MongoDB website helpfully lists a number of useful admin tools, and I settled on RockMongo. Like a fair few of the tools listed on that page, it hasn't seen much recent development, which is a concern, but it seemed to do what I needed and was simple to get running.

Then I enabled authentication on the database…

To cut a long story short, that lack of recent development came back to bite me on the backside, since it turns out there is a bug in the latest version of RockMongo which prevents it working with later versions of MongoDB when authentication is enabled. Fortunately I wasn't the only one to hit the problem, and I was able to resolve it using the fork mentioned in the GitHub issue. One of my longer-term to-dos is to investigate alternative tools which provide similar capability and ease of use but have more active development.


Injecting Data to MongoDB with Perl

With the database ready and tested using RockMongo, I was ready to start injecting data from my source via a Perl script. In terms of 'schema' there isn't much to talk about: I decided on three 'collections', the equivalent of tables in the SQL world, which fit my dataset and, to be honest, probably have some roots in my SQL-based thinking. So without further ado, here are some code snippets.
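First, to give a feel for the documents themselves, a purely illustrative sketch of what a Matches document might contain – the field names here are invented for the example, not the ones I actually use:

# Hypothetical shape of a 'Matches' document (illustrative field names only)
my $example_match = {
    matchid     => 12345,
    region      => 'North',
    competition => 'Example League',
    matchdata   => {
        home  => 'Team A',
        away  => 'Team B',
        score => '1-0',
    },
};

And the connection and insert/update logic itself: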

use MongoDB;
use MongoDB::OID;

# Database connection information
my $dbname = "database";
my $dbuser = "myuser";
my $dbpass = "mypassword";

# Connects to localhost:27017 by default, authenticating against $dbname
my $connection = MongoDB::MongoClient->new(db_name => $dbname, username => $dbuser,
                                           password => $dbpass);
my $db = $connection->get_database($dbname);

# One handle per collection
my $regions      = $db->get_collection('Regions');
my $competitions = $db->get_collection('Competitions');
my $fixtures     = $db->get_collection('Matches');
# Look for an existing document for this match
my $currentmatch = $fixtures->find_one({$matchidtag => $matchid});

if (!defined $currentmatch) {
    # The match isn't in the DB yet, so insert it
    my $insertresult = $fixtures->insert({$matchidtag => $matchid}); # plus *other stuff*

} else {
    # The match is in the DB, so update it if anything needs updating
    my $updateresult = $fixtures->update({$matchidtag => $matchid},
                                         {'$set' => {$matchdatatag => $matchdata}}); # plus *otherstuff*
}

Clearly these are just excerpts to give some idea of how it works – some names have been changed to protect the innocent. A few further comments. Not shown is that I define all the field tags up front, so there are no literal strings in the body of the code itself, which makes it easier to update. Also missing is the error handling, partly because error handling for MongoDB in Perl – what to expect, when, and what to do about it – is not well documented; this has been raised on the MongoDB bug tracker. Finally, my decision to do an explicit 'read' and then a separate insert or update is deliberate, even though MongoDB can insert on update when a document does not exist. If the document does exist, I want to compare fields in the existing document against the data just pulled from the source before deciding whether to update and possibly publish a message to MQTT.
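For comparison, if I didn't need that field-by-field check, a single upsert would do the same job in one round trip. A minimal sketch using the same driver and the same hypothetical field tags as above:

# Update the document if it exists, insert it otherwise ('upsert')
my $result = $fixtures->update({$matchidtag => $matchid},
                               {'$set' => {$matchdatatag => $matchdata}},
                               {upsert => 1});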

However, as the code snippet shows, the basics are pretty straightforward.


Reading Data from MongoDB with PHP

As a spoiler for a future post, I’m using the Slim framework for my REST API, and so the following snippets are extracted from that infrastructure, but the focus will be on the MongoDB interactions.

<?php
/**
 * Database configuration
 */
$db_username = 'myuser';
$db_password = 'mypassword';
$db_host = 'localhost';
$db_port = '27017';
$db_name = 'database';

try {
    $conn = new Mongo("mongodb://$db_username:$db_password@$db_host:$db_port/$db_name");
    $db = $conn->selectDB($db_name);
} catch (MongoConnectionException $e) {
    echo "Failed to connect to Database: " . $e->getMessage();
    exit;
} catch (MongoException $e) {
    echo "Unexpected error when connecting to Database: " . $e->getMessage();
    exit;
}

// access collection
$collection = $db->Matches;

// execute query – $matchids is an array of match ids taken from the request (not shown)
$query = array("matchid" => array('$in' => $matchids));
$cursor = $collection->find($query);

header('Content-Type: application/json');
echo '{"matches": [';

// Output each item as JSON if the cursor returned any results
if ($cursor->hasNext()) {
    foreach ($cursor as $item) {
        echo json_encode($item);
        if ($cursor->hasNext()) echo ', ';
    }
}
// end JSON output
echo ']}';

Once again this is a snippet with some names changed, put together from several files and not exactly what gets executed, but it serves to demonstrate the techniques. While it does show some error handling on the MongoDB side, it could use improvement, not least in what gets returned to the client in the event of an error – but we shall leave that for another day.


Summary

I now have the foundation for the project in place – a database to store the data, a utility that lets me inspect the state of the database and perform basic management tasks, scripts to push data into the database from the source, and a mechanism to read the data back and check it is in order. Next up, crafting the PHP to provide an API for clients to consume the data.