Today I'm building on my first MongoDB discovery and looking at documents in more depth. To start, let's implement Christmas tree purchases in the database. The first task is picking a tree to buy! I searched the database for a tree I liked and used the findOne() function to return a single tree.
Next I created a customer collection to hold all the people who bought trees:
One thing you may have noticed is that each document has a field called _id with a value of ObjectId(...) containing a hex number. These hex digits are not randomly generated. Instead, they encode information about when and where the id was generated. The first eight hex digits (four bytes) of the ObjectId are a timestamp, in seconds since the Unix epoch, of when the id was created. The rest of the id is broken down into three pieces: the machine ID, the process ID, and a counter that increments each time an ObjectId is generated. Together these pieces make a very reliable unique key (the chance of a collision is small enough that you don't have to worry about it).
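To make the layout concrete, here is a minimal sketch in plain JavaScript that splits an ObjectId hex string into the pieces described above. The decodeObjectId helper and the sample id are illustrative assumptions, not part of the tree database. The machine ID / process ID split follows the description in this post; newer MongoDB releases document those middle bytes as a random value instead.

```javascript
// Hypothetical helper: split a 24-digit ObjectId hex string into the
// pieces described in this post. Not part of MongoDB itself.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex digits");
  }
  // First 4 bytes: creation time in seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    createdAt: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),              // next 3 bytes
    processId: hex.slice(14, 18),             // next 2 bytes
    counter: parseInt(hex.slice(18, 24), 16), // last 3 bytes
  };
}

// Sample id chosen purely for illustration.
const decoded = decodeObjectId("507f1f77bcf86cd799439011");
console.log(decoded.createdAt.toISOString());
console.log(decoded.machineId, decoded.processId, decoded.counter);
```

In the mongo shell the creation time is available directly by calling getTimestamp() on an ObjectId, so a helper like this is only useful for seeing how the digits are laid out.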
So why am I going into the details of the ObjectId? These unique ids are commonly used to link documents together by creating a property on one document that contains the id of another document. In the case of the Christmas tree database, I created a collection for purchases. In this collection, each document is linked to both the purchased tree and the customer. I took the customer document ids and put them in the purchase document:
You will notice there are some duplicated fields from other collections in the purchase document (such as the username property). This sort of duplication is frowned upon in an RDBMS; however, since there are no JOINs in MongoDB, duplication is okay [1].
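The linking idea can be sketched without a database at all. The example below uses plain JavaScript arrays as stand-ins for the tree, customer, and purchase collections. Every id, the treeId and customerId field names, and the findById helper are assumptions made up for illustration; only the username duplication comes from the post.

```javascript
// In-memory stand-ins for the three collections (all values hypothetical).
const trees = [
  { _id: "t1", type: "fir" },
  { _id: "t2", type: "spruce" },
];
const customers = [
  { _id: "c1", username: "andy" },
  { _id: "c2", username: "joe" },
];
// A purchase stores the ids of the linked tree and customer, and
// duplicates the username so it can be shown without a second lookup.
const purchases = [
  { _id: "p1", treeId: "t2", customerId: "c1", username: "andy" },
];

// Follow a stored id to the document it points at. Against a real
// database this would be a findOne({ _id: id }) on the linked collection.
function findById(collection, id) {
  return collection.find((doc) => doc._id === id) || null;
}

const purchase = purchases[0];
const purchasedTree = findById(trees, purchase.treeId);
const buyer = findById(customers, purchase.customerId);
console.log(purchasedTree.type, buyer.username);
```

The trade-off of the duplicated username is that it has to be kept in sync by hand if the customer document ever changes.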
I've demonstrated how to link related documents in MongoDB, making it easy to find a linked document without a JOIN operation. Let's take a step back and look at the first query I made for picking out a Christmas tree. I called the explain() function on this query to find useful execution information:
The most important property in the returned JSON object is totalDocsExamined. Notice that the query looked at every single document in the collection. Now imagine how slow this could be if there were millions of documents in the collection! For anyone who has used databases before, the solution should come to mind: an index. Let's add indexes to the commonly queried fields:
You may be wondering about the significance of the value 1. This means that the index is stored in ascending order, while a -1 means descending order [2]. When I call explain() again, only the returned documents are examined. Much better!
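The difference between the two explain() results can be illustrated with a toy model in plain JavaScript. This is only a sketch of the idea: a Map stands in for MongoDB's sorted index structure, and the collection contents, the scan, buildIndex, and indexedFind helpers, and the type field are all assumptions for illustration.

```javascript
// Hypothetical collection of 1000 tree documents; 10 of them are firs.
const treeDocs = [];
for (let i = 0; i < 1000; i++) {
  treeDocs.push({ _id: i, type: i % 100 === 0 ? "fir" : "pine" });
}

// Collection scan: every document is examined to find the matches.
function scan(docs, type) {
  let examined = 0;
  const results = [];
  for (const doc of docs) {
    examined++;
    if (doc.type === type) results.push(doc);
  }
  return { results, examined };
}

// Toy index: maps each value of a field to the documents holding it.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc[field])) index.set(doc[field], []);
    index.get(doc[field]).push(doc);
  }
  return index;
}

// Indexed lookup: only the matching documents are touched.
function indexedFind(index, value) {
  const results = index.get(value) || [];
  return { results, examined: results.length };
}

const withoutIndex = scan(treeDocs, "fir");
const withIndex = indexedFind(buildIndex(treeDocs, "type"), "fir");
console.log(withoutIndex.examined, withIndex.examined);
```

Both lookups return the same documents; only the amount of work changes, which is what totalDocsExamined reports.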
All the indexes on a collection can be displayed as well:
Indexes are used for other purposes besides speeding up query times. They can expire documents in a time-to-live (TTL) collection [3]. These collections use an index on a date field to decide when a document expires. In order to create a TTL collection, a date property needs to exist on the documents. In the tree documents I set this date to Christmas Eve, since nobody will buy a tree after then.
Next I created an index on the availableUntil property. The second parameter of createIndex() contains additional options, in this case expiring the document zero seconds after the date stored in that property.
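The expiry rule itself is simple enough to sketch in plain JavaScript. The example below is a simulation, not MongoDB's implementation: the expiredDocs helper, the sample documents, and the year in the dates are assumptions for illustration. It applies the rule that a document expires once the current time reaches the date in the indexed field plus the configured number of seconds.

```javascript
// Return the documents a TTL pass would remove at time `now`.
// Hypothetical helper that mimics the rule; not a MongoDB API.
function expiredDocs(docs, field, expireAfterSeconds, now) {
  return docs.filter((doc) => {
    const value = doc[field];
    // Documents without a date in the indexed field never expire.
    if (!(value instanceof Date)) return false;
    return now.getTime() >= value.getTime() + expireAfterSeconds * 1000;
  });
}

// The year is an assumption; the post only says "Christmas Eve".
const christmasEve = new Date("2016-12-24T00:00:00Z");
const ttlTrees = [
  { _id: 1, availableUntil: christmasEve },
  { _id: 2 }, // no availableUntil, so it is kept forever
];

// expireAfterSeconds of 0 means "expire at the stored date".
const before = expiredDocs(ttlTrees, "availableUntil", 0, new Date("2016-12-23T12:00:00Z"));
const after = expiredDocs(ttlTrees, "availableUntil", 0, new Date("2016-12-24T00:00:01Z"));
console.log(before.length, after.length);
```

In a real deployment the removal is done by a background task that runs periodically, so an expired document may linger for a short while after its date has passed.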
I applied a lot of new MongoDB concepts to the tree database. The power of linked documents and indexes in MongoDB is now clear. I will look at MongoDB even more in my next discovery. The code for this discovery can be found on GitHub.