Which Presents Should Be Packed Together?
In the past, one magical bag fit all, but Santa learned that efficiently grouping presents into several different bags can really reduce his carbon footprint and save on reindeer overhead. And since Christmas is celebrated at different times across the globe, Santa needs to follow the sun when delivering presents. The number of bags and their contents is one consideration, but so is when and where. Santa knows that time zone is an important field to take into account:
//"multi": true updates every matching document, not just the first one
db.presents.update( { "location.address.country": {"$in": ["PT", "UK", "IR" ] } }, {"$set": {"location.time_zone": "WET"}}, {"multi": true} )
Because MongoDB has a very flexible data structure, Master Elf DBA can easily make changes to the application without migrating the data. Santa and his team can easily adapt to the evolving needs of the holiday season without ever asking the world to reschedule Christmas:
{
"_id": 123300410230,
"name": "Norberto Leite",
"location":{
"geo": {
"type": "Point",
"coordinates": [ -8.611720204353333, 41.14341253682037]
},
"address" : {
"country": "PT",
"city": "Porto",
"street": "Rua Escura 1",
"zip": [4000, "Porto"]
},
"time_zone": "WET"
}
}
With this information in our schema, Santa can now allocate presents into bags for efficient delivery in each time zone.
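As a rough illustration of that allocation, here is a minimal in-memory sketch in plain JavaScript (no MongoDB connection needed). The helper name packBagsByTimeZone, the "UNKNOWN" bag and the sample documents are all made up for this example; only the location.time_zone field comes from the schema above.

```javascript
// Hypothetical helper: group present documents into one "bag" per time zone.
// Documents without location.time_zone go into an "UNKNOWN" bag.
function packBagsByTimeZone(presents) {
  const bags = new Map();
  for (const doc of presents) {
    const tz = (doc.location && doc.location.time_zone) || "UNKNOWN";
    if (!bags.has(tz)) {
      bags.set(tz, []);
    }
    // Each bag keeps the _id of every present that belongs in it
    bags.get(tz).push(doc._id);
  }
  return bags;
}

// Made-up sample data, for illustration only
const samplePresents = [
  { _id: 1, location: { time_zone: "WET" } },
  { _id: 2, location: { time_zone: "EST" } },
  { _id: 3, location: { time_zone: "WET" } },
  { _id: 4, location: {} },
];

const bags = packBagsByTimeZone(samplePresents);
console.log(bags.get("WET")); // ids of the presents packed into the WET bag
```

In a real deployment this grouping would be done by the database, not by application code; the sketch only shows the idea of one bag per time zone.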
Santa needs to make sure that the Elves organize the bags optimally. Elves are very cool developers, true hackers, and know that MongoDB offers an aggregation framework that allows them to perform real-time analytics and demanding aggregation queries. The Elves collect information on how to group presents by time zone using a document:
{
"_id": 123300410230,
"name": "Norberto Leite",
"location":{
"geo": {
"type": "Point",
"coordinates": [ -8.611720204353333, 41.14341253682037]
},
"address" : {
"country": "PT",
"city": "Porto",
"street": "Rua Escura 1",
"zip": [4000, "Porto"]
},
"time_zone": "WET"
},
"present": {
"type": "game console",
"brand": "NotPlaystation",
"name": "PS",
"model": "4"
}
}
The Elves can then run queries that collect and group the presents based on time zone and present name:
//collect all presents for Santa's Bag for EST time zone
db.presents.aggregate( [
{ "$match": { "location.time_zone": "EST" } } ,
// group by present name and count the presents in each group
{"$group" : { "_id": "$present.name", "numberOf": {"$sum":1} } }
]
)
This flexibility and power make the Elves feel very good about themselves!
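For readers who want to see what the two pipeline stages do, here is a plain JavaScript sketch that mimics them on an in-memory array. It is only an illustration of the logic, not how MongoDB executes the pipeline; the function name countPresentsByName and the sample data are hypothetical.

```javascript
// In-memory imitation of the $match + $group stages, for illustration only.
function countPresentsByName(presents, timeZone) {
  const counts = new Map();
  for (const doc of presents) {
    // $match: keep only documents for the requested time zone
    if (!doc.location || doc.location.time_zone !== timeZone) {
      continue;
    }
    // $group: use present.name as the group key (null when the field is missing)
    const hasName = doc.present && doc.present.name !== undefined;
    const key = hasName ? doc.present.name : null;
    // {"$sum": 1}: add one per document in the group
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Same shape as the aggregation output: { _id, numberOf }
  return Array.from(counts, ([name, numberOf]) => ({ _id: name, numberOf }));
}

// Made-up sample data
const wishes = [
  { location: { time_zone: "EST" }, present: { name: "PS" } },
  { location: { time_zone: "EST" }, present: { name: "Bike" } },
  { location: { time_zone: "EST" }, present: { name: "PS" } },
  { location: { time_zone: "WET" }, present: { name: "PS" } },
];

const estBag = countPresentsByName(wishes, "EST");
console.log(estBag);
```

Note that MongoDB does not guarantee the order of the groups returned by $group, so callers should not rely on the order of the result.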
Santa Reads Each Letter One By One
First of all, paper letters are not in fashion anymore. Santa uses email.
For the traditionalists who like to send paper letters, we’re pretty sure that Santa is using a MongoDB client with technology to process ordinary paper letters, digitize them and deliver them via email. (In addition to email you can also reach Santa by Twitter or even by Facebook. But if you want to keep your present requests private and away from social criticism, be careful about that!)
Santa will decide if you deserve your present requests by checking if you’ve been a good boy or girl on your Facebook and Twitter profiles! OK, that’s not true. Santa asks your parents for feedback -- Parental Control 101! Be good to your mom and dad and you’ll get your presents. Santa uses your social profile to get in touch with your parents and asks them to validate that you have been a good boy or girl this year:
db.presents.update( {"_id": 123300410230}, {"$set": { "present.approved": { "whom": "dad", "when": ISODate("2014-12-18T11:27:05.228Z") } }})
Santa will only deliver the presents that have approval. Therefore we need to check which presents have the approval field:
db.presents.find( {"present.approved" : {"$exists": true} })
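The existence check can be pictured with a small plain JavaScript sketch that walks a dotted path such as present.approved (the field written by the update above). The helper fieldExists and the sample wish list are hypothetical, and the helper is a simplification: unlike MongoDB, it does not look inside arrays of sub-documents.

```javascript
// Returns true if the dotted path (e.g. "present.approved") exists in doc.
// Simplification: it does not descend into arrays the way MongoDB does.
function fieldExists(doc, dottedPath) {
  let current = doc;
  for (const part of dottedPath.split(".")) {
    if (current === null || typeof current !== "object" || !(part in current)) {
      return false;
    }
    current = current[part];
  }
  return true;
}

// Made-up sample data
const wishList = [
  { _id: 1, present: { name: "PS", approved: { whom: "dad" } } },
  { _id: 2, present: { name: "Bike" } },
  { _id: 3, present: { name: "Book", approved: null } },
];

const deliverable = wishList
  .filter((doc) => fieldExists(doc, "present.approved"))
  .map((doc) => doc._id);
console.log(deliverable);
```

As with $exists in MongoDB, a field that is present but set to null still counts as existing, so document 3 passes the check. If a null approval should not count, the query needs an extra condition on the value.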
Santa and His Elves All Live in Lapland
We all know that the real Santa Claus lives in Finnish Lapland. But like MongoDB, Santa decided to have personnel all over the globe. There are lots of advantages, including:
- Engagement with local culture
- Removal of language barriers
- More efficient distribution of presents
- Less fuel consumption (reindeer also produce greenhouse gas emissions!)
Animal and Wildlife agencies were also pressuring Santa to reduce the stress caused by constant jet-lag endured by the reindeer.
From an operational perspective these seem like best practices. But what about the data? How would the beach bum Elves, who work from sunny Barcelona, be able to efficiently access the data that tells them what needs to be bagged for Europe? How would the parents in Australia be able to approve their children’s requests given that data would need to travel all the way back to Santa’s Lapland? Well, for that MongoDB also comes to the rescue.
MongoDB offers different ways to partition data and distribute load across data centers through a technique called sharding. To accomplish this efficiently we need to select a good shard key. Santa and Master Elf (certified MongoDB DBA) set out to analyze the types of queries (functionality requested) and understand how to best distribute data (load distribution) across all aspects of the application. This was their line of thought:
- Presents will be put in bags according to time zone (distribution of load)
- Elves that work on the bagging process only need access to data for their time zone (read isolation)
- Parents need to approve their children’s requests, and we want to minimize the latency for that operation (local writes)
- New present requests and parent operations must be very fast and processed concurrently (write distribution)
- There’s a risk of saturation in some time zones, so we might need to have more nodes per time zone (capacity distribution)
- Santa’s present intake dashboard should perform as if all data were in Lapland (local reads)
Given this scenario, Santa asked Master Elf to shard data based on {time_zone} and {_id}. This gives a combination of locality and high cardinality. The shard key also reflects the majority of the queries in the system, allowing a good, effective distribution of data and load.
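To see why a compound key on time zone and _id offers both locality and cardinality, here is a simplified sketch of range-based routing in plain JavaScript. It is not how MongoDB is implemented internally; the chunk boundaries, the shard names and the helpers compareKeys and findShard are all made up for illustration.

```javascript
// Compare two compound keys [time_zone, _id]: first by time zone, then by _id.
function compareKeys(a, b) {
  if (a[0] !== b[0]) return a[0] < b[0] ? -1 : 1;
  if (a[1] !== b[1]) return a[1] < b[1] ? -1 : 1;
  return 0;
}

// Hypothetical chunk table: each chunk covers keys in [min, max) and lives on one shard.
// -Infinity / Infinity stand in for the lowest / highest possible _id.
const chunks = [
  { min: ["EST", -Infinity], max: ["EST", 5000], shard: "shard-est-1" },
  { min: ["EST", 5000], max: ["EST", Infinity], shard: "shard-est-2" },
  { min: ["WET", -Infinity], max: ["WET", Infinity], shard: "shard-wet-1" },
];

// Find the shard whose chunk range contains the key, or null if none does.
function findShard(key) {
  const chunk = chunks.find(
    (c) => compareKeys(c.min, key) <= 0 && compareKeys(key, c.max) < 0
  );
  return chunk ? chunk.shard : null;
}

console.log(findShard(["WET", 123300410230]));
console.log(findShard(["EST", 42]));
console.log(findShard(["EST", 9000]));
```

In this sketch every WET document lands on the same shard (locality), while the _id part of the key lets a busy time zone such as EST be split into several chunks on different shards (cardinality).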
Master Elf created the sharded cluster through the following procedure:
- Launched config servers (distributed across the time zone data centers)
- Launched multiple mongos processes in each of Santa’s data centers (more than one for mongos failover!)
- Added the existing replica set to the sharded cluster
- Redirected Santa’s presents management application to the new mongos processes
Then finally Master Elf sharded the presents collection:
//enable sharding on the 2014 presents database
sh.enableSharding( "2014");
//let's not forget to create the appropriate index (on the presents collection of the 2014 database)
db.getSiblingDB("2014").presents.createIndex( { "location.time_zone":1, "_id":1 })
//set the shard key on the presents collection
sh.shardCollection("2014.presents", { "location.time_zone":1, "_id":1 } )
Obviously Master Elf did all of this work from the shell. But he did not need to do that! Master Elf could have simply connected to his MongoDB Management Service (MMS) account and deployed a sharded cluster with just a couple of clicks. But that, my friends, is material for next year’s story!
Santa Claus Must Be Loaded!
Well, Santa Claus is obviously a very generous person but that does not make him a crazy spender! Like any good CTO or CEO, Santa wants to make sure he keeps to the yearly budget while still providing the elves with all the right tools and infrastructure they need to do their jobs.
By having everyone certified as either MongoDB developers or DBAs, Santa avoids any issues during the holidays. Santa also cuts costs by reducing infrastructure spend during the off season. Santa has an elastic infrastructure that allows him to deploy more nodes efficiently to accommodate the seasonal nature of the application. He holds two replicas of all time zones in Lapland and a local primary in each time zone data center, and he shuts down the local primaries during the off season. This allows Santa to deploy processing power only when needed.
Santa Uses MongoDB!
Of course he does! Why do you think Elves dress in green? Considering all the cool features that MongoDB offers:
- Geo Distribution of Data
- Write Distribution
- Read Isolation
- Geospatial indexes
- Aggregation Framework
- Dynamic Schema
Why wouldn’t he!?
If you're interested in learning more about how Santa put together his Christmas plan, download the MongoDB Architecture guide here:
About Norberto Leite
Norberto Leite is Technical Evangelist at MongoDB. Norberto has been working for the last 5 years on large scalable and distributable application environments, both as advisor and engineer. Prior to MongoDB Norberto served as BigData Engineer at Telefonica.