Why Field Level Updates Matter, and How They Work in MongoDB

Intro

One advantage MongoDB has over many NoSQL systems and key/value stores is the ability to update individual fields atomically, the same way developers are used to doing in an RDBMS. This is not limited to specific types of operations, and it works with any value.

We've been using YCSB recently to run some performance comparisons. One nice thing about YCSB is that it was designed to test whether a data store is "read optimized" or "write optimized". This is evident in how updates and reads are performed in the tests. By default YCSB records have ten fields, and in the default mixed workload (50% reads, 50% writes), updates change the value of one randomly selected field, and reads will read the full record.

Read Optimized

If a data store is read optimized, like MongoDB’s MMAP and WiredTiger B-tree storage engines, it does the hard work during the update: looking up the record, changing the appropriate fields to their new values, and saving the new record. This happens atomically inside the engine and the application developer doesn't need to worry about it - concurrency correctness is guaranteed by the storage engine. The read is then easy - just read the record as it's stored. These engines tend to be much faster at read-heavy workloads.
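The update-in-place idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how MMAP or WiredTiger are implemented: the `ReadOptimizedStore` class and its method names are hypothetical, and a single lock stands in for the engine's concurrency control.

```python
import threading


class ReadOptimizedStore:
    """Toy store (hypothetical): each key maps to one complete, current record."""

    def __init__(self):
        self._records = {}
        self._lock = threading.Lock()  # stand-in for the engine's concurrency control

    def insert(self, key, record):
        with self._lock:
            self._records[key] = dict(record)

    def update_field(self, key, field, value):
        # The hard work happens at write time: look up, change, save.
        with self._lock:
            record = dict(self._records[key])
            record[field] = value
            self._records[key] = record

    def read(self, key):
        # Reads are cheap: the stored record is already the current one.
        with self._lock:
            return dict(self._records[key])


store = ReadOptimizedStore()
store.insert("user1", {f"field{i}": "old" for i in range(10)})
store.update_field("user1", "field3", "new")
record = store.read("user1")
```

The caller never reads the record in order to update it; the lookup and the merge both happen inside `update_field`, which is why the application does not have to think about concurrency.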

Write Optimized

If the data store is write optimized, like RocksDB and TokuMX, or systems modeled on Google’s BigTable, then a write persists the update as the new value of the field, along with a timestamp. This approach does not require reading the record to perform the update, which makes writes much faster. The tradeoff is that reads are slower because they must assemble the full record: the original ten-field record must be merged with all of the single-field updates that have been applied since the full record was persisted. Timestamps are used to determine the true "current" value of each field. In these systems, compactions run periodically in the background to rewrite the records, which improves read performance and avoids storing multiple values for the same field.
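The append-then-merge behavior described above can be sketched as follows. This is an illustrative model only, assuming a hypothetical `WriteOptimizedStore` class with a logical counter in place of real timestamps; it does not reflect the internals of RocksDB, TokuMX, or BigTable.

```python
import itertools


class WriteOptimizedStore:
    """Toy store (hypothetical): updates are appended, reads merge them."""

    def __init__(self):
        self._base = {}    # key -> last fully persisted record
        self._deltas = {}  # key -> list of (timestamp, field, value)
        self._clock = itertools.count(1)  # logical timestamps

    def insert(self, key, record):
        self._base[key] = dict(record)
        self._deltas[key] = []

    def update_field(self, key, field, value):
        # No read needed: just append the new field value with a timestamp.
        self._deltas.setdefault(key, []).append((next(self._clock), field, value))

    def read(self, key):
        # Merge the base record with every delta, oldest first, so the
        # value with the highest timestamp wins for each field.
        record = dict(self._base.get(key, {}))
        for _, field, value in sorted(self._deltas.get(key, [])):
            record[field] = value
        return record

    def compact(self, key):
        # Rewrite the full record so later reads have nothing to merge.
        self._base[key] = self.read(key)
        self._deltas[key] = []


store = WriteOptimizedStore()
store.insert("user1", {f"field{i}": "old" for i in range(10)})
store.update_field("user1", "field3", "new")
store.update_field("user1", "field3", "newer")
merged = store.read("user1")
store.compact("user1")
```

Here `update_field` does constant work regardless of record size, while `read` pays for every delta written since the last `compact`.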

Why Field Level Updates Are Critical for YCSB

Without the ability to atomically set individual fields in a record, updates are a challenge - the only alternative is to replace the full record. If ten fields can only be replaced in full, how can you change a single field while preserving the other fields and not missing other changes to the record that are happening at the same time?

As the first step you would have to read the full record to preserve the nine other fields, add your own new field value, and then replace the record with this new copy. If you skipped the read and simply replaced the old record with the new field value, you would end up with a record with a single field instead of the ten fields you started with.

So you have to read the record to merge your new update with it, but what if another update (or nine more) is trying to do the same thing at exactly the same time? All ten could read the record, each change a different one of the ten fields, and each replace the old record with its own "new" copy. Now the tenth update will overwrite the other nine! It would be as if you performed ten "updates" really fast, but actually lost nine of them and only got to keep whichever happened to be the last one.
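The lost-update scenario is easy to reproduce deterministically. The sketch below uses a plain dictionary as a stand-in for a store that can only replace whole records; the key and field names are made up for illustration.

```python
# One record with ten fields, as in the YCSB default.
store = {"user1": {f"field{i}": "old" for i in range(10)}}

# Step 1: ten clients all read the full record at the same moment.
snapshots = [dict(store["user1"]) for _ in range(10)]

# Step 2: each client changes a different field in its own copy...
for i, snapshot in enumerate(snapshots):
    snapshot[f"field{i}"] = "new"

# Step 3: ...and replaces the whole stored record with that copy.
for snapshot in snapshots:
    store["user1"] = snapshot

surviving = [name for name, value in store["user1"].items() if value == "new"]
print(surviving)  # ['field9']
```

Every write "succeeded", yet only the last writer's field carries the new value; the other nine changes were silently overwritten.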

A Workaround – Compare and Swap

When a database supports field-level updates, atomicity and concurrency correctness are handled transparently by the database engine. In contrast, a K/V store that doesn't provide the ability to update fields relies on the application to do all the work - it either has to lock the record it is updating when it first performs the read, or it has to use a "Compare and Swap" method to replace the record only if it hasn't changed since it was first read. If the record has changed, the application has to go back to the beginning and start the process all over.
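A compare-and-swap retry loop might look like the sketch below. It assumes a hypothetical `VersionedKVStore` that keeps a version number next to each record; real K/V stores expose this differently (version tokens, ETags, CAS values), so treat the class and the `set_field` helper as illustrations rather than any product's API.

```python
import threading


class VersionedKVStore:
    """Toy key/value store (hypothetical) that replaces whole records only."""

    def __init__(self):
        self._data = {}  # key -> (version, record)
        self._lock = threading.Lock()

    def put(self, key, record):
        with self._lock:
            version = self._data.get(key, (0, None))[0] + 1
            self._data[key] = (version, dict(record))

    def get(self, key):
        with self._lock:
            version, record = self._data[key]
            return version, dict(record)

    def compare_and_swap(self, key, expected_version, new_record):
        # Replace the record only if nobody changed it since it was read.
        with self._lock:
            version, _ = self._data[key]
            if version != expected_version:
                return False
            self._data[key] = (version + 1, dict(new_record))
            return True


def set_field(store, key, field, value):
    """Application-side loop: read, modify, swap, start over on conflict."""
    attempts = 0
    while True:
        attempts += 1
        version, record = store.get(key)
        record[field] = value
        if store.compare_and_swap(key, version, record):
            return attempts


store = VersionedKVStore()
store.put("user1", {f"field{i}": "old" for i in range(10)})

threads = [
    threading.Thread(target=set_field, args=(store, "user1", f"field{i}", "new"))
    for i in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_version, final_record = store.get("user1")
```

No update is lost this time, but the cost has moved into the application: every conflicting writer has to re-read the full record and try again.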

Does the Compare and Swap technique have its place in the RDBMS and MongoDB? Absolutely. Sometimes a shared record needs to be read from the DB in order to show it to an end user, and to allow the end user to edit it. This can take a long time, so it's not a good idea to take a lock (or to start a transaction in an RDBMS). Instead the record can be read, and then when the end user decides to save, you check whether the record has been changed in the meantime. If it has, you can show the user the latest values and allow them to verify their changes.

If you can update individual fields, you don't even have to do a full compare. You can use a variant of this technique called "update if current".

When to Use "Update if Current"

There are some updates MongoDB can't do atomically, at least not yet. For example, if you need to update the value of a field based on another value in the document - like creating a new field out of two existing fields - you have to read those fields first and then set the new field in an update, but only if those other fields haven't changed since you read them. You don't have to read or compare the entire document, just the fields that you are basing your update on. We call this approach Update if Current.
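As a rough sketch of Update if Current, the function below reads two fields and then issues an update whose filter repeats the values it read. The field names (`first`, `last`, `full_name`) and the helper `set_full_name` are hypothetical. `FakeCollection` is an in-memory stand-in so the example runs without a server; it mimics only the small part of PyMongo's `find_one` and `update_one` used here (equality filters, `$set`, and a result object with `matched_count`).

```python
class FakeCollection:
    """Hypothetical in-memory stand-in so the sketch runs without a server.

    It supports only equality filters and the $set operator.
    """

    class Result:
        def __init__(self, matched_count):
            self.matched_count = matched_count

    def __init__(self, docs):
        self._docs = [dict(d) for d in docs]

    def _matches(self, doc, query):
        return all(doc.get(k) == v for k, v in query.items())

    def find_one(self, query):
        for doc in self._docs:
            if self._matches(doc, query):
                return dict(doc)
        return None

    def update_one(self, query, update):
        for doc in self._docs:
            if self._matches(doc, query):
                doc.update(update["$set"])
                return self.Result(1)
        return self.Result(0)


def set_full_name(collection, doc_id, max_attempts=5):
    """Derive full_name from first and last, but only if both are still current."""
    for _ in range(max_attempts):
        doc = collection.find_one({"_id": doc_id})
        if doc is None:
            return False
        result = collection.update_one(
            # The filter repeats the values we read: if either field has
            # changed since the read, nothing matches and nothing is written.
            {"_id": doc_id, "first": doc["first"], "last": doc["last"]},
            {"$set": {"full_name": doc["first"] + " " + doc["last"]}},
        )
        if result.matched_count == 1:
            return True
    return False


people = FakeCollection([{"_id": 1, "first": "Ada", "last": "Lovelace", "age": 36}])
updated = set_full_name(people, 1)

# An update based on an outdated value of "last" matches nothing.
stale = people.update_one(
    {"_id": 1, "first": "Ada", "last": "Byron"},
    {"$set": {"full_name": "Ada Byron"}},
)
```

Only `first` and `last` appear in the filter, so a concurrent change to an unrelated field such as `age` does not force a retry.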

YCSB is definitely not perfect - there are many features that it cannot test. However, when considering the goals set forth by its authors, it does a good job of evaluating how read-optimized or write-optimized systems perform on a variety of hardware platforms.

About the Author - Asya
Asya is Lead Product Manager at MongoDB. She joined MongoDB as one of the company's first Solutions Architects. Prior to MongoDB, Asya spent seven years in similar positions at Coverity, a leading development testing company. Before that she spent twelve years working with databases as a developer, DBA, data architect and data warehousing specialist.
