MongoDB enables you to meet the demands of modern apps with a technology foundation built on:
- The document data model – giving you the best way to work with data.
- A distributed systems design – allowing you to intelligently put data where you want it.
- A unified experience that gives you the freedom to run anywhere – future-proofing your work and eliminating vendor lock-in.
Building on the foundations above, MongoDB 4.0 is a significant milestone in the evolution of MongoDB, and we’ve just shipped the first Release Candidate (RC), ready for you to test.
Why is it so significant? Let’s take a quick tour of the key new features. And remember, you can learn about all of this and much more at MongoDB World'18 (June 26-27).
Multi-Document ACID Transactions
Previewed back in February, multi-document ACID transactions are part of the 4.0 RC. With snapshot isolation and all-or-nothing execution, transactions extend MongoDB's ACID data integrity guarantees to multiple statements and multiple documents across one or many collections. They feel just like the transactions you are familiar with from relational databases, are easy to add to any application that needs them, and don't change the way non-transactional operations are performed. With multi-document transactions it's easier than ever for all developers to address a complete range of use cases with MongoDB, while for many of them, simply knowing that transactions are available will provide critical peace of mind that they can meet any requirement in the future. In MongoDB 4.0 transactions work within a replica set, and MongoDB 4.2 will support transactions across a sharded cluster*.
To give you a flavor of what multi-document transactions look like, here is a Python code snippet of the transactions API.
with client.start_session() as s:
    s.start_transaction()
    try:
        collection.insert_one(doc1, session=s)
        collection.insert_one(doc2, session=s)
    except Exception:
        s.abort_transaction()
        raise
    s.commit_transaction()
And now, the transactions API for Java.
try (ClientSession clientSession = client.startSession()) {
    clientSession.startTransaction();
    try {
        collection.insertOne(clientSession, docOne);
        collection.insertOne(clientSession, docTwo);
        clientSession.commitTransaction();
    } catch (Exception e) {
        clientSession.abortTransaction();
    }
}
Our path to transactions represents a multi-year engineering effort, beginning over 3 years ago with the integration of the WiredTiger storage engine. We’ve laid the groundwork in practically every part of the platform – from the storage layer itself to the replication consensus protocol, to the sharding architecture. We’ve built out fine-grained consistency and durability guarantees, introduced a global logical clock, refactored cluster metadata management, and more. And we’ve exposed all of these enhancements through APIs that are fully consumable by our drivers. We are feature complete in bringing multi-document transactions to replica sets, and 90% done on implementing the remaining features needed to deliver transactions across a sharded cluster.
Take a look at our multi-document ACID transactions web page where you can hear directly from the MongoDB engineers who have built transactions, review code snippets, and access key resources to get started.
Aggregation Pipeline Type Conversions
One of the major advantages of MongoDB over rigid tabular databases is its flexible data model. Data can be written to the database without first having to predefine its structure. This helps you to build apps faster and respond easily to rapidly evolving application changes. It is also essential in supporting initiatives such as single customer view or operational data lakes to support real-time analytics where data is ingested from multiple sources. Of course, with MongoDB’s schema validation, this flexibility is fully tunable, enabling you to enforce strict controls on data structure, type, and content when you need more control.
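To make the schema validation point concrete, here is a minimal sketch of a `$jsonSchema` validator document. The collection name and field rules are hypothetical, chosen only for illustration; in a real deployment you would pass this document when creating or modifying a collection.

```python
# Hypothetical validator enforcing structure and types on a "customers"
# collection. The $jsonSchema document below is plain data, built here so
# its structure is easy to inspect.
validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string", "description": "required string"},
            "email": {
                "bsonType": "string",
                "pattern": "^.+@.+$",
                "description": "required string matching a basic email pattern",
            },
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}

# With pymongo (assumed installed), this could be applied as:
#   db.create_collection("customers", validator=validator)
```

Documents that fail validation would be rejected on insert or update, giving you tabular-style guarantees only on the fields you choose to constrain.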
So while MongoDB makes it easy to ingest data without complex cleansing of individual fields, it means working with this data can be more difficult when a consuming application expects uniform data types for specific fields across all documents. Handling different data types pushes more complexity to the application, and available ETL tools have provided only limited support for transformations. With MongoDB 4.0, you can maintain all of the advantages of a flexible data model, while prepping data within the database itself for downstream processes.
The new $convert operator enables the aggregation pipeline to transform mixed data types into standardized formats natively within the database. Ingested data can be cast into a standardized, cleansed format and exposed to multiple consuming applications – such as the MongoDB BI and Spark connectors for high-performance visualizations, advanced analytics and machine learning algorithms, or directly to a UI. Casting data into cleansed types makes it easier for your apps to process, sort, and compare data. For example, financial data inserted as a long can be converted into a decimal, enabling lossless and high precision processing. Similarly, dates inserted as strings can be transformed into the native date type.
When $convert is combined with the more than 100 operators available in the MongoDB aggregation pipeline, you can reshape, transform, and cleanse your documents without having to incur the complexity, fragility, and latency of running data through external ETL processes.
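As a sketch of what a $convert cleanup stage might look like, the pipeline below casts a mixed-type field to decimal inside a $project stage. The collection and field names ("trades", "price") are hypothetical; the pipeline is built as plain Python data so its shape is easy to see.

```python
# Hypothetical cleanup pipeline for a "trades" collection where "price" may
# have been ingested as a string, long, or double from different sources.
pipeline = [
    {
        "$project": {
            "symbol": 1,
            "price": {
                "$convert": {
                    "input": "$price",
                    "to": "decimal",   # lossless, high-precision target type
                    "onError": None,   # value to use if conversion fails
                    "onNull": None,    # value to use if input is missing/null
                }
            },
        }
    }
]

# With pymongo (assumed installed), this would run as:
#   db.trades.aggregate(pipeline)
```

The onError and onNull options are what make this practical for messy data: rather than aborting the whole aggregation on one bad document, you can substitute a sentinel value and filter or flag it downstream.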
Non-Blocking Secondary Reads
To ensure that reads can never return data that is out of causal order with the primary replica, MongoDB blocks readers while oplog entries are applied in batches to the secondary. This can cause secondary reads to have variable latency, which becomes more pronounced when the cluster is serving write-intensive workloads. Why does MongoDB need to block secondary reads? When a sequence of writes is applied to a document, MongoDB is designed so that every node shows those writes in the same causal order. So if you change field "A" in a document and then change field "B", it is not possible to see that document with field "B" changed but field "A" unchanged. Eventually consistent systems suffer from this behavior; MongoDB does not, and never has.
By taking advantage of storage engine timestamps and snapshots implemented for multi-document ACID transactions, secondary reads in MongoDB 4.0 become non-blocking. With non-blocking secondary reads, you now get predictable, low read latencies and increased throughput from the replica set, while maintaining a consistent view of data. Workloads that see the greatest benefits are those where data is batch loaded to the database, and those where distributed clients are accessing low latency local replicas that are geographically remote from the primary replica.
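Reading from secondaries remains opt-in via read preference, so existing applications see no behavior change. A minimal sketch using a connection string (the host names and database are hypothetical):

```python
# Hypothetical replica set URI directing reads to secondaries when available.
# Only the readPreference option is the point here; everything else is placeholder.
uri = (
    "mongodb://replica1.example.net,replica2.example.net/"
    "?replicaSet=rs0&readPreference=secondaryPreferred"
)

# With pymongo (assumed installed), against a 4.0 replica set:
#   client = MongoClient(uri)
#   client.mydb.orders.find_one()   # may be served by a nearby secondary,
#                                   # now without blocking on oplog batches
```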
40% Faster Data Migrations
Very few of today’s workloads are static. For example, the launch of a new product or game, or seasonal reporting cycles can drive sudden spikes in load that can bring a database to its knees unless additional capacity can be quickly provisioned. If and when demand subsides, you should be able to scale your cluster back in, rightsizing for capacity and cost.
To respond to these fluctuations in demand, MongoDB enables you to elastically add and remove nodes from a sharded cluster in real time, automatically rebalancing the data across nodes in response. The sharded cluster balancer, responsible for evenly distributing data across the cluster, has been significantly improved in MongoDB 4.0. By concurrently fetching and applying documents, shards can complete chunk migrations up to 40% faster, allowing you to more quickly bring new nodes into service at just the moment they are needed, and scale back down when load returns to normal levels.
Extensions to Change Streams
Change streams, released with MongoDB 3.6, enable developers to build reactive, real-time, web, mobile, and IoT apps that can view, filter, and act on data changes as they occur in the database. Change streams enable seamless data movement across distributed database and application estates, making it simple to stream data changes and trigger actions wherever they are needed, using a fully reactive programming style.
With MongoDB 4.0, change streams can now be configured to track changes across an entire database or whole cluster. Additionally, change streams now return a cluster time associated with each event, which the application can use to provide an associated wall clock time for the event.
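A sketch of the new, wider scopes: the filtering pipeline below matches only write events, and the field names follow the change event document format. The loop is shown in comments because it requires a running 4.0 replica set; `client.watch()` and `db.watch()` are the database- and cluster-level entry points in the Python driver.

```python
# Change stream filter: pass only inserts, updates, and replaces.
# Built as plain data so the $match shape is easy to inspect.
pipeline = [
    {"$match": {"operationType": {"$in": ["insert", "update", "replace"]}}}
]

# With pymongo (assumed installed), against a MongoDB 4.0 deployment:
#   for change in client.watch(pipeline):      # whole cluster
#       print(change["clusterTime"], change["operationType"])
#
#   for change in client.mydb.watch(pipeline): # a single database
#       ...
```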
Getting Started with MongoDB 4.0
Hopefully this gives you a taste of what’s coming in 4.0. There’s a stack of other stuff we haven’t covered today, but you can learn about it all in the resources below.
To get started with the RC now:
- Head over to the MongoDB download center to pick up the latest development build.
- Review the 4.0 release notes.
- Sign up for the forthcoming MongoDB University training on 4.0.
And you can meet our engineering team and other MongoDB users at MongoDB World'18 (June 26-27).
* Safe Harbor Statement
This blog post contains “forward-looking statements” within the meaning of Section 27A of the Securities Act of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as amended. Such forward-looking statements are subject to a number of risks, uncertainties, assumptions and other factors that could cause actual results and the timing of certain events to differ materially from future results expressed or implied by the forward-looking statements. Factors that could cause or contribute to such differences include, but are not limited to, those identified in our filings with the Securities and Exchange Commission. You should not rely upon forward-looking statements as predictions of future events. Furthermore, such forward-looking statements speak only as of the date of this post.
In particular, the development, release, and timing of any features or functionality described for MongoDB products remains at MongoDB’s sole discretion. This information is merely intended to outline our general product direction and it should not be relied on in making a purchasing decision nor is this a commitment, promise or legal obligation to deliver any material, code, or functionality. Except as required by law, we undertake no obligation to update any forward-looking statements to reflect events or circumstances after the date of such statements.