Kuaidi uses MongoDB at the heart of its taxi hailing service, connecting drivers with passengers up to 6 million times a day, and managing nearly half a billion orders. Kuaidi has scaled MongoDB across 4 geographic regions, serving thousands of reads and writes every second.
Following his presentation at last month’s MongoDB Day in Beijing, I sat down with Ouyang Kang, Chief Architect at Kuaidi, to learn more about how China’s leading taxi booking application is using MongoDB, and his recommendations for those getting started with the database.
Smartphone-based taxi-calling and ride-sharing services are growing at an astounding rate – attracting significant investment (and huge company valuations). They are also intensely competitive, and the choice of technology can ultimately drive success or failure in the market. In the world’s most populous country – and one suffering some of the most severe traffic congestion – the importance of using agile and scalable technology for transportation services is magnified.
Please start by telling us a little bit about your company
Kuaidi was founded in 2012 and has grown to become Greater China’s largest car service application¹, attracting investment from Alibaba and Matrix Partners. In just 2 years, we have attracted 100 million users who place up to 6 million ride requests every day via our smartphone app, connecting them to 3 million drivers in more than 300 cities across China. And we are continuing to grow fast.
The goal of Kuaidi Group is to improve the efficiency of urban transportation and the population’s quality of life. We currently operate 2 branded services – Kuaidi Taxi and Kuaidi ONE – which provide taxi and chauffeured limousine services respectively. Our long term plan is to offer services for every facet of passenger transportation combining location-based mobile technologies, data mining of our huge user base and intelligent routing algorithms.
Tell us how you use MongoDB
At the heart of our taxi booking application is the location-based service, and we rely on MongoDB for this. Using MongoDB’s geospatial indexes and queries, we track the location of our drivers in real time, connect users with their closest taxi, and display updates directly in the customer’s app. The location data is constantly being updated and queried.
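As an editorial illustration (not Kuaidi's code), the sketch below shows roughly what a "closest taxi" lookup could look like. `near_query` builds a MongoDB `$near` filter against a GeoJSON point; it assumes a hypothetical `location` field that carries a `2dsphere` index. Because no database is attached here, `nearest_drivers` is an in-memory stand-in that uses the haversine formula to mimic the nearest-first ordering such a query returns. The field names, driver IDs, and coordinates are all invented for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def near_query(lon, lat, max_meters):
    """Build a $near filter on a hypothetical `location` field (needs a 2dsphere index)."""
    return {
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                "$maxDistance": max_meters,
            }
        }
    }


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_drivers(drivers, lon, lat, max_meters):
    """In-memory stand-in for the query: drivers within range, nearest first."""
    hits = []
    for d in drivers:
        dlon, dlat = d["location"]["coordinates"]
        dist = haversine_m(lon, lat, dlon, dlat)
        if dist <= max_meters:
            hits.append((dist, d["driver_id"]))
    return [driver_id for _, driver_id in sorted(hits)]


# Invented sample data: two drivers in Beijing, one in Shanghai.
drivers = [
    {"driver_id": "d1", "location": {"type": "Point", "coordinates": [116.40, 39.90]}},
    {"driver_id": "d2", "location": {"type": "Point", "coordinates": [116.41, 39.91]}},
    {"driver_id": "d3", "location": {"type": "Point", "coordinates": [121.47, 31.23]}},
]

print(nearest_drivers(drivers, 116.401, 39.901, 2000))
```

With a live deployment, the dictionary returned by `near_query` would be passed to the driver's find call instead of scanning a list in application code; the index is what keeps that lookup fast while locations are constantly being rewritten.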
We also use MongoDB as an active archive of our order data. Each time a customer requests a taxi, the journey’s start and end points, the driver identity and fare are stored in a single record. We initially built our archive on top of MySQL, but once our order volume exceeded 100 million records, we hit scaling limits. We knew MongoDB scaled, so we migrated the archive to get the cost and performance benefits of horizontal scale out.
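To make the "single record" idea concrete, here is a minimal sketch of how one order might be shaped as a document. The schema is an assumption for illustration only: the field names (`driver_id`, `start`, `end`, `fare_fen`) and the choice to store the fare as an integer number of fen are hypothetical and do not come from Kuaidi.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    driver_id: str
    start: tuple   # (lon, lat) of the pickup point
    end: tuple     # (lon, lat) of the drop-off point
    fare_fen: int  # fare in fen (1 yuan = 100 fen) to avoid float rounding


def to_document(order):
    """Flatten one order into a single self-contained document."""
    return {
        "_id": order.order_id,
        "driver_id": order.driver_id,
        "start": {"type": "Point", "coordinates": list(order.start)},
        "end": {"type": "Point", "coordinates": list(order.end)},
        "fare_fen": order.fare_fen,
    }


# Invented sample order.
doc = to_document(Order("o-1001", "d1", (116.40, 39.90), (116.46, 39.93), 3200))
print(doc)
```

Keeping everything about a journey in one document means an archived order can be read back with a single lookup by `_id`, with no joins across tables.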
What other databases do you use?
We use Redis for caching and MySQL to store operational customer and order data. We also replicate data from MongoDB and MySQL into Hadoop for data mining and analytics.
Did you consider other databases for your app? What made you select MongoDB?
We considered three options for our location-based service:
- Relational solutions based on MySQL and Postgres
- SOLR (for the search element of the application)
- MongoDB
We evaluated each on multiple criteria, including:
- Performance. We measure performance on multiple dimensions: latency, which is critical for good user experience on mobile apps; and speed of real time updates, so we are always working from the freshest data
- Scalability. We were confident that the service would quickly gain traction, so knowing we could scale our database on demand was paramount
- Ease-of-Use. We needed to achieve our performance and scalability goals without burdening our developer and operations team with complexity
We evaluated all of the options against these criteria and found MongoDB to be the best choice for us. It met our performance objectives, and we found it easy to develop against. Most importantly, it proved easy to deploy and easy to run at scale.
Please describe your MongoDB deployment
Our MongoDB database is sharded across four geographic regions. A 7-node replica set is deployed in each region (6 data-bearing nodes and an arbiter). This deployment enables us to place data physically closer to local users for low latency access, as well as provide the scalability and resilience our application needs. We cannot tolerate downtime at all. We use Nagios for monitoring the application and database.
Geo-Distributed MongoDB Deployment at Kuaidi
We are running MongoDB 2.6 with the Java driver.
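For readers unfamiliar with this topology, the following sketch builds a replica set configuration document of the kind described: six data-bearing members plus one arbiter per region. The document shape (`_id`, `members`, `host`, `arbiterOnly`) follows MongoDB's replica set configuration format, but the set names, region labels, and host names are hypothetical placeholders, not Kuaidi's actual settings.

```python
def replica_set_config(name, data_hosts, arbiter_host):
    """Build a replica set config: data-bearing members plus one arbiter."""
    members = [{"_id": i, "host": h} for i, h in enumerate(data_hosts)]
    members.append({"_id": len(data_hosts), "host": arbiter_host, "arbiterOnly": True})
    return {"_id": name, "members": members}


def majority(voting_members):
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


# Hypothetical region labels and host names, one replica set per region.
configs = {}
for region in ["region-1", "region-2", "region-3", "region-4"]:
    data_hosts = [f"{region}-db{i}.example.internal:27017" for i in range(1, 7)]
    arbiter = f"{region}-arb.example.internal:27017"
    configs[region] = replica_set_config(f"rs-{region}", data_hosts, arbiter)

voters = len(configs["region-1"]["members"])
print(voters, majority(voters), voters - majority(voters))
```

With seven voting members, four votes form a majority, so a set like this can lose up to three members and still elect a primary. The arbiter contributes a vote without storing data, which keeps the member count odd.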
Are there any metrics you can share?
Yes.
- MongoDB is serving 50,000 operations per second (split 80:20 between reads and writes)
- Our database has grown to just under half a billion documents and continues to scale
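As a quick worked example of the first figure, the 80:20 split can be turned into absolute read and write rates. This is simple arithmetic on the quoted numbers and assumes the ratio applies to the 50,000 operations per second as stated.

```python
def split_ops(total_ops_per_sec, read_pct):
    """Split a total operation rate into (reads, writes) using integer math."""
    reads = total_ops_per_sec * read_pct // 100
    return reads, total_ops_per_sec - reads


reads, writes = split_ops(50_000, 80)
print(reads, writes)  # 40000 10000
```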
Do you have plans to use MongoDB for other applications?
Our marketing team stores all of its promotions and messaging in MySQL, but is starting to hit scaling limits. As a result, it is not keeping pace with their demands. We are evaluating migrating this to MongoDB as well.
What feature of the forthcoming MongoDB 2.8 release are you most looking forward to?
It has to be document-level concurrency control. As our service continues to grow, we need to scale to keep pace, especially for writes. This is something we believe MongoDB 2.8, with its new WiredTiger storage engine, will allow us to do.
What advice would you give someone who is considering using MongoDB for their next project?
Don’t just follow the crowd. Don’t just choose the same technology you have always chosen. There is so much innovation happening today, and the databases of the last decade are not always the right choice.
Once you have a short-list of potential technologies, test them with your app, your queries and your data. It is the only way to be sure you are choosing the right technology going forward.
Ouyang, thank you for your time, and for sharing your experiences with the MongoDB community.
¹ Based on market share and transaction volume