Simon Bisson
Contributor

Azure Cosmos DB gets the multiple-master boost for distributed apps

analysis
Oct 09, 20186 mins
Cloud ComputingDatabasesMicrosoft Azure

Microsoft’s cloud database platform is updated to improve distributed systems performance at a global scale

Building distributed systems is hard. When you’re working with applications that span a planet, the speed of light is a brake on what you want to do, complicating data replication among servers and services. Someone buys a widget in Hong Kong at almost the same time as someone in Paris, but there’s only one in stock. How do you know who to bill, and who to tell the purchase failed? Whose purchase ends up being the one recorded in your line-of-business tools?

Azure as a distributed systems platform

Microsoft’s Azure drives work in distributed system design, powering services built around stateless microservices. Managing applications that work across this fabric requires two things: an effective way of handling messaging traffic and a way of handling distributed data at scale.

Microsoft Research’s work on its Orleans virtual actor framework handles the messaging side of the equation, both on its own in services like Xbox and as the basis for the actor/message patterns used in Azure Service Fabric. Meanwhile the data side is handled by Cosmos DB, Microsoft’s global-scale distributed database.

Cosmos DB is a fascinating piece of work, especially when you look at it as a practical implementation of some very complex computer science concepts. Instead of only offering either eventual or strong consistency models, like other cloud-scale systems, it offers a balance between accuracy and latency with three other models that map more closely to application design patterns and intentions. All five consistency models have high service levels, quickly replicating data among different instances and reducing the risk of conflicts. It’s these conflicts that cause issues like multiple purchases of the same item—issues that can’t be allowed to affect your business.

Multiple-master support comes to Cosmos DB

Until recently, conflicts were handled by only allowing writes to a single region while allowing reads from across the network. Now, however, Cosmos DB can handle operating in a multiple-master mode, with writes in multiple regions. That makes the service much more scalable, because updates are handled by the Cosmos DB network as a whole—while still supporting the service’s consistency models. Applications can write data to the nearest node, reducing latency, while Cosmos DB maintains its existing SLAs, so there should be no change to how quickly that data can be read.

Where things get interesting is when you start looking at how Cosmos DB handles conflict resolution, when you can write to any replica at any time. Like all databases, it has to handle three types of conflict: insert, replace, and delete. Insert conflicts happen when a record is inserted at the same time in multiple regions, with the same index. Replace conflicts are similar, though here existing records are updated. Similarly, delete conflicts can occur if a record is deleted from one region and updated from others.

Handling conflict resolution at scale

To prevent these conflicts, Cosmos DB implements three different mechanisms: Last Writer Wins, User-Defined Procedures, and Asynchronous. All three are available for its SQL API, but only one for its NoSQL APIs. You define the model you want to use when setting up a collection.

Most operations are handled by a Last Writer Wins mechanism (the only method available for NoSQL operations). Each write contains a numeric value that’s generated on write, using functions built into Cosmos DB and stored in the conflict-resolution path for the entry, usually a time stamp from the server that handles the operation. If a conflict occurs, the operation that has the largest value associated with it wins, if you’re inserting or updating data. Deletes always win, no matter what conflict-resolution data is part of the operation. In practice, you’re unlikely to have operations that have the same conflict-resolution path, but if they do, Cosmos DB eventually converges on a single winner, but there’s no way of predicting which one it might be.

Things get more complex with the other conflict-resolution methods, and you’re going to have to write stored procedures inside Cosmos DB to handle them using its internal JavaScript functions.

The User-Defined Procedures method builds on Last Writer Wins to trigger your own custom actions; for example, creating a collection that stores all conflicting documents before another process determines which version is valid, or automatically storing all new writes while applying custom logic to updates. You can choose the actions your code offers, depending on the type of data you’re storing and how you expect it to be used. In practice, you’re likely to implement a similar model to services like OneDrive where a conflict in the application prompts a user to retry an action or to change certain key information before data is saved.

The Asynchronous method goes one step further, storing all conflicts in a read-only feed that can be processed by your application. You access the conflict feed via the CosmosDB APIs, and you can apply different actions for each operation type. By taking conflicts and handling them asynchronously, you keep latency as low as possible for most operations, only slowing things when conflicts occur. The result is a relatively free flow of data, with conflicts processed as your code polls the conflict feed and then applies your conflict-resolution rules.

Beyond performance: taking Cosmos DB to the edge

So why has Microsoft implemented multiple masters in Cosmos DB? The obvious answer is performance. By allowing writes to take place as close to a user as possible, you can significantly reduce latency. Using this with new global load-balancing tools like Azure Frontdoor, you can automatically direct user data to your nearest Cosmos DB instance. A user from the UK visiting the US won’t need to send data back over the Atlantic, so you both give them the experience they expect and keep your bandwidth costs to a minimum.

There’s another tantalizing prospect: multiple-master mode lets Microsoft bring Cosmos DB to the edge of the network. It’s unlikely that you’ll run Cosmos DB on your own Windows servers, but there’s now an option for Azure to make it part of its own edge extensions: running on Azure Stack or on Azure IoT Edge. There’s also the possibility of using it as part of new hardware like the Data Box Edge data-upload tool, processing data from your servers and uploading it to Cosmos DB as part of a cloud ETL service.

With Cosmos DB running on Azure Stack, you can quickly load data from on-premises services and have it made available globally. A large retailer could use it to extend its warehouse stock control systems into its e-commerce service, giving users a better idea of both product availability and of expected delivery. Similarly, a large gas station company could use Cosmos DB instances to quickly share pump telemetry across the organization via IoT Edge.

With Cosmos DB a foundational tool for Azure’s own services, bringing it to the edge of the network should make delivering those complex distributed applications far easier to build and deploy.

Simon Bisson
Contributor

Author of InfoWorld's Enterprise Microsoft blog, Simon Bisson prefers to think of “career” as a verb rather than a noun, having worked in academic and telecoms research, as well as having been the CTO of a startup, running the technical side of UK Online (the first national ISP with content as well as connections), before moving into consultancy and technology strategy. He’s built plenty of large-scale web applications, designed architectures for multi-terabyte online image stores, implemented B2B information hubs, and come up with next generation mobile network architectures and knowledge management solutions. In between doing all that, he’s been a freelance journalist since the early days of the web and writes about everything from enterprise architecture down to gadgets.

More from this author