
Evaluating Performance: CosmosDB vs. Azure SQL


Introduction

Choosing the right database management system is crucial for the efficiency and scalability of an application. This article compares the performance of Microsoft's Cosmos DB and Azure SQL. We'll examine how each database performs under various load conditions and share some interesting findings.

CosmosDB vs. Azure SQL Server: Understanding the Differences

CosmosDB: A Multi-Model Database Service

CosmosDB stands out with its multi-model approach, supporting document, key-value, graph, and column-family data models. Its global distribution architecture and multi-master replication feature are designed to offer high availability and low latency for globally distributed applications.

Key Features:

  • Global Distribution: CosmosDB is inherently designed for global distribution, enabling data to be replicated in multiple regions worldwide.

  • Multi-Write Replicas: Offers support for multiple write and read replicas, enhancing data redundancy and availability.

  • Scalability: High scalability for handling large volumes of data with efficient resource management.

Azure SQL Server: A Relational Database Veteran

Azure SQL Server, the cloud version of the long-established relational DBMS, is known for its robust performance, security features, and complex query capabilities. It is widely preferred for traditional enterprise applications that require complex transactions and query processing.

Key Features:

  • Performance: Exceptional performance for complex queries and transaction processing.

  • Security: Advanced security features for data protection and access control.

  • Programmability: Azure SQL offers extensive programmability, enabling users to write complex stored procedures, functions, and triggers, enhancing database functionality and automation.

Azure SQL can be purchased in two pricing models: DTU (Database Transaction Units) and vCore. There is also a serverless offering, but its main benefit is pausing the database when it is completely idle and has no active connections, which is very unlikely for a real production application.

Cosmos DB for NoSQL has only one purchase model: request units (RU) per second. This approach aims for more consistent performance, because a given request, depending on the amount of data involved, always consumes the same number of RUs. If demand exceeds the database's capacity, the database responds with a 429 HTTP status code, indicating that the request should be retried later. The Cosmos DB SDK handles this error automatically by retrying the request. Cosmos DB also offers a serverless model where you are billed only for the RUs you use. This can be beneficial for applications with variable or unpredictable loads.
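The throttle-and-retry behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeContainer`, `ThrottledError`, and `insert_with_retry` are hypothetical names invented for this example, not part of the Cosmos DB SDK, and the one-second budget window is a simplification of how real throttling works.

```python
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""

    def __init__(self, retry_after_s):
        super().__init__("429: request rate too large")
        self.retry_after_s = retry_after_s


class FakeContainer:
    """Hypothetical database with a fixed RU budget per one-second window."""

    def __init__(self, ru_per_second, clock=time.monotonic):
        self.ru_per_second = ru_per_second
        self.clock = clock
        self.window_start = clock()
        self.used_ru = 0.0

    def insert(self, item, cost_ru):
        now = self.clock()
        if now - self.window_start >= 1.0:
            # A new window starts, so the full budget is available again.
            self.window_start = now
            self.used_ru = 0.0
        if self.used_ru + cost_ru > self.ru_per_second:
            # Budget exhausted: report how long until the window resets.
            raise ThrottledError(1.0 - (now - self.window_start))
        self.used_ru += cost_ru


def insert_with_retry(container, item, cost_ru, max_retries=5, sleep=time.sleep):
    """Insert an item, waiting and retrying whenever the request is throttled.

    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries + 1):
        try:
            container.insert(item, cost_ru)
            return attempt
        except ThrottledError as err:
            if attempt == max_retries:
                raise
            sleep(err.retry_after_s)
```

With a budget of 10 RU per second and a cost of 7.05 RU per insert, the first call succeeds immediately and the second one is throttled once before it goes through. The retry adds latency to the caller, which is why throttled workloads look slow rather than failing outright.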

Performance Comparison: The Setup

To compare the performance of Cosmos DB and Azure SQL objectively, two instances were created at roughly the same monthly cost.

Here are the parameters for the CosmosDB:

The monthly cost for this CosmosDB configuration is $446.20

The monthly cost for the Azure SQL configuration is $456.25

Performance Comparison: The Approach

We will compare three operations: bulk insert, single insert, and reading 20 rows by index. The tests for both databases are run from a .NET application. Cosmos DB is accessed through its native SDK, and Azure SQL through Entity Framework.

For test purposes, we will use an object with 25 properties of different types with a primary key and four non-unique indexes. The database size is 1 GB.

For all Cosmos DB inserts, EnableContentResponseOnWrite is set to false to speed up the operations.

Bulk insert

10,000 records will be inserted. For Cosmos DB, AllowBulkExecution is enabled. This option combines multiple operations into a single request, which increases latency but improves throughput. For Azure SQL, Entity Framework Extensions is used to perform the bulk insert.

Individual inserts

10,000 records will be inserted in 1 and 10 threads and 100,000 in 100 threads, emulating workload from different numbers of users simultaneously working with the application.

Select

10 rows will be selected by index each time. 10,000 selects will be run using 1 and 10 threads.
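The multi-threaded runs described in this section can be sketched as a small harness that measures wall-clock time for the whole run and divides it by the number of operations. This is not the .NET code used for the tests. It is a minimal Python sketch, and `run_benchmark` and `fake_insert` are hypothetical names, with `fake_insert` standing in for a real database call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(operation, total_ops, threads):
    """Run `operation` total_ops times on `threads` worker threads.

    Returns (total wall time in ms, average wall time per operation in ms).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() drains the results and re-raises any worker exception.
        list(pool.map(lambda _: operation(), range(total_ops)))
    total_ms = (time.perf_counter() - start) * 1000.0
    return total_ms, total_ms / total_ops


def fake_insert():
    """Hypothetical placeholder for one database round trip (about 1 ms)."""
    time.sleep(0.001)


if __name__ == "__main__":
    for threads in (1, 10):
        total_ms, avg_ms = run_benchmark(fake_insert, 200, threads)
        print(f"{threads:>2} threads: {total_ms:.0f} ms total, {avg_ms:.2f} ms/op")
```

Because the average is derived from wall time, a backend that handles concurrent requests well shows a lower per-operation time as the thread count grows, even though each individual call takes just as long.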

Performance Comparison: The Results

Here are the total times and the average time per operation, both in milliseconds. Both values reflect wall time rather than thread time, so increasing the number of threads should reduce the per-operation time.

Operation                     Cosmos DB       Azure SQL
Bulk insert 10,000            16589 / 1.6     3951 / 0.4
Insert 10,000, 1 thread       57998 / 5.7     28064 / 2.8
Insert 10,000, 10 threads     54361 / 5.4     7154 / 0.7
Insert 100,000, 100 threads   129186 / 12.9   78304 / 7.8
Select 10,000, 1 thread       91371 / 9.1     34810 / 3.4
Select 10,000, 10 threads     39492 / 3.9     16279 / 1.6

For Cosmos DB, we can estimate the minimum time needed for a run. In our setup with 5500 RU/s, a single insert consumes 7.05 RU, which allows about 780 inserts per second. For 10,000 inserts, this translates to a minimum possible time of around 12.8 seconds. The bulk insert took about 30% longer than this minimum (16.6 seconds). Individual inserts took far longer, so RU capacity was not what limited their throughput.
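The estimate above is plain arithmetic and can be reproduced in a few lines. The function names are made up for this sketch, and the numbers are the ones quoted in the text.

```python
def max_ops_per_second(provisioned_ru_per_s, ru_per_op):
    """Upper bound on operations per second allowed by the RU budget."""
    return provisioned_ru_per_s / ru_per_op


def min_time_seconds(provisioned_ru_per_s, ru_per_op, total_ops):
    """Shortest possible run time if every provisioned RU is consumed."""
    return total_ops / max_ops_per_second(provisioned_ru_per_s, ru_per_op)


ops_per_s = max_ops_per_second(5500, 7.05)
min_time_s = min_time_seconds(5500, 7.05, 10_000)

print(f"throughput ceiling: {ops_per_s:.0f} inserts/s")
print(f"minimum time for 10,000 inserts: {min_time_s:.1f} s")
```

Any measured time close to this floor means the run was limited by provisioned throughput. A time well above it points to some other bottleneck, such as per-request latency.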

Interestingly, increasing the number of threads did not improve Cosmos DB insert performance: going from 1 to 10 threads reduced the total time only from 57,998 ms to 54,361 ms. More than half of the provisioned RU/s remained unused, yet this headroom did not translate into any significant speed-up.
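How much of the provisioned throughput the individual inserts actually consumed can be estimated from the measured times. The sketch below assumes that every individual insert costs the same 7.05 RU quoted for the setup. That figure is an assumption for the individual-insert runs, so the resulting percentages are rough.

```python
def ru_utilization(total_ops, elapsed_ms, ru_per_op, provisioned_ru_per_s):
    """Fraction of the provisioned RU/s consumed during a run."""
    ops_per_s = total_ops / (elapsed_ms / 1000.0)
    return ops_per_s * ru_per_op / provisioned_ru_per_s


# Measured total times in ms for 10,000 individual Cosmos DB inserts.
insert_runs_ms = {"1 thread": 57_998, "10 threads": 54_361}

utilization = {
    label: ru_utilization(10_000, ms, 7.05, 5500)
    for label, ms in insert_runs_ms.items()
}

for label, share in utilization.items():
    print(f"{label}: about {share:.0%} of provisioned RU/s used")
```

Under this assumption both runs stay below a quarter of the provisioned capacity.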

Compared to Azure SQL, Cosmos DB is significantly slower for all the operations.
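To put a number on "significantly slower", the ratio of total times can be computed for each test. This is a small sketch over the measured totals reported in the results, and the variable names are chosen for illustration only.

```python
# Total wall times in ms: (Cosmos DB, Azure SQL), taken from the results.
results_ms = {
    "Bulk insert 10,000": (16_589, 3_951),
    "Insert 10,000, 1 thread": (57_998, 28_064),
    "Insert 10,000, 10 threads": (54_361, 7_154),
    "Insert 100,000, 100 threads": (129_186, 78_304),
    "Select 10,000, 1 thread": (91_371, 34_810),
    "Select 10,000, 10 threads": (39_492, 16_279),
}

# Ratio above 1 means Cosmos DB took longer than Azure SQL.
slowdown = {name: cosmos / sql for name, (cosmos, sql) in results_ms.items()}

for name, ratio in slowdown.items():
    print(f"{name}: Cosmos DB took {ratio:.1f}x as long as Azure SQL")
```

The gap is widest for the 10-thread inserts, where Azure SQL benefited from the extra threads and Cosmos DB did not.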

Conclusion: Choosing the Right DBMS for Your Needs

While Azure SQL shows superior performance for the same monetary investment, it's crucial to consider the specific requirements of your application. Cosmos DB, with its multi-region and multi-master features, is a strong choice for applications requiring global distribution and high availability.

Ultimately, the decision should be based on a thorough analysis of the application's requirements, budget, and scalability needs. In scenarios where global distribution and multi-master capabilities are not a priority, Azure SQL may offer a more cost-effective and performance-oriented solution.
