MySQL Cluster 7.4 GA: 200 Million QPS, Active-Active Geographic Replication and more

Highlights

The MySQL team at Oracle are excited to announce the General Availability of MySQL Cluster 7.4, in other words – it’s now ready for production workloads.

This is a release which takes what was already great about MySQL Cluster (real-time performance through memory-optimized tables, linear scale-out with transparent sharding and cross-shard joins, High Availability, and SQL as well as NoSQL interfaces) and makes it even faster, easier to manage and simpler to run across geographies.

Specifically, it includes the following features:

  • Performance
    • 200 Million NoSQL Reads/Sec
    • 2.5 Million SQL Ops/Sec
    • 50% Faster Reads
    • 40% Faster Mixed Read/Write Transactions
  • Active-Active Geographic Replication
    • Active-Active Geographic Redundancy
    • Automated Conflict Detection/Resolution
  • Management
    • 5X Faster Maintenance Ops
    • Detailed Reporting

Performance Enhancements

MySQL Cluster 7.4 builds on the huge performance and scalability improvements delivered in MySQL Cluster 7.3. This release has focussed on performance improvements to help two types of workload:

  • OLTP (On-Line Transaction Processing): Memory-optimized tables provide sub-millisecond latency and extreme levels of concurrency for OLTP workloads while still providing durability; they can also be used alongside disk-based tables (a short sketch follows this list)
  • Ad-hoc Searches: MySQL Cluster has increased the amount of parallelism that can be used when performing a table scan – providing a significant speed-up when performing searches on un-indexed columns. Note that the huge speedup of joins in earlier releases has also made MySQL Cluster much more suitable for running analytics
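
To illustrate the first point, NDB tables are held in data node memory by default, while selected non-indexed columns can instead be stored on disk by placing the table in a tablespace. The following is only a rough sketch: the logfile group, tablespace, file and table names are illustrative and not taken from the benchmarks in this article:

-- In-memory table: all columns live in data node memory (the default for ENGINE=NDB)
mysql> CREATE TABLE account (id INT PRIMARY KEY, balance DECIMAL(10,2)) ENGINE=NDB;

-- Disk-based columns need an undo logfile group and a tablespace to be created first
mysql> CREATE LOGFILE GROUP lg1 ADD UNDOFILE 'undo1.log' INITIAL_SIZE 128M ENGINE=NDB;
mysql> CREATE TABLESPACE ts1 ADD DATAFILE 'data1.dat' USE LOGFILE GROUP lg1 INITIAL_SIZE 256M ENGINE=NDB;
mysql> CREATE TABLE account_history (id INT PRIMARY KEY, notes VARCHAR(200)) TABLESPACE ts1 STORAGE DISK ENGINE=NDB;

Indexed columns (here, the primary keys) are always held in memory, even for a STORAGE DISK table.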

A number of benchmarks have been run to assess:

  • How MySQL Cluster 7.4 performance compares with previous releases
  • How SQL performance scales as more data nodes are added
  • How NoSQL performance scales as more data nodes are added

Benchmarking Performance against earlier releases

Figure 1: MySQL Cluster 7.4, 50% Faster Reads (read performance compared with earlier releases)

The Sysbench benchmark tool has been used to perform an apples-to-apples comparison of how a single data node’s performance increases from MySQL Cluster 7.2, through MySQL Cluster 7.3 and MySQL Cluster 7.4. The tests were performed using a 48 core/96 thread machine (also demonstrating how well MySQL Cluster can now scale up with large numbers of cores).

As can be seen in Figure 1 there is a 1.5X performance improvement over MySQL Cluster 7.3 and an even larger improvement over 7.2.

Note that table scans experience a particularly good speedup.

Figure 2: MySQL Cluster 7.4, 40% Faster Read/Writes (read/write performance compared with earlier releases)

Figure 2 illustrates the same tests but for the Sysbench read/write SQL benchmark. In this case, a 1.4X performance improvement over MySQL Cluster 7.3 is recorded. Again, the improvement over MySQL Cluster 7.2 is even higher.

Benchmarking Scaling SQL Performance

Figure 3: 2.5 Million SQL Read Operations per Second (DBT2 benchmark)

The DBT2 Benchmark has been used to assess how well SQL performance scales as more data nodes are added. As can be seen in Figure 3, the scaling of SQL reads is almost linear, and with 16 data nodes a throughput of 2.5 Million SQL operations per second is achieved. This equates to around 5 Million Transactions Per Minute, or 2.2 Million NewOrder TPM.

This benchmark was performed with each data node running on a dedicated 56 thread Intel E5-2697 v3 (Haswell) machine.

Benchmarking Scaling NoSQL Performance

Figure 4: 200 Million NoSQL Read Operations per Second (flexAsynch benchmark)

The flexAsynch benchmark has been used to measure how NoSQL performance scales as more data nodes are added to the cluster. These tests were performed on the same hardware as the DBT2 benchmark above but scaled out to 32 data nodes (out of a maximum supported 48).

The results are shown in Figure 4 and again it can be observed that the scaling is virtually linear. At 32 data nodes, the throughput hits 200 Million NoSQL Queries Per Second.

Note that the latest results and a more complete description of the tests can be found at the MySQL Cluster Benchmark page.

Full Active-Active Geographic Replication

MySQL Cluster provides Geographic Replication, allowing the same data to be accessed in clusters located in data centers separated by arbitrary distances. This reduces the effects of geographic latency by pushing data closer to the user, as well as providing a capability for geographic redundancy and disaster recovery.

Geographic replication is designed around an Active/Active technology, so if applications are attempting to update the same row on different clusters at the same time, the conflict can be detected and resolved. This ensures that each site can actively serve read and write requests while maintaining data consistency across the clusters. It also eliminates the overhead of having to provision and run passive hardware at remote sites.

When replicating between a single pair of clusters, Active-Active (update anywhere) replication has become significantly simpler and more complete in recent releases, culminating in the MySQL Cluster 7.4 solution, which adds the following advantages:

  • Developers need to make no changes to the application logic or tables
  • Conflict-triggered rollbacks can be made to whole transactions rather than just individual operations
  • Transactions that are dependent on rolled-back transactions can also be rolled back
  • Conflicts involving reads, writes and deletes can all be detected and resolved

These enhancements make it much simpler and safer to deploy globally scaled services across data centers.

MySQL Cluster allows bi-directional replication between two (or more) clusters. Replication within each cluster is synchronous, but between clusters it is asynchronous, which means the following scenario is possible:

Conflict with asynchronous replication:

+---------+----------------+---------+
| Site A  | Replication    | Site B  |
+---------+----------------+---------+
| x == 10 |                | x == 10 |
| x = 11  |                | x = 20  |
|         | --- x=11 --->  | x == 11 |
| x == 20 | <--- x=20 ---  |         |
+---------+----------------+---------+

In this example a value (a column of a row in a table) is set to 11 on site A and the change is queued for replication to site B. In the meantime, an application sets the value to 20 on site B and that change is queued for replication to site A. Once both sites have received and applied the replicated change from the other cluster, site A contains the value 20 while site B contains 11 – in other words, the databases are now inconsistent.

A description of how MySQL Cluster detects and then resolves such conflicts can be found in this article, together with a worked example of configuring and testing the conflict detection and resolution.
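
For orientation, conflict detection and resolution is configured per table through the mysql.ndb_replication control table. The snippet below is only a sketch: the database and table names are illustrative, NDB$EPOCH2_TRANS() is one of several available conflict functions (it provides the transactional rollback behaviour described above), and the binlog_type value should be checked against the reference manual for your configuration:

-- Control table, using the definition documented for MySQL Cluster; create it if it does not already exist
mysql> CREATE TABLE mysql.ndb_replication (
          db VARBINARY(63),
          table_name VARBINARY(63),
          server_id INT UNSIGNED,
          binlog_type INT UNSIGNED,
          conflict_fn VARBINARY(128),
          PRIMARY KEY USING HASH (db, table_name, server_id))
        ENGINE=NDB PARTITION BY KEY(db, table_name);

-- Request transactional conflict detection/rollback for clusterdb.simples on all servers (server_id = 0);
-- binlog_type 7 requests full row images, written as updates
mysql> INSERT INTO mysql.ndb_replication VALUES ('clusterdb', 'simples', 0, 7, 'NDB$EPOCH2_TRANS()');

An optional exceptions table (named <table>$EX) can also be defined so that rows and transactions rolled back because of a conflict are recorded for the application to inspect.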

Faster Maintenance Activities

In typical systems, around 30% of all downtime is attributable to scheduled maintenance activities – for a truly High Availability solution, that downtime must be avoided. MySQL Cluster supports all of the following events as online operations, ensuring the database continues to provide service:

  • Scaling the cluster by adding new nodes
  • Updating the schema with new columns, tables and indexes (an online ALTER TABLE sketch follows this list)
  • Re-sharding of tables across data nodes to allow better data distribution
  • Performing back-up operations
  • Upgrading or patching the underlying hardware and operating system
  • Upgrading or patching MySQL Cluster, with full online upgrades between releases
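
As a minimal sketch of the schema-update case, using a hypothetical table t1 and column names that do not appear elsewhere in this article, columns and indexes can be added in place, i.e. while the table continues to serve reads and writes. NDB requires COLUMN_FORMAT DYNAMIC for a column to be added online; the full list of restrictions is in the reference manual:

-- Add a column online; the new column must use the dynamic column format
mysql> ALTER TABLE t1 ALGORITHM=INPLACE, ADD COLUMN notes VARCHAR(100) COLUMN_FORMAT DYNAMIC;

-- Add an ordered index online on an existing column
mysql> ALTER TABLE t1 ALGORITHM=INPLACE, ADD INDEX ix_created (created_at);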

Many of these operations require one or more nodes in the cluster to be restarted (when multiple nodes are restarted in such an order that service is never lost, it is referred to as a rolling restart) and the time taken for these maintenance activities tends to be dominated by the restarting of data nodes. By increasing the amount of parallelism used (while also guarding against impacting the cluster by consuming too many resources), data node restarts are 5x faster in MySQL Cluster 7.4.
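
The progress of each data node during such a restart can be followed from any MySQL Server in the cluster using the existing ndbinfo.nodes table; output is omitted here since it depends on the state of the cluster:

-- status reports values such as STARTING or STARTED, and start_phase shows how far a restart has progressed
mysql> SELECT node_id, status, start_phase FROM ndbinfo.nodes;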

The benefit of such a significant speedup is twofold:

  • Less time spent by the administrator
  • More activities can be performed within a single maintenance window and full redundancy can be re-established much sooner

Note that the maintenance activities performed by MySQL Cluster Manager also benefit from these optimisations.

Enhanced Reporting

MySQL Cluster presents a lot of monitoring information through the ndbinfo database, and in 7.4 we've added extra information on how memory is used by individual tables and how operations are distributed.
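
Both additions are regular ndbinfo tables and can be queried from any MySQL Server attached to the cluster; on a 7.4 cluster the following should list the two new tables (the output shown is illustrative):

mysql> SHOW TABLES FROM ndbinfo LIKE '%per_fragment';
+------------------------------------+
| Tables_in_ndbinfo (%per_fragment)  |
+------------------------------------+
| memory_per_fragment                |
| operations_per_fragment            |
+------------------------------------+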

Extra Memory Reporting

MySQL Cluster allocates all of the required memory when a data node starts and so any information on memory usage from the operating system is of limited use and provides no clues as to how memory is used within the data nodes – for example, which tables are using the most memory. Also, as this is a distributed database, it is helpful to understand whether a particular table is using a similar amount of memory in each data node (if not then it could be that a better partitioning/sharding key could be used). Finally, when rows are deleted from a table, the memory for those rows would typically remain allocated against that table and so it is helpful to understand how many of these empty slots are available for use by new rows in that table. MySQL Cluster 7.4 introduces a new table – ndbinfo.memory_per_fragment – that provides that information.

For example, to see how much memory is being used by each data node for a particular table…

mysql> CREATE DATABASE clusterdb; USE clusterdb;
mysql> CREATE TABLE simples (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=NDB;
mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1280 |         40 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4256 |        133 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4256 |        133 |
+------+------+-------------+------------+------------+

When you delete rows from a MySQL Cluster table, the memory is not actually freed up and so if you check the existing ndbinfo.memoryusage table you won't see a change. This memory will be reused when you add new rows to that same table. In MySQL Cluster 7.4, it's possible to see how much memory is in that state for a table…

mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1280 |         40 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4256 |        133 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4256 |        133 |
+------+------+-------------+------------+------------+
mysql> DELETE FROM clusterdb.simples LIMIT 1;
mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1312 |         41 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1312 |         41 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4288 |        134 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4288 |        134 |
+------+------+-------------+------------+------------+

As a final example, we can check whether a table is being evenly sharded across the data nodes (in this case a really bad sharding key was chosen)…

mysql> DROP TABLE simples;  -- remove the earlier simples table so that it can be recreated with a sharding key
mysql> CREATE TABLE simples (id INT NOT NULL AUTO_INCREMENT, \
        species VARCHAR(20) DEFAULT "Human", \
        PRIMARY KEY(id, species)) ENGINE=NDB PARTITION BY KEY(species);

-- Add some data

mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |           0 |          0 |          0 |
|    1 |    2 |      196608 |      11732 |        419 |
|    2 |    0 |           0 |          0 |          0 |
|    2 |    2 |      196608 |      11732 |        419 |
|    3 |    1 |           0 |          0 |          0 |
|    3 |    3 |           0 |          0 |          0 |
|    4 |    1 |           0 |          0 |          0 |
|    4 |    3 |           0 |          0 |          0 |
+------+------+-------------+------------+------------+

Extra Operations Reporting

To ensure that resources are being used effectively, it is very helpful to understand how each table is being accessed (how frequently and with what types of operations). To support this, the ndbinfo.operations_per_fragment table is provided. For example, the data in this table would let you identify that a large number of full table scans are being performed on a particular table.

It is also important to identify whether there are any hotspots, where a disproportionate share of the queries for a table are hitting a particular fragment/data node. Again, ndbinfo.operations_per_fragment provides this information.

As an example of how to use some of the data from this table, a simple table is created and populated and then ndbinfo.operations_per_fragment is used to monitor how many Primary Key reads and table scans are performed:



mysql> DROP TABLE simples;  -- recreate the example table with a timestamp column
mysql> CREATE TABLE simples (id INT AUTO_INCREMENT PRIMARY KEY, time TIMESTAMP) ENGINE=NDB;

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node', tot_key_reads AS 'Reads',
 tot_frag_scans AS 'Scans' FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+-------+-------+
| Table                 | Data Node | Reads | Scans |
+-----------------------+-----------+-------+-------+
| clusterdb/def/simples |         3 |     0 |     1 |
| clusterdb/def/simples |         3 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     1 |
+-----------------------+-----------+-------+-------+

mysql> INSERT INTO simples VALUES ();  # Repeated several times
mysql> SELECT * FROM simples;
+----+---------------------+
| id | time                |
+----+---------------------+
|  7 | 2015-01-22 15:12:42 |
|  8 | 2015-01-22 15:12:58 |
+----+---------------------+

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node', tot_key_reads AS 'Reads',
 tot_frag_scans AS 'Scans' FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+-------+-------+
| Table                 | Data Node | Reads | Scans |
+-----------------------+-----------+-------+-------+
| clusterdb/def/simples |         3 |     0 |     2 |
| clusterdb/def/simples |         3 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     2 |
+-----------------------+-----------+-------+-------+


mysql> SELECT * FROM simples WHERE id=11;
+----+---------------------+
| id | time                |
+----+---------------------+
| 11 | 2015-01-22 15:12:59 |
+----+---------------------+

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node', tot_key_reads AS 'Reads',
 tot_frag_scans AS 'Scans' FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+-------+-------+
| Table                 | Data Node | Reads | Scans |
+-----------------------+-----------+-------+-------+
| clusterdb/def/simples |         3 |     0 |     2 |
| clusterdb/def/simples |         3 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     0 |
| clusterdb/def/simples |         4 |     1 |     2 |
+-----------------------+-----------+-------+-------+

Note that there are two rows listed for each data node but only one row for each has non-zero values; this is because each data node holds the primary fragment for one of the partitions and the secondary fragment for the other – all operations are performed only on the active fragments. This is made clearer if the fragment number is included in the query:

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node',
 fragment_num AS 'Fragment', tot_key_reads AS 'Reads', tot_frag_scans AS 'Scans'
 FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+----------+-------+-------+
| Table                 | Data Node | Fragment | Reads | Scans |
+-----------------------+-----------+----------+-------+-------+
| clusterdb/def/simples |         3 |        0 |     0 |     2 |
| clusterdb/def/simples |         3 |        1 |     0 |     0 |
| clusterdb/def/simples |         4 |        0 |     0 |     0 |
| clusterdb/def/simples |         4 |        1 |     1 |     2 |
+-----------------------+-----------+----------+-------+-------+

MySQL Cluster GUI-Based Auto-Installer

The Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments. Developers can spend more time innovating in their code, rather than figuring out how to install, configure and start the database.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and available hardware resources, stepping users through each stage of cluster creation:

  1. Workload Optimized: On launching the browser-based installer, users can specify the throughput, latency and write-load characteristics of their application
  2. Auto-Discovery: The Installer automatically discovers the underlying resources available from the local and remote servers that will make up the Cluster, including CPU architecture, cores and memory.

With these parameters, the installer creates optimized configuration files and starts the cluster.

Automated Tuning and Configuration of MySQL Cluster

[Source: dev.mysql.com]