Google Cloud SQL aims to provide easier MySQL for all

With the general availability of Google Cloud Platform’s latest database offerings — the second generation of Cloud SQL, Cloud Bigtable, and Cloud Datastore — Google is setting up a cloud database strategy founded on a basic truth of software: Don’t get in the customer’s way.

For an example, look no further than the new iteration of Cloud SQL, a hosted version of MySQL for Google Cloud Platform. MySQL is broadly used by cloud applications, and Google is trying to keep it fuss-free — no small feat for any piece of software, let alone a database notorious for the tweaking it needs to work well.

Most of the automation around MySQL in Cloud SQL involves items that should be automated anyway, such as updates, automatic scaling to meet demand, autofailover between zones, and backup/roll-back functionality. This automation all comes via a recent version of MySQL, 5.7, not via an earlier version that’s been heavily customized by Google to support these features.

The other new offerings, Cloud Datastore and Cloud Bigtable, are fully managed incarnations of NoSQL and HBase/Hadoop systems. These systems have fewer users than MySQL but are likely used to store gobs more data than MySQL holds. One of MySQL 5.7’s new features, support for JSON data, provides NoSQL-like functionality for existing MySQL users. But users who are truly serious about NoSQL are likely to do that work on a platform designed to support it from the ground up.
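
As a rough illustration of that JSON support, the sketch below uses the Python connector against a MySQL 5.7 server; the table, column and connection details are invented for the example.

# Illustrative sketch of MySQL 5.7's JSON support; table and credentials are made up.
import mysql.connector

conn = mysql.connector.connect(user="root", password="", database="test")
cur = conn.cursor()

# A JSON column stores schemaless documents alongside relational columns.
cur.execute("CREATE TABLE IF NOT EXISTS events (id INT PRIMARY KEY, payload JSON)")
cur.execute("INSERT INTO events VALUES (%s, %s)",
            (1, '{"user": "alice", "action": "login", "tags": ["web", "mobile"]}'))

# JSON_EXTRACT pulls individual fields back out without a fixed schema.
cur.execute("SELECT JSON_EXTRACT(payload, '$.user') FROM events WHERE id = 1")
print(cur.fetchone())

conn.commit()
cur.close()
conn.close()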

The most obvious competition for Cloud SQL is Amazon’s Aurora service. When reviewed by InfoWorld’s Martin Heller in October 2015, it supported a recent version of MySQL (5.6) and had many of the same self-healing and self-maintaining features as Cloud SQL. Where Google has a potential edge is in the overall simplicity of its platform — a source of pride in other areas, such as a far less sprawling and complex selection of virtual machine types.

Another competitor is Snowflake, the cloud data warehousing solution designed to require little user configuration or maintenance. Snowflake’s main drawback is that it’s a custom-built database, even if it is designed to be highly compatible with SQL conventions. Cloud SQL, by contrast, is simply MySQL, a familiar product with well-understood behaviors.

[Source:- IW]

MySQL zero-day exploit puts some servers at risk of hacking

A zero-day exploit could be used to hack MySQL servers.

A publicly disclosed vulnerability in the MySQL database could allow attackers to completely compromise some servers.

The vulnerability affects “all MySQL servers in default configuration in all version branches (5.7, 5.6, and 5.5) including the latest versions,” as well as the MySQL-derived databases MariaDB and Percona DB, according to Dawid Golunski, the researcher who found it.

The flaw, tracked as CVE-2016-6662, can be exploited to modify the MySQL configuration file (my.cnf) and cause an attacker-controlled library to be executed with root privileges if the MySQL process is started with the mysqld_safe wrapper script.

The exploit can be executed if the attacker has an authenticated connection to the MySQL service, which is common in shared hosting environments, or through an SQL injection flaw, a common type of vulnerability in websites.

Golunski reported the vulnerability to the developers of all three affected database servers, but only MariaDB and Percona DB have received patches so far. Oracle, which develops MySQL, was informed on July 29, according to the researcher, but has yet to fix the flaw.

Oracle releases security updates on a quarterly schedule, and the next batch is expected in October. However, because the MariaDB and Percona patches have been public since the end of August, the researcher decided to release details about the vulnerability on Monday so that MySQL admins can take action to protect their servers.

Golunski’s advisory contains a limited proof-of-concept exploit, but some parts have been intentionally left out to prevent widespread abuse. The researcher also reported a second vulnerability to Oracle, CVE-2016-6663, that could further simplify the attack, but he hasn’t published details about it yet.

The disclosure of CVE-2016-6662 was met with some criticism on specialized discussion forums, where some users argued that it’s actually a privilege escalation vulnerability and not a remote code execution one as described, because an attacker would need some level of access to the database.

“As temporary mitigations, users should ensure that no mysql config files are owned by mysql user, and create root-owned dummy my.cnf files that are not in use,” Golunski said in his advisory. “These are by no means a complete solution and users should apply official vendor patches as soon as they become available.”
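
As a rough sketch of the first suggested mitigation, the Python snippet below checks whether common MySQL configuration files are owned by the mysql user; the paths are assumptions and the check is no substitute for applying vendor patches.

# Rough audit sketch based on the quoted advisory: flag MySQL config files owned
# by the 'mysql' user. Paths are illustrative; adjust for your own system.
import os
import pwd

CONFIG_PATHS = ["/etc/my.cnf", "/etc/mysql/my.cnf", "/var/lib/mysql/my.cnf"]

for path in CONFIG_PATHS:
    if not os.path.exists(path):
        continue
    owner = pwd.getpwuid(os.stat(path).st_uid).pw_name
    if owner == "mysql":
        print("WARNING: %s is owned by the mysql user; consider a root-owned file" % path)
    else:
        print("OK: %s is owned by %s" % (path, owner))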

Oracle didn’t immediately respond to a request for comment on the vulnerability.

 

 

[Source:- IW]

MySQL Cluster 7.3 GA: Increasing Developer Flexibility and Simplicity

Highlights: NoSQL JavaScript with node.js, Foreign Keys, and Auto-Tuned Clustering

The MySQL team at Oracle are excited to announce the General Availability of MySQL Cluster 7.3, ready for production workloads.

Some might call MySQL Cluster 7.3 “the foreign keys release” – and sure enough it is a major engineering achievement to build a distributed database that enforces referential integrity across a shared-nothing cluster, while maintaining ACID compliance and cross-shard JOINs. But MySQL Cluster 7.3 is so much more as well – especially if you are developing new JavaScript-based services with Node.js.

The design focus for MySQL Cluster 7.3 is enabling developer agility – making it simpler and faster than ever to build new services with a highly scalable, fault-tolerant, real-time database. The key enhancements delivered by MySQL Cluster 7.3 are summarized below.

Figure 1: MySQL Cluster 7.3, Faster & Easier Application Development

  • NoSQL JavaScript Connector for Node.js: Enables a single programming language and a single tool-chain by extending JavaScript from the client to the server, all the way through to the database, bypassing the SQL layer to deliver lower latency and reduced development cycles.
  • Foreign Keys: Strengthens data modeling and simplifies application logic by automatically enforcing referential integrity between different tables distributed on different shards, on different nodes … even in different data centers
  • MySQL 5.6 Support: Developers can combine the InnoDB and MySQL Cluster NDB storage engines within a single database, using the very latest MySQL 5.6 release.
  • Connection Thread Scalability: Increases cluster performance and capacity by improving the throughput of each connection to the data nodes, thus reducing the number of connections that need to be provisioned, and enabling greater scale-out headroom. Performance testing is showing up to 7.5x higher throughput per connection, enabling more client threads to use each connection.
  • Auto-Installer: Get it all up and running in minutes! Graphically configure and provision a production-grade cluster, automatically tuned for your workload and environment, without ever resorting to “RTFM”.

Let’s take a closer look at MySQL Cluster 7.3. You can also get started by downloading the MySQL Cluster Evaluation Guide.

MySQL Cluster NoSQL JavaScript Connector for Node.js

Node.js is hot! In a little over 3 years, it has become one of the most popular environments for developing next generation web, cloud, mobile and social applications. Bringing JavaScript from the browser to the server, the design goal of Node.js is to build new real-time applications supporting millions of client connections, serviced by a single CPU core.

Making it simple to further extend the flexibility and power of Node.js to the database layer, the new JavaScript for Node.js Connector is part of MySQL Cluster 7.3.

With its non-blocking, event-driven, asynchronous design, MySQL Cluster is the perfect architectural fit for Node.js when building real-time, distributed services with tens of thousands of concurrent connections. Support for on-line schema changes enables these services to evolve rapidly, without downtime.

Implemented as a module for the V8 engine, the new driver provides Node.js with a native, asynchronous JavaScript connector that can be used to both read and write result sets directly from MySQL Cluster, without transformations to SQL. The benefits are three-fold:

  1. Developers only need to use JavaScript to access the database, enabling rapid development cycles and faster time to market;
  2. The SQL layer is bypassed, delivering lower runtime latency and higher throughput for simple queries.
  3. As the client connects directly to the cluster rather than a MySQL layer, there is no need for any failover handling as this is all handled at the data node layer.

Figure 2: MySQL Cluster NoSQL Connector for Node enables end-to-end JavaScript development

Rather than just presenting a simple interface to the database, the Node.js module integrates MySQL Cluster’s native API library directly within the web application itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, database delivering 99.999% availability.

Developers can re-use JavaScript from the client to the server, all the way through to the database supporting real-time, high-scale services such as:

  • Processing streaming data from digital advertising, real-time bidding and analytics systems;
  • Gaming and social networks, powering the back-end infrastructure for serving mobile devices.

As an added benefit, you can direct the connector to use SQL so that the same API can be used with InnoDB tables.

The JavaScript Connector for Node.js joins a growing portfolio of NoSQL interfaces for MySQL Cluster, which already include Memcached, Java, JPA and HTTP/REST. And of course, developers can still depend on SQL to execute complex queries and access the rich ecosystem of connectors, frameworks, tooling and skills.

Whichever interface is chosen for an application, SQL and NoSQL can be used concurrently across the same data set, providing the ultimate in developer flexibility. Therefore, MySQL Cluster may be supporting multiple applications, each with different query models and access patterns:

  • Relational queries using SQL;
  • Key/Value or Key/Object based web services using Node.js, Memcached or REST/HTTP;
  • Enterprise applications with the ClusterJ and JPA connectors;
  • Ultra low-latency services using the C++ NDB connector.

Learn more by reading the MySQL Cluster with node.js tutorial.

Foreign Keys

Foreign Key (FK) support has been one of the most requested enhancements to MySQL Cluster – bringing powerful new functionality while eliminating development complexity. FKs in MySQL Cluster enable new use-cases including:

  • Packaged applications such as eCommerce and Web Content Management or 3rd party middleware that depend on databases with Foreign Key support;
  • Custom projects requiring Foreign Key constraints to be implemented at the database layer.

Implementation

The definition and behaviour of FKs largely mirrors that of InnoDB, allowing developers to re-use existing MySQL skills in new projects.

FKs are enforced within the MySQL Cluster data nodes, allowing any client API accessing the cluster to benefit from them – whether they are SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA, HTTP/REST or the new JavaScript for Node.js API).

The core referential actions defined in the SQL:2003 standard are implemented:

  • CASCADE
  • RESTRICT
  • NO ACTION
  • SET NULL

In addition, the design supports online adding and dropping of Foreign Keys, enabling the database to continue serving client requests during DDL operations.
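
As a brief, illustrative sketch, the Python snippet below creates two NDB tables linked by a foreign key that uses the CASCADE and RESTRICT actions; the table names and connection details are invented for the example.

# Illustrative sketch: foreign keys on NDB tables use the familiar InnoDB-style
# syntax, and the constraint is enforced in the data nodes for SQL and NoSQL clients alike.
import mysql.connector

conn = mysql.connector.connect(user="root", password="", database="test")
cur = conn.cursor()

cur.execute(
    "CREATE TABLE IF NOT EXISTS customers ("
    "  id INT PRIMARY KEY,"
    "  name VARCHAR(40)"
    ") ENGINE=NDBCLUSTER"
)
cur.execute(
    "CREATE TABLE IF NOT EXISTS orders ("
    "  order_id INT PRIMARY KEY,"
    "  customer_id INT,"
    "  FOREIGN KEY (customer_id) REFERENCES customers(id)"
    "    ON DELETE CASCADE ON UPDATE RESTRICT"
    ") ENGINE=NDBCLUSTER"
)

cur.close()
conn.close()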

Configuration and Getting Started

There is nothing special you have to configure – FK constraint checking is enabled by default.

If you intend to migrate existing tables from another database or storage engine, for example from InnoDB, the DBA should drop FK constraints prior to the import process and then recreate them when complete.

MySQL Workbench can be used to view the relationships and FK constraints between tables, as demonstrated in the figure below. The engineering team are working on the ability to introduce constraints between existing tables within Workbench.

Figure 3: Viewing MySQL Cluster FK Constraints with MySQL Workbench

Learn more by reading this blog for a demonstration of using Foreign Keys with MySQL Cluster.

MySQL 5.6 Support

The SQL layer of MySQL Cluster 7.3 is based on the latest MySQL 5.6 GA release, enabling developers to take advantage of enhanced query throughput and replication robustness.

Enhanced Optimizer for Improved Query Throughput

The MySQL 5.6 Optimizer has been re-factored for better efficiency and performance, and provides an improved feature set for diagnostics. The key MySQL 5.6 optimizer improvements include:

  • Subquery Optimizations: Using semi-JOINs and materialization, the MySQL Optimizer delivers greatly improved subquery performance, simplifying how developers construct queries;
  • Multi-Range Reads: Improves query execution times by returning data more efficiently.
  • Better Optimizer Diagnostics: Enhanced EXPLAIN output and traces for tracking the optimizer decision-making process (a brief sketch follows this list).
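
As a small, illustrative sketch of those diagnostics, the snippet below runs EXPLAIN FORMAT=JSON and captures an optimizer trace through the Python connector; the query, table and connection details are placeholders.

# Sketch of the improved diagnostics available with the MySQL 5.6 server:
# structured EXPLAIN output and the optimizer trace. Query and credentials are placeholders.
import mysql.connector

conn = mysql.connector.connect(user="root", password="", database="test")
cur = conn.cursor()

# Structured EXPLAIN output for a query.
cur.execute("EXPLAIN FORMAT=JSON SELECT * FROM subscribers WHERE sub_no = 72")
print(cur.fetchone()[0])

# Capture the optimizer's decision-making for the next statement.
cur.execute("SET optimizer_trace = 'enabled=on'")
cur.execute("SELECT * FROM subscribers WHERE sub_no = 72")
cur.fetchall()
cur.execute("SELECT trace FROM information_schema.OPTIMIZER_TRACE")
print(cur.fetchone()[0])

cur.close()
conn.close()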

Cross-Data Center and Cross-Database Replication Flexibility

MySQL Cluster uses the MySQL Server’s replication for geographic distribution of clusters, enabling users to:

  • Mirror complete clusters across regions for disaster recovery;
  • Replicate data from the MySQL Cluster NDB storage engine to InnoDB or MyISAM slaves, typically for active archives or report generation.

MySQL 5.6 includes the broadest set of replication enhancements ever delivered in a single release, with key features available to MySQL Cluster 7.3 including:

  • Replication event checksums to detect and prevent corrupt data being replicated between clusters;
  • Crash-safe, transactional replication, providing self-healing recovery in the event of an outage in the replication channel between clusters;

It is worth noting that geographic replication in MySQL Cluster is active/active (multi-master) – so two remote clusters can both service write requests.

For readers new to MySQL Cluster, Multi-Site Clustering can be used as an alternative to geographic replication when splitting a single cluster between data centers – though it should be noted that the datacenters should be connected via high speed, high quality WAN links.

This flexibility in cross-data center replication makes MySQL Cluster very popular for those applications that rely on the geographic distribution of data – for example PayPal’s fraud detection system is deployed on MySQL Cluster, spread across five Amazon Web Services regions.

Mixing and Matching with InnoDB

MySQL’s pluggable storage engine architecture enables developers to configure InnoDB or MySQL Cluster alongside each other in a single application, determined by the attributes and access patterns of each table.

It is not uncommon to find most tables configured to use InnoDB, while those with the highest write loads, lowest latency or strictest availability requirements are configured to use MySQL Cluster.

With support for MySQL 5.6 across both InnoDB and MySQL Cluster, developers can take advantage of the latest MySQL Server, whichever storage engine they are using.
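
A minimal sketch of this mix-and-match approach follows; the tables, engine assignments and connection details are invented for the example.

# Illustrative sketch: two tables in the same schema, one on InnoDB and one on
# MySQL Cluster (NDB), chosen by each table's access pattern.
import mysql.connector

conn = mysql.connector.connect(user="root", password="", database="test")
cur = conn.cursor()

# A mostly-read reference table can stay on InnoDB ...
cur.execute(
    "CREATE TABLE IF NOT EXISTS product_catalog ("
    "  sku VARCHAR(20) PRIMARY KEY,"
    "  description VARCHAR(200)"
    ") ENGINE=InnoDB"
)

# ... while a write-heavy, latency-sensitive table uses the NDB engine.
cur.execute(
    "CREATE TABLE IF NOT EXISTS session_state ("
    "  session_id BIGINT PRIMARY KEY,"
    "  last_seen TIMESTAMP"
    ") ENGINE=NDBCLUSTER"
)

cur.close()
conn.close()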

Connection Thread Scalability

To fully exploit the distributed architecture of MySQL Cluster, users are advised to configure multiple connections between their MySQL Servers or API nodes to the data nodes. This allows MySQL Cluster to execute many more simultaneous operations in parallel.

Each of the connections to the data node layer consumes one of the 256 available node-ids, which in some scenarios can cap the scalability of the cluster.

MySQL Cluster 7.3 increases the throughput of each connection so that fewer connections (and therefore node-ids) are needed to service the same workload. Performance testing shows up to 7.5x higher throughput per connection depending on workload, enabling more client threads to use a single connection.

As a result of Connection Thread Scalability, more nodes can be added to the Cluster to further scale capacity and performance without hitting the 256 node-id limit.

MySQL Cluster GUI-Based Auto-Installer

The Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments. Developers can spend more time innovating in their code, rather than figuring out how to install, configure and start the database.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and available hardware resources, stepping users through each stage of cluster creation:

  1. Workload Optimized: On launching the browser-based installer, users can specify the throughput, latency and write-load characteristics of their application;
  2. Auto-Discovery: The Installer automatically discovers the underlying resources available from the local and remote servers that will make up the Cluster, including CPU architecture, cores and memory.

With these parameters, the installer creates optimized configuration files and starts the cluster.

Figure 4: Automated Tuning and Configuration of MySQL Cluster

The user remains in complete control of the installation:

  • Individual configuration parameters can be modified by the user;
  • The user may override the topology defined by the installer, specifying which hosts run each of the cluster processes;
  • The Cluster can be remotely started and stopped from a single browser window.

Developed by the same engineering team responsible for the development of the MySQL Cluster database, the installer provides standardized configurations that make it simple, quick and easy to build stable and high performance clustered environments.

Figure 5: DevOps maintains complete control over Cluster configuration and deployment

[Source:- Dev.msql]

MySQL Fabric GA – Adding High Availability and/or Scaling to MySQL

Abstract

MySQL Fabric provides a simple way to manage a collection of MySQL Servers and ensure that transactions and queries are routed to the correct server. We’re pleased to announce that MySQL Fabric is now Generally Available and ready for live deployment! This article explains what MySQL Fabric is and how to set it up. It also gives an example of how it can be used to introduce High Availability (including automated failure detection and transparent failover) and/or scale-out (using data partitioning/sharding).

1. Introduction

MySQL is famous for being a very easy to use database and with the InnoDB storage engine it delivers great performance, functionality and reliability.

MySQL/InnoDB now scales up extremely well as you add more cores to the server, and this continues to improve with each release, but at some point a limit is reached where scaling up is no longer enough. It could be that you’re already using the largest available machine, or it’s simply more economical to use multiple commodity servers. Scaling out reads is a simple matter using MySQL Replication – have one master MySQL Server handle all writes and then load balance reads across as many slave MySQL Servers as you need. What happens when that single master fails, though? High Availability (HA) also goes beyond coping with failures – with always-connected mobile apps and global services, the concept of a “maintenance window” where system downtime can be scheduled is a thing of the past for most applications.

It’s traditionally been the job of the application or the DBA to detect the failure and promote one of the slaves to be the new master. Making this whole system Highly Available can become quite complex and diverts development and operational resources away from higher-value, revenue generating tasks.

While MySQL Replication provides the mechanism to scale out reads, a single server must handle all of the writes, and as modern applications become more and more interactive the proportion of writes keeps increasing. The ubiquity of social media means that the age of the publish-once, read-a-billion-times web site is over. Add to this the promise offered by Cloud platforms – massive, elastic scaling out of the underlying infrastructure – and you get a huge demand for scaling out to dozens, hundreds or even thousands of servers.

The most common way to scale out is by sharding the data between multiple MySQL Servers; this can be done vertically (each server holding a discrete subset of the tables – say those for a specific set of features) or horizontally where each server holds a subset of the rows for a given table. While effective, sharding has required developers and DBAs to invest a lot of effort in building and maintaining complex logic at the application and management layers – once more, detracting from higher value activities.

The introduction of MySQL Fabric makes all of this far simpler. MySQL Fabric is designed to manage pools of MySQL Servers – whether just a pair for High Availability or many thousands to cope with scaling out a huge web application.

For High Availability, MySQL Fabric manages the replication relationships, detects the failure of the master and automatically promotes one of the slaves to be the new master. This is all completely transparent to the application.

For scaling, MySQL Fabric automates sharding with the connectors routing requests to the server (or servers if also using MySQL Fabric for High Availability) based on a sharding key provided by the application. If one shard gets too big then MySQL Fabric can split the shard while ensuring that requests continue to be delivered to the correct location.

MySQL Fabric provides a simple and effective option for High Availability as well as the option of massive, incremental scale-out. It does this without sacrificing the robustness of MySQL and InnoDB, without requiring an application rewrite, and without needing your DevOps teams to move to unfamiliar technologies or abandon their favorite tools.

This article goes into MySQL Fabric’s capabilities in more depth and then goes on to provide a worked example of using it – initially to provide High Availability and then adding sharding.

2. What MySQL Fabric Provides

MySQL Fabric is built around an extensible framework for managing farms of MySQL Servers. Currently two features have been implemented – High Availability and scaling out using data sharding. These features can be used in isolation or in combination.

Both features are implemented in two layers:

  • The mysqlfabric process which processes any management requests – whether received through the mysqlfabric command-line-interface (documented in the reference manual) or from another process via the supplied XML/RPC interface. When using the HA feature, this process can also be made responsible for monitoring the master server and initiating failover to promote a slave to be the new master should it fail. The state of the server farm is held in the state store (a MySQL database) and the mysqlfabric process is responsible for providing the stored routing information to the connectors.
  • MySQL Connectors are used by the application code to access the database(s), converting instructions from a specific programming language to the MySQL wire protocol, which is used to communicate with the MySQL Server processes. A ‘Fabric-aware’ connector stores a cache of the routing information that it has received from the mysqlfabric process and then uses that information to send transactions or queries to the correct MySQL Server. Currently the three supported Fabric-aware MySQL connectors are for PHP, Python and Java (and in turn the Doctrine and Hibernate Object-Relational Mapping frameworks). This approach means that the latency and potential bottleneck of sending all requests via a proxy can be avoided.

2.1 High Availability

High Availability (HA) refers to the ability for a system to provide continuous service – a system is available while that service can be utilized. The level of availability is often expressed in terms of the “number of nines” – for example, a HA level of 99.999% means that the service can be used for 99.999% of the time, in other words, on average, the service is only unavailable for 5.25 minutes per year (and that includes all scheduled as well as unscheduled down-time).

2.1.1 Different Points of High Availability

Figure 1 shows the different layers in the system that need to be available for service to be provided.

Figure 1: Layered High Availability

At the bottom is the data that the service relies on. Obviously, if that data is lost then the service cannot function correctly and so it’s important to make sure that there is at least one extra copy of that data. This data can be duplicated at the storage layer itself but with MySQL it’s most commonly replicated by the layer above – the MySQL Server using MySQL Replication. The MySQL Server provides access to the data – there is no point in the data being there if you can’t get at it! It is a common misconception that having redundancy at these two levels is enough to have a HA system but it is also necessary to look at the system from the top-down.

To have a HA service, there needs to be redundancy at the application layer; in itself this is very straight-forward, just load balance all of the service requests over a pool of application servers which are all running the same application logic. If the service were something as simple as a random number generator then this would be fine but most useful applications need to access data and as soon as you move beyond a single database server (for example because it needs to be HA) then a way is needed to connect the application server to the correct data source. In a HA system, the routing isn’t a static function, if one database server should fail (or be taken down for maintenance) the application should be directed instead to an alternate database. Some HA systems implement this routing function by introducing a proxy process between the application and the database servers; others use a virtual IP address which can be migrated to the correct server. When using MySQL Fabric, this routing function is implemented within the Fabric-aware MySQL connector library that’s used by the application server processes.

2.1.2 What MySQL Fabric Adds in Terms of High Availability

MySQL Fabric has the concept of a HA group which is a pool of two or more MySQL Servers; at any point in time, one of those servers is the Primary (MySQL Replication master) and the others are Secondaries (MySQL Replication slaves). The role of a HA group is to ensure that access to the data held within that group is always available.

Figure 2: MySQL Fabric Implementing HA

While MySQL Replication allows the data to be made safe by duplicating it, for a HA solution two extra components are needed and MySQL Fabric provides these:

  • Failure detection and promotion – the MySQL Fabric process monitors the Primary within the HA group and should that server fail then it selects one of the Secondaries and promotes it to be the Primary (with all of the other slaves in the HA group then receiving updates from the new master). Note that the connectors can inform MySQL Fabric when they observe a problem with the Primary and the MySQL Fabric process uses that information as part of its decision making process surrounding the state of the servers in the farm.
  • Routing of database requests – When MySQL Fabric promotes the new Primary, it updates the state store and notifies the connectors so that they can refresh their caches with the updated routing information. In this way, the application does not need to be aware that the topology has changed and that writes need to be sent to a different destination.

3. Scaling Out – Sharding

When nearing the capacity or write performance limit of a single MySQL Server (or HA group), MySQL Fabric can be used to scale-out the database servers by partitioning the data across multiple MySQL Server “groups”. Note that a group could contain a single MySQL Server or it could be a HA group.

Figure 3: MySQL Fabric Implementing HA & Sharding

The administrator defines how data should be partitioned/sharded between these servers; this is done by creating shard mappings. A shard mapping applies to a set of tables and for each table the administrator specifies which column from those tables should be used as a shard key (the shard key will subsequently be used by MySQL Fabric to calculate which shard a specific row from one of those tables should be part of). Because all of these tables use the same shard key and mapping, the use of the same column value in those tables will result in those rows being in the same shard – allowing a single transaction to access all of them. For example, if using the subscriber-id column from multiple tables then all of the data for a specific subscriber will be in the same shard. The administrator then defines how that shard key should be used to calculate the shard number:

  • HASH – A hash function is run on the shard key to generate the shard number. If values held in the column used as the sharding key don’t tend to have too many repeated values then this should result in an even partitioning of rows across the shards.
  • RANGE – The administrator defines an explicit mapping between ranges of values for the sharding key and shards. This gives maximum control to the user of how data is partitioned and which rows should be co-located.

When the application needs to access the sharded database, it sets a property for the connection that specifies the sharding key – the Fabric-aware connector will then apply the correct range or hash mapping and route the transaction to the correct shard.
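
The toy Python sketch below illustrates the general idea of RANGE and HASH mappings. It is not MySQL Fabric's internal algorithm; the group names and range boundaries are invented for the example.

# Toy illustration (not MySQL Fabric's actual implementation) of how a shard key
# maps to an HA group under the two mapping types.
import hashlib

RANGE_MAP = [(0, "group_id-1"), (10000, "group_id-2")]   # lower bound -> HA group
HASH_GROUPS = ["group_id-1", "group_id-2"]

def route_by_range(shard_key):
    # Pick the group whose lower bound is the largest one not exceeding the key.
    chosen = RANGE_MAP[0][1]
    for lower_bound, group in RANGE_MAP:
        if shard_key >= lower_bound:
            chosen = group
    return chosen

def route_by_hash(shard_key):
    # A hash of the key spreads rows roughly evenly across the groups.
    digest = int(hashlib.md5(str(shard_key).encode()).hexdigest(), 16)
    return HASH_GROUPS[digest % len(HASH_GROUPS)]

print(route_by_range(72))      # -> group_id-1
print(route_by_range(15000))   # -> group_id-2
print(route_by_hash(72))       # one of the two groups, chosen by the hash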

If further shards/groups are needed then MySQL Fabric can split an existing shard into two and then update the state-store and the caches of routing data held by the connectors. Similarly, a shard can be moved from one HA group to another.

Note that a single transaction or query can only access a single shard and so it is important to select shard keys based on an understanding of the data and the application’s access patterns. It doesn’t always make sense to shard all tables as some may be relatively small and having their full contents available in each group can be beneficial given the rule about no cross-shard queries. These global tables are written to a ‘global group’ and any additions or changes to data in those tables are automatically replicated to all of the other groups. Schema changes are also made to the global group and replicated to all of the others to ensure consistency.

To get the best mapping, it may also be necessary to modify the schema if there isn’t already a ‘natural choice’ for the sharding keys.

4. Current Limitations

The initial version of MySQL Fabric is designed to be simple, robust and able to scale to thousands of MySQL Servers. This approach means that this version has a number of limitations, which are described here:

  • Sharding is not completely transparent to the application. While the application need not be aware of which server stores a set of rows and it doesn’t need to be concerned when that data is moved, it does need to provide the sharding key when accessing the database.
  • Auto-increment columns cannot be used as a sharding key
  • All transactions and queries need to be limited in scope to the rows held in a single shard, together with the global (non-sharded) tables. For example, Joins involving multiple shards are not supported
  • Because the connectors perform the routing function, the extra latency involved in proxy-based solutions is avoided but it does mean that Fabric-aware connectors are required – at the time of writing these exist for PHP, Python and Java
  • The MySQL Fabric process itself is not fault-tolerant and must be restarted in the event of it failing. Note that this does not represent a single-point-of-failure for the server farm (HA and/or sharding) as the connectors are able to continue routing operations using their local caches while the MySQL Fabric process is unavailable

5. Architected for Extensibility

MySQL Fabric has been architected for extensibility at a number of levels. For example, in the first release the only option for implementing HA is based on MySQL Replication but in future releases we hope to add further options (for example, MySQL Cluster). We also hope to see completely new applications around the managing of farms of MySQL Servers – both from Oracle and the wider MySQL community.

Figure 4 illustrates how new applications and protocols can be added using the pluggable framework.

Figure 4: MySQL Fabric’s Extensible Architecture

6. Examples

This section focuses on how to actually use MySQL Fabric – initially to provide High Availability and then to augment that by scaling out using sharding. The focus will be on some of the management tasks as well as what changes are needed in the application (using Python code examples). For brevity, this article doesn’t provide a full walkthrough; a complete end-to-end walkthrough (including configuring and running all of the MySQL Servers) can be found in the MySQL Fabric – adding High Availability and Scaling to MySQL article.

6.1 Adding High Availability

The intent of this section is to introduce MySQL Fabric to add High Availability to MySQL. Figure 5 illustrates the configuration that will be created.

Figure 5: Single HA Group

There will be a single HA group that has the name group_id-1 that will contain three MySQL Servers – each running on a different machine (fab2, fab3 and fab4) – and at any point in time, one of those MySQL Servers will be the Primary (master) and the others will be Secondaries. Should the Primary fail, one of the Secondaries would be automatically promoted by the MySQL Fabric process to be the new Primary.

The MySQL Fabric process itself will run on a fourth machine (fab1) together with its state store (another MySQL Server) and the test application which uses the Fabric-aware Python connector.

The assumption is made that the state store (a MySQL Server) is already up and running and that the fabric user has been created.

The fabric schema within the state store can now be created:


[[email protected] ~]# mysqlfabric manage setup --param=storage.user=fabric
[INFO] 1399476439.536728 - MainThread - Initializing persister: user \
        (fabric), server (localhost:3306), database (fabric).
[INFO] 1399476451.330008 - MainThread - Initial password for admin/xmlrpc \
        set
Password set for admin/xmlrpc from configuration file.
[INFO] 1399476451.333563 - MainThread - Password set for admin/xmlrpc \
        from configuration file.

The MySQL Fabric process can now be started (note that the Fabric process can be run as a daemon by adding the --daemonize option):


[[email protected] ~]$ mysqlfabric manage start

Initially, there will be a single HA group (this is all that is required for HA – later, additional groups will be added to enable scaling out through partitioning/sharding of the data):


[[email protected] ~]$ mysqlfabric group create group_id-1
Procedure :
{ uuid        = 7e0c90ec-f81f-4ff6-80d3-ae4a8e533979,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

The MySQL Servers that will form the farm can now be added to the HA group.


[[email protected] ~]$ mysqlfabric group add group_id-1 \
    192.168.56.102:3306
Procedure :
{ uuid        = 073f421a-9559-4413-98fd-b839131ea026,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group add group_id-1 \
        192.168.56.103:3306
Procedure :
{ uuid        = b0f5b04a-27e6-46ce-adff-bf1c046829f7,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group add group_id-1 \
        192.168.56.104:3306
Procedure :
{ uuid        = 520d1a7d-1824-4678-bbe4-002d0bae5aaa,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

Note that all of the MySQL Servers are currently Secondaries (in other words, none of them is acting as the MySQL Replication master). The next step is to promote one of the servers to be the Primary; the uuid of a specific server to promote can be supplied, but it isn’t required – if it’s omitted, as in this example, MySQL Fabric will select one.


[[email protected] ~]$ mysqlfabric group promote group_id-1
Procedure :
{ uuid        = c875371b-890c-49ff-b0a5-6bbc38be7097,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] myfab]$ mysqlfabric group lookup_servers group_id-1
Command :
{ success    = True
  return      = [
        {'status': 'PRIMARY', 'server_uuid': '00f9831f-d602-11e3-b65e-0800271119cb', \
                'mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.56.104:3306'}, \
        {'status': 'SECONDARY', 'server_uuid': 'f6fe224e-d601-11e3-b65d-0800275185c2', \
                'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.56.102:3306'}, \
        {'status': 'SECONDARY', 'server_uuid': 'fbb5c440-d601-11e3-b65d-0800278bafa8', \
                'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.56.103:3306'}]
  activities  =
}

At this stage, the MySQL replication relationship is configured and running but there isn’t yet High Availability as MySQL Fabric is not monitoring the state of the servers – the final configuration step fixes that:


[[email protected] ~]$ mysqlfabric group activate group_id-1
Procedure :
{ uuid        = 40a5e023-06ba-4e1e-93de-4d4195f87851,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

Everything is now set up to detect if the Primary (master) should fail and in the event that it does, promote one of the Secondaries to be the new Primary. If using one of the MySQL Fabric-aware connectors (initially PHP, Python and Java) then that failover can be transparent to the application.

The code that follows shows how an application can access this new HA group – in this case, using the Python connector. First an application table is created:


[[email protected] ~]$ cat setup_table_ha.py
import mysql.connector
from mysql.connector import fabric

conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274, "username": "admin", \
        "password" : "admin"},
    user="root", database="test", password="",
    autocommit=True
)

conn.set_property(mode=fabric.MODE_READWRITE, group="group_id-1")
cur = conn.cursor()
cur.execute(
"CREATE TABLE IF NOT EXISTS subscribers ("
"  sub_no INT, "
"  first_name CHAR(40), "
"  last_name CHAR(40)"
")"
)

Note the following about that code sample:

  • The connector is provided with the address for the MySQL Fabric process localhost:32274 rather than any of the MySQL Servers
  • The mode property for the connection is set to fabric.MODE_READWRITE, which the connector will interpret as meaning that the transaction should be sent to the Primary (as that’s where all writes must be executed so that they can be replicated to the Secondaries)
  • The group property is set to group_id-1 which is the name that was given to the single HA Group

This code can now be executed and then a check made on one of the Secondaries that the table creation has indeed been replicated from the Primary.


[[email protected] myfab]$ python setup_table_ha.py
[[email protected] myfab]$ mysql -h 192.168.56.103 -P3306 -u root -e "use test;show tables;"
+----------------+
| Tables_in_test |
+----------------+
| subscribers    |
+----------------+

The next step is to add some rows to the table:


[[email protected] ~]$ cat add_subs_ha.py
import mysql.connector
from mysql.connector import fabric

def add_subscriber(conn, sub_no, first_name, last_name):
    conn.set_property(group="group_id-1", mode=fabric.MODE_READWRITE)
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO subscribers VALUES (%s, %s, %s)",
        (sub_no, first_name, last_name)
        )

conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274, "username": "admin", \
         "password" : "admin"},
    user="root", database="test", password="",
    autocommit=True
    )

conn.set_property(group="group_id-1", mode=fabric.MODE_READWRITE)

add_subscriber(conn, 72, "Billy", "Fish")
add_subscriber(conn, 500, "Billy", "Joel")
add_subscriber(conn, 1500, "Arthur", "Askey")
add_subscriber(conn, 5000, "Billy", "Fish")
add_subscriber(conn, 15000, "Jimmy", "White")
add_subscriber(conn, 17542, "Bobby", "Ball")
[[email protected] ~]$ python add_subs_ha.py
[[email protected] myfab]$ mysql -h 192.168.56.103 -P3306 -u root \
    -e "select * from test.subscribers"
+--------+------------+-----------+
| sub_no | first_name | last_name |
+--------+------------+-----------+
|     72 | Billy      | Fish      |
|    500 | Billy      | Joel      |
|   1500 | Arthur     | Askey     |
|   5000 | Billy      | Fish      |
|  15000 | Jimmy      | White     |
|  17542 | Bobby      | Ball      |
+--------+------------+-----------+

And then the data can be retrieved (note that the mode parameter for the connection is set to fabric.MODE_READONLY and so the connector knows that it can load balance the requests across any MySQL Servers in the HA Group).


[email protected] ~]$ cat read_table_ha.py
import mysql.connector
from mysql.connector import fabric

def find_subscriber(conn, sub_no):
    conn.set_property(group="group_id-1", mode=fabric.MODE_READONLY)
    cur = conn.cursor()
    cur.execute(
        "SELECT first_name, last_name FROM subscribers "
        "WHERE sub_no = %s", (sub_no, )
        )
    for row in cur:
        print row

conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274, "username": "admin", \
        "password" : "admin"},
    user="root", database="test", password="",
    autocommit=True
    )

find_subscriber(conn, 72)
find_subscriber(conn, 500)
find_subscriber(conn, 1500)
find_subscriber(conn, 5000)
find_subscriber(conn, 15000)
find_subscriber(conn, 17542)

[[email protected] ~]$ python read_table_ha.py
(u'Billy', u'Fish')
(u'Billy', u'Joel')
(u'Arthur', u'Askey')
(u'Billy', u'Fish')
(u'Jimmy', u'White')
(u'Bobby', u'Ball')

Note that if the Secondary servers don’t all have the same performance then you can skew the ratio for how many reads are sent to each one using the mysqlfabric server set_weight command – specifying a value between 0 and 1 (default is 1 for all servers). Additionally, the mysqlfabric server set_mode command can be used to specify if the Primary should receive some of the reads (READ_WRITE) or only writes (WRITE_ONLY).

The next section describes how this configuration can be extended to add scalability by sharding the table data (and it can be skipped if that isn’t needed).

6.2 Adding Scale-Out with Sharding

The example in this section builds upon the previous one by adding more servers in order to scale out the capacity and read/write performance of the database. The first step is to create a new group (which is named global-group in this example) – the Global Group is a special HA group that performs two critical functions:

  • Any data schema changes are applied to the Global Group and from there they will be replicated to each of the other HA Groups
  • If there are tables that contain data that should be replicated to all HA groups (rather than sharded) then any inserts, updates or deletes will be made on the Global Group and then replicated to the others. Those tables are referred to as global tables.

Figure 6 illustrates what the configuration will look like once the Global Group has been created.

Figure 6: Addition of Global Group

The Global Group is defined and populated with MySQL Servers and then a Primary is promoted in the following steps:


[[email protected]]$ mysqlfabric group create global-group
Procedure :
{ uuid        = 5f07e324-ec0a-42b4-98d0-46112f607143,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

[[email protected] ~]$ mysqlfabric group add global-group \
    192.168.56.102:3316
Procedure :
{ uuid        = ccf699f5-ba2c-4400-a8a6-f951e10d4315,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group add global-group \
    192.168.56.102:3317
Procedure :
{ uuid        = 7c476dda-3985-442a-b94d-4b9e650e5dfe,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group add global-group \
    192.168.56.102:3318
Procedure :
{ uuid        = 476fadd4-ca4f-49b3-a633-25dbe0ffdd11,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

[[email protected] ~]$ mysqlfabric group promote global-group
Procedure :
{ uuid        = e818708e-6e5e-4b90-aff1-79b0b2492c75,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group lookup_servers global-group
Command :
{ success    = True
  return      = [
        {'status': 'PRIMARY', 'server_uuid': '56a08135-d60b-11e3-b69a-0800275185c2',\
                'mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.56.102:3316'}, \
        {'status': 'SECONDARY', 'server_uuid': '5d5f5cf6-d60b-11e3-b69b-0800275185c2', \
                'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.56.102:3317'}, \
        {'status': 'SECONDARY', 'server_uuid': '630616f4-d60b-11e3-b69b-0800275185c2', \
                'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.56.102:3318'}]
  activities  =
}

As an application table has already been created within the original HA group, that will need to be copied to the new Global Group:


[email protected] myfab]$ mysqldump -d -u root --single-transaction -h 192.168.56.102 -P3306 \
--all-databases > my-schema.sql
[[email protected] myfab]$ mysql -h 192.168.56.102 -P3317 -u root -e 'reset master'
[[email protected] myfab]$ mysql -h 192.168.56.102 -P3317 -u root < my-schema.sql

A shard mapping is an entity that is used to define how certain tables should be sharded between a set of HA groups. It is possible to have multiple shard mappings but in this example, only one will be used. When defining the shard mapping, there are two key parameters:

  • The type of mapping – can be either HASH or RANGE
  • The global group that will be used

The commands that follow define the mapping and identify the index number assigned to this mapping (in this example, 1):


[[email protected] ~]$ mysqlfabric sharding create_definition HASH global-group
Procedure :
{ uuid        = 78ea7209-b073-4d03-9d8b-bda92cc76f32,
  finished    = True,
  success    = True,
  return      = 1,
  activities  =
}

[[email protected]]$ mysql -h 127.0.0.1 -P3306 -u root -e 'select * from fabric.shard_maps'

+------------------+-----------+--------------+
| shard_mapping_id | type_name | global_group |
+------------------+-----------+--------------+
|                1 | HASH      | global-group |
+------------------+-----------+--------------+

The next step is to define which columns from which tables should be used as the sharding key (the value on which the HASH function is executed or which is compared with the defined RANGEs). In this example, only one table is being sharded (the subscribers table, with the sub_no column being used as the sharding key) but the command can simply be re-executed for further tables. Note that the identifier for the shard mapping (1) is passed on the command-line:


[[email protected] ~]$ mysqlfabric sharding add_table 1 test.subscribers sub_no
Procedure :
{ uuid        = 446aadd1-ffa6-4d19-8d52-4683f3d7c998,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

At this point, the shard mapping has been defined but no shards have been created, and so the next step is to create a single shard, which will be stored in the existing HA group (group_id-1):


[[email protected] ~]$ mysqlfabric sharding add_shard 1 group_id-1 --state=enabled
Procedure :
{ uuid        = 4efc038c-cd18-448d-be32-ca797c4c006f,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

[[email protected]]$ mysql -h 127.0.0.1 -P3306 -u root \
    -e 'select * from fabric.shards'
+----------+------------+---------+
| shard_id | group_id  | state  |
+----------+------------+---------+
|        1 | group_id-1 | ENABLED |
+----------+------------+---------+

At this point, the database has technically been sharded but of course it offers no scalability as there is only a single shard. The steps that follow evolve that configuration into one containing two shards as shown in Figure 7.

Figure 7: HA MySQL Fabric Server Farm

Another HA group (group_id-2) is created, the three new servers added to it and then one of the servers is promoted to be the Primary:


[[email protected] ~]$ mysqlfabric group add group_id-2 192.168.56.105:3306
Procedure :
{ uuid        = fe679280-81ed-436c-9b7f-3d6f46987492,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group add group_id-2 192.168.56.106:3306
Procedure :
{ uuid        = 6fcf7e0c-c092-4d81-9898-448abf2b113c,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group add group_id-2 192.168.56.107:3306
Procedure :
{ uuid        = 8e9d4fbb-58ef-470d-81eb-8d92813427ae,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}
[[email protected] ~]$ mysqlfabric group promote group_id-2
Procedure :
{ uuid        = 21569d7f-93a3-4bdc-b22b-2125e9b75fca,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

At this point, the new HA group exists but is missing the application schema and data. Before allocating a shard to the group, a reset master needs to be executed on the Primary for the group (this is required because changes have already been made on that server – if nothing else, to grant permissions for one or more users to connect remotely). The mysqlfabric group lookup_servers command is used to first check which of the three servers is currently the Primary.


[[email protected] ~]$ mysqlfabric group lookup_servers group_id-2
Command :
{ success    = True
  return      = [
        {'status': 'PRIMARY', 'server_uuid': '10b086b5-d617-11e3-b6e7-08002767aedd', \
                'mode': 'READ_WRITE', 'weight': 1.0, 'address': '192.168.56.105:3306'}, \
        {'status': 'SECONDARY', 'server_uuid': '5dc81563-d617-11e3-b6e9-08002717142f', \
                'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.56.106:3306'}, \
        {'status': 'SECONDARY', 'server_uuid': '83cae7b2-d617-11e3-b6ea-08002763b127', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': '192.168.56.107:3306'}]
  activities  =
}

[[email protected] myfab]$ mysql -h 192.168.56.105 -P3306 -uroot -e 'reset master'

The next step is to split the existing shard, specifying the shard id (in this case 1) and the name of the HA group where the new shard will be stored:


[[email protected] ~]$ mysqlfabric sharding split_shard 1 group_id-2
Procedure :
{ uuid        = 4c559f6c-0b08-4a57-b095-364755636b7b,
  finished    = True,
  success    = True,
  return      = True,
  activities  =
}

Before looking at the application code changes that are needed to cope with the sharded data, a simple test can be run to confirm that the table’s existing data has indeed been split between the two shards:


[[email protected]]$ mysql -h 192.168.56.102 -P3306 -uroot -e 'select * from test.subscribers'
+--------+------------+-----------+
| sub_no | first_name | last_name |
+--------+------------+-----------+
|    500 | Billy      | Joel      |
|   1500 | Arthur     | Askey     |
|   5000 | Billy      | Fish      |
|  17542 | Bobby      | Ball      |
+--------+------------+-----------+

[[email protected]]$ mysql -h 192.168.56.107 -P3306 -uroot -e 'select * from test.subscribers'
+--------+------------+-----------+
| sub_no | first_name | last_name |
+--------+------------+-----------+
|     72 | Billy      | Fish      |
|  15000 | Jimmy      | White     |
+--------+------------+-----------+

The next example Python code adds some new rows to the subscribers table. Note that the tables property for the connection is set to test.subscribers and the key to the value of the sub_no column for that table – this is enough information for the Fabric-aware connector to choose the correct shard/HA group and then the fact that the mode property is set to fabric.MODE_READWRITE further tells the connector that the transaction should be sent to the Primary within that HA group.


[[email protected] myfab]$ cat add_subs_shards2.py
import mysql.connector
from mysql.connector import fabric

def add_subscriber(conn, sub_no, first_name, last_name):
    conn.set_property(tables=["test.subscribers"], key=sub_no, \
        mode=fabric.MODE_READWRITE)
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO subscribers VALUES (%s, %s, %s)",
        (sub_no, first_name, last_name)
        )

conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274, "username": "admin", \
        "password" : "admin"},
    user="root", database="test", password="",
    autocommit=True
)

conn.set_property(tables=["test.subscribers"], scope=fabric.SCOPE_LOCAL)

add_subscriber(conn, 22, "Billy", "Bob")
add_subscriber(conn, 8372, "Banana", "Man")
add_subscriber(conn, 93846, "Bill", "Ben")
add_subscriber(conn, 5006, "Andy", "Pandy")
add_subscriber(conn, 15050, "John", "Smith")
add_subscriber(conn, 83467, "Tommy", "Cannon")

$ python add_subs_shards2.py

The mysql client can then be used to confirm that the new data has also been partitioned between the two shards/HA groups.


$ mysql -h 192.168.56.103 -P3306 -uroot -e 'select * from test.subscribers'
+--------+------------+-----------+
| sub_no | first_name | last_name |
+--------+------------+-----------+
|    500 | Billy      | Joel      |
|   1500 | Arthur     | Askey     |
|   5000 | Billy      | Fish      |
|  17542 | Bobby      | Ball      |
|     22 | Billy      | Bob       |
|   8372 | Banana     | Man       |
|  93846 | Bill       | Ben       |
|  15050 | John       | Smith     |
+--------+------------+-----------+

$ mysql -h 192.168.56.107 -P3306 -uroot -e 'select * from test.subscribers'
+--------+------------+-----------+
| sub_no | first_name | last_name |
+--------+------------+-----------+
|     72 | Billy      | Fish      |
|  15000 | Jimmy      | White     |
|   5006 | Andy       | Pandy     |
|  83467 | Tommy      | Cannon    |
+--------+------------+-----------+

The final example application code reads the row for each of the records that have been added. The key thing to note here is that the mode property for the connection has been set to fabric.MODE_READONLY, so the Fabric-aware Python connector knows that it can load balance requests over the Secondaries within the HA groups rather than sending everything to the Primary.


$ cat read_table_shards2.py
import mysql.connector
from mysql.connector import fabric

def find_subscriber(conn, sub_no):
    conn.set_property(tables=["test.subscribers"], key=sub_no, mode=fabric.MODE_READONLY)
    cur = conn.cursor()
    cur.execute(
        "SELECT first_name, last_name FROM subscribers "
        "WHERE sub_no = %s", (sub_no, )
        )
    for row in cur:
        print row

conn = mysql.connector.connect(
    fabric={"host" : "localhost", "port" : 32274, "username": "admin", "password" : "admin"},
    user="root", database="test", password="",
    autocommit=True
    )

find_subscriber(conn, 22)
find_subscriber(conn, 72)
find_subscriber(conn, 500)
find_subscriber(conn, 1500)
find_subscriber(conn, 8372)
find_subscriber(conn, 5000)
find_subscriber(conn, 5006)
find_subscriber(conn, 93846)
find_subscriber(conn, 15000)
find_subscriber(conn, 15050)
find_subscriber(conn, 17542)
find_subscriber(conn, 83467)


$ python read_table_shards2.py
(u'Billy', u'Bob')
(u'Billy', u'Fish')
(u'Billy', u'Joel')
(u'Arthur', u'Askey')
(u'Banana', u'Man')
(u'Billy', u'Fish')
(u'Andy', u'Pandy')
(u'Bill', u'Ben')
(u'Jimmy', u'White')
(u'John', u'Smith')
(u'Bobby', u'Ball')
(u'Tommy', u'Cannon')

7. Conclusion

MySQL Fabric is an extensible framework for managing farms of MySQL Servers and enabling MySQL Connectors to get transactions and queries to the most appropriate server while hiding the topology of the server farm from the application.

The intent is that developers can focus on high value activities such as adding new features to their applications rather than spending time on the platform plumbing – that can now be handled by MySQL Fabric.

The first applications supported by MySQL Fabric are High Availability (built on top of MySQL Replication) and sharding-based scale-out. Over time we hope to add new options to these applications (for example, alternate HA technologies) as well as completely new applications. We look forward to hearing what users would like us to add as well as what they build for themselves.

 

[Source:- Dev.msql]

MySQL Cluster 7.4 GA: 200 Million QPS, Active-Active Geographic Replication and more

Highlights

The MySQL team at Oracle are excited to announce the General Availability of MySQL Cluster 7.4, in other words – it’s now ready for production workloads.

This is a release which takes what was already great about MySQL Cluster (real-time performance through memory-optimized tables, linear scale-out with transparent sharding and cross-shard joins, High Availability and SQL as well as NoSQL interfaces) and makes it even faster, easier to manage and simpler to run across geographies.

Specifically, it includes the following features:

  • Performance
    • 200 Million NoSQL Reads/Sec
    • 2.5 Million SQL Ops/Sec
    • 50% Faster Reads
    • 40% Faster Mixed Read/Write Transactions
  • Active-Active Geographic Replication
    • Active-Active Geographic Redundancy
    • Automated Conflict Detection/Resolution
  • Management
    • 5X Faster Maintenance Ops
    • Detailed Reporting

Performance Enhancements

MySQL Cluster 7.4 builds on the huge performance and scalability improvements delivered in MySQL Cluster 7.3. This release has focussed on performance improvements to help two types of workload:

  • OLTP (On-Line Transaction Processing): Memory-optimized tables provide low, sub-millisecond latency and extreme levels of concurrency for OLTP workloads while still providing durability; they can also be used alongside disk-based tables
  • Ad-hoc Searches: MySQL Cluster has increased the amount of parallelism that can be used when performing a table scan – providing a significant speed-up when performing searches on un-indexed columns. Note that the huge speedup of joins in earlier releases has also made MySQL Cluster much more suitable for running analytics

A number of benchmarks have been run to assess:

  • How MySQL Cluster 7.4 performance compares with previous releases
  • How SQL performance scales as more data nodes are added
  • How NoSQL performance scales as more data nodes are added

Benchmarking Performance against earlier releases

Figure 1: MySQL Cluster 7.4, 50% Faster Reads (read performance compared with earlier releases)

The Sysbench benchmark tool has been used to perform an apples-to-apples comparison of how a single data node’s performance increases from MySQL Cluster 7.2, through MySQL Cluster 7.3 and MySQL Cluster 7.4. The tests were performed using a 48 core/96 thread machine (also demonstrating how well MySQL Cluster can now scale up with large numbers of cores).

As can be seen in Figure 1 there is a 1.5X performance improvement over MySQL Cluster 7.3 and an even larger improvement over 7.2.

Note that table scans experience a particularly good speedup.

Figure 2: MySQL Cluster 7.4, 40% Faster Read/Writes (read/write performance compared with earlier releases)

Figure 2 illustrates the same tests but for the Sysbench read/write SQL benchmark. In this case, a 1.4X performance improvement over MySQL Cluster 7.3 is recorded. Again, the improvement over MySQL Cluster 7.2 is even higher.

Benchmarking Scaling SQL Performance

Figure 3: 2.5 Million SQL Read Operations per Second (DBT2 benchmark)

The DBT2 Benchmark has been used to assess how well SQL performance scales as more data nodes are added. As can be seen in Figure 3, the scaling of SQL reads is almost linear and with 16 data nodes a throughput of 2.5 Million SQL operations per second is achieved. This equates to around 5 Million Transactions Per Minute or 2.2 Million New-Order TPM.

This benchmark was performed with each data node running on a dedicated 56 thread Intel E5-2697 v3 (Haswell) machine.

Benchmarking Scaling NoSQL Performance

Figure 4: 200 Million NoSQL Read Operations per Second (flexAsynch benchmark)

The flexAsynch benchmark has been used to measure how NoSQL performance scales as more data nodes are added to the cluster. These tests were performed on the same hardware as the DBT2 benchmark above but scaled out to 32 data nodes (out of a maximum supported 48).

The results are shown in Figure 4 and again it can be observed that the scaling is virtually linear. At 32 data nodes, the throughput hits 200 Million NoSQL Queries Per Second.

Note that the latest results and a more complete description of the tests can be found at the MySQL Cluster Benchmark page.

Full Active-Active Geographic Replication

MySQL Cluster provides Geographic Replication, allowing the same data to be accessed in clusters located in data centers separated by arbitrary distances. This reduces the effects of geographic latency by pushing data closer to the user, as well as providing a capability for geographic redundancy and disaster recovery.

Geographic replication is designed around an Active/Active technology, so if applications are attempting to update the same row on different clusters at the same time, the conflict can be detected and resolved. This ensures that each site can actively serve read and write requests while maintaining data consistency across the clusters. It also eliminates the overhead of having to provision and run passive hardware at remote sites.

When replicating between a single pair of clusters, Active-Active (update anywhere) replication has become significantly simpler and more complete in recent releases, culminating in the MySQL Cluster 7.4 solution with these added advantages:

  • Developers need to make no changes to the application logic or tables
  • Conflict-triggered rollbacks can be made to whole transactions rather than just individual operations
  • Transactions that are dependent on rolled-back transactions can also be rolled back
  • Conflicts involving reads, writes and deletes can all be detected and resolved

These enhancements make it much simpler and safer to deploy globally scaled services across data centers.

MySQL Cluster allows bi-directional replication between two (or more) clusters. Replication within each cluster is synchronous but between clusters it is asynchronous which means the following scenario is possible:

Conflict with asynchronous replication:

+---------+--------------+---------+
| Site A  | Replication  | Site B  |
+---------+--------------+---------+
| x == 10 |              | x == 10 |
| x = 11  |              | x = 20  |
|         | --- x=11 --> | x == 11 |
| x == 20 | <-- x=20 --- |         |
+---------+--------------+---------+

In this example a value (a column for a row in a table) is set to 11 on site A and the change is queued for replication to site B. In the meantime, an application sets the value to 20 on site B and that change is queued for replication to site A. Once both sites have received and applied the replicated change from the other cluster, site A contains the value 20 while site B contains 11 – in other words, the databases are now inconsistent.

A description of how MySQL Cluster detects and then resolves such conflicts can be found in this article, together with a worked example of configuring and testing the conflict detection and resolution.

Faster Maintenance Activities

In typical systems, around 30% of all downtime is attributable to scheduled maintenance activities – for a truly High Availability solution, that downtime must be avoided. MySQL Cluster supports all of the following events as online operations, ensuring the database continues to provide service:

  • Scaling the cluster by adding new nodes
  • Updating the schema with new columns, tables and indexes
  • Re-sharding of tables across data nodes to allow better data distribution
  • Performing back-up operations
  • Upgrading or patching the underlying hardware and operating system
  • Upgrading or patching MySQL Cluster, with full online upgrades between releases

Many of these operations require one or more nodes in the cluster to be restarted (when multiple nodes are restarted in an order that ensures service is never lost, this is referred to as a rolling restart), and the time taken for these maintenance activities tends to be dominated by the restarting of data nodes. By increasing the amount of parallelism used (while also guarding against impacting the cluster by consuming too many resources), data node restarts are 5x faster in MySQL Cluster 7.4.

The result of such a significant speedup is twofold:

  • Less time spent by the administrator
  • More activities can be performed within a single maintenance window and full redundancy can be re-established much sooner

Note that the maintenance activities performed by MySQL Cluster Manager also benefit from these optimisations.

Enhanced Reporting

MySQL Cluster presents a lot of monitoring information through the ndbinfo database and in 7.4 we’ve added some extra information on how memory is used for individual tables and how operations are distributed.

Extra Memory Reporting

MySQL Cluster allocates all of the required memory when a data node starts and so any information on memory usage from the operating system is of limited use and provides no clues as to how memory is used within the data nodes – for example, which tables are using the most memory. Also, as this is a distributed database, it is helpful to understand whether a particular table is using a similar amount of memory in each data node (if not then it could be that a better partitioning/sharding key could be used). Finally, when rows are deleted from a table, the memory for those rows would typically remain allocated against that table and so it is helpful to understand how many of these empty slots are available for use by new rows in that table. MySQL Cluster 7.4 introduces a new table – ndbinfo.memory_per_fragment – that provides that information.

For example, to see how much memory is being used by each data node for a particular table…

mysql> CREATE DATABASE clusterdb; USE clusterdb;
mysql> CREATE TABLE simples (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY) ENGINE=NDB;
mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1280 |         40 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4256 |        133 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4256 |        133 |
+------+------+-------------+------------+------------+

When you delete rows from a MySQL Cluster table, the memory is not actually freed up, so if you check the existing ndbinfo.memoryusage table you won’t see a change. This memory will be reused when you add new rows to that same table. In MySQL Cluster 7.4, it’s possible to see how much memory is in that state for a table…

mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1280 |         40 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1280 |         40 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4256 |        133 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4256 |        133 |
+------+------+-------------+------------+------------+
mysql> DELETE FROM clusterdb.simples LIMIT 1;
mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |      131072 |       5504 |        172 |
|    1 |    2 |      131072 |       1312 |         41 |
|    2 |    0 |      131072 |       5504 |        172 |
|    2 |    2 |      131072 |       1312 |         41 |
|    3 |    1 |      131072 |       3104 |         97 |
|    3 |    3 |      131072 |       4288 |        134 |
|    4 |    1 |      131072 |       3104 |         97 |
|    4 |    3 |      131072 |       4288 |        134 |
+------+------+-------------+------------+------------+

As a final example, we can check whether a table is being evenly sharded across the data nodes (in this case a really bad sharding key was chosen)…

mysql> CREATE TABLE simples (id INT NOT NULL AUTO_INCREMENT, \
        species VARCHAR(20) DEFAULT "Human", \
        PRIMARY KEY(id, species)) engine=ndb PARTITION BY KEY(species);

// Add some data

mysql> SELECT node_id AS node, fragment_num AS frag, \
        fixed_elem_alloc_bytes alloc_bytes, \
        fixed_elem_free_bytes AS free_bytes, \
        fixed_elem_free_rows AS spare_rows \
        FROM ndbinfo.memory_per_fragment \
        WHERE fq_name LIKE '%simples%';
+------+------+-------------+------------+------------+
| node | frag | alloc_bytes | free_bytes | spare_rows |
+------+------+-------------+------------+------------+
|    1 |    0 |           0 |          0 |          0 |
|    1 |    2 |      196608 |      11732 |        419 |
|    2 |    0 |           0 |          0 |          0 |
|    2 |    2 |      196608 |      11732 |        419 |
|    3 |    1 |           0 |          0 |          0 |
|    3 |    3 |           0 |          0 |          0 |
|    4 |    1 |           0 |          0 |          0 |
|    4 |    3 |           0 |          0 |          0 |
+------+------+-------------+------------+------------+

Extra Operations Reporting

To ensure that resources are being used effectively, it is very helpful to understand how each table is being accessed (how frequently and for what types of operations). To support this, the ndbinfo.operations_per_fragment table is provided. For example, the data in this table would let you identify that a large number of full table scans are performed on a particular table.

It is also important to identify if there are any hotspots where a disproportionate share of the queries for a table are hitting a particular fragment/data node. Again, ndbinfo.operations_per_fragment provides this information.

As an example of how to use some of the data from this table, a simple table is created and populated and then ndbinfo.operations_per_fragment is used to monitor how many Primary Key reads and table scans are performed:



mysql> CREATE TABLE simples (id INT AUTO_INCREMENT PRIMARY KEY, time TIMESTAMP) ENGINE=NDB;

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node', tot_key_reads AS 'Reads',
 tot_frag_scans AS 'Scans' FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+-------+-------+
| Table                 | Data Node | Reads | Scans |
+-----------------------+-----------+-------+-------+
| clusterdb/def/simples |         3 |     0 |     1 |
| clusterdb/def/simples |         3 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     1 |
+-----------------------+-----------+-------+-------+

mysql> INSERT INTO simples VALUES ();  # Repeated several times
mysql> SELECT * FROM simples;
+----+---------------------+
| id | time                |
+----+---------------------+
|  7 | 2015-01-22 15:12:42 |
|  8 | 2015-01-22 15:12:58 |
+----+---------------------+

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node', tot_key_reads AS 'Reads',
 tot_frag_scans AS 'Scans' FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+-------+-------+
| Table                 | Data Node | Reads | Scans |
+-----------------------+-----------+-------+-------+
| clusterdb/def/simples |         3 |     0 |     2 |
| clusterdb/def/simples |         3 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     2 |
+-----------------------+-----------+-------+-------+


mysql> SELECT * FROM simples WHERE id=11;
+----+---------------------+
| id | time                |
+----+---------------------+
| 11 | 2015-01-22 15:12:59 |
+----+---------------------+

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node', tot_key_reads AS 'Reads',
 tot_frag_scans AS 'Scans' FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+-------+-------+
| Table                 | Data Node | Reads | Scans |
+-----------------------+-----------+-------+-------+
| clusterdb/def/simples |         3 |     0 |     2 |
| clusterdb/def/simples |         3 |     0 |     0 |
| clusterdb/def/simples |         4 |     0 |     0 |
| clusterdb/def/simples |         4 |     1 |     2 |
+-----------------------+-----------+-------+-------+

Note that there are two rows listed for each data node but only one row for each has non-zero values; this is because each data node holds the primary fragment for one of the partitions and the secondary fragment for the other – all operations are performed only on the active fragments. This is made clearer if the fragment number is included in the query:

mysql> SELECT fq_name AS 'Table', node_id AS 'Data Node',
 fragment_num AS 'Fragment', tot_key_reads AS 'Reads', tot_frag_scans AS 'Scans'
 FROM ndbinfo.operations_per_fragment WHERE fq_name LIKE '%simples';
+-----------------------+-----------+----------+-------+-------+
| Table                 | Data Node | Fragment | Reads | Scans |
+-----------------------+-----------+----------+-------+-------+
| clusterdb/def/simples |         3 |        0 |     0 |     2 |
| clusterdb/def/simples |         3 |        1 |     0 |     0 |
| clusterdb/def/simples |         4 |        0 |     0 |     0 |
| clusterdb/def/simples |         4 |        1 |     1 |     2 |
+-----------------------+-----------+----------+-------+-------+

MySQL Cluster GUI-Based Auto-Installer

The Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments. Developers can spend more time innovating in their code, rather than figuring out how to install, configure and start the database.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and available hardware resources, stepping users through each stage of cluster creation:

  1. Workload Optimized: On launching the browser-based installer, users can specify the throughput, latency and write-load characteristics of their application
  2. Auto-Discovery: The Installer automatically discovers the underlying resources available from the local and remote servers that will make up the Cluster, including CPU architecture, cores and memory.

With these parameters, the installer creates optimized configuration files and starts the cluster.

Figure: Automated Tuning and Configuration of MySQL Cluster

[Source:- Dev.msql]

Taking the new MySQL 5.7 JSON features for a test drive

MySQL 5.7 introduces both a new native JSON datatype, and a set of SQL functions to be able to manipulate and search data in a very natural way on the server-side. Today I wanted to show a simple example of these features in action using sample data from SF OpenData.

Importing Sample Data

Having good sample data is useful, because it helps you self-validate that results are accurate. It also helps provide good data distribution, which is important when adding indexes.

My chosen data set from SF OpenData is the most popular item under “Geographic Locations and Boundaries” and contains approximately 200K city lots. The first step is to download and import it into MySQL:
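
The exact import steps are embedded in the original post and not reproduced here. A minimal sketch of one possible approach, assuming the GeoJSON features are converted to one INSERT per document and loaded into a table named features with a single JSON column (the database, table and column names are placeholders):

mysql> CREATE DATABASE IF NOT EXISTS sf_data; USE sf_data;
mysql> CREATE TABLE features (
         id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         feature JSON NOT NULL
       ) ENGINE=InnoDB;
-- each converted GeoJSON feature then becomes one row, for example:
mysql> INSERT INTO features (feature)
       VALUES ('{"type": "Feature", "properties": {"STREET": "MARKET"}, "geometry": {"type": "Polygon"}}');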

Here is an example of what each one of the features (a parcel of land) looks like:
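
The exact document is embedded in the original post; the sketch below only illustrates the GeoJSON structure (property names follow the citylots dataset, but the values shown are placeholders and the geometry is truncated):

{
  "type": "Feature",
  "properties": {
    "MAPBLKLOT": "0001001",
    "BLOCK_NUM": "0001",
    "LOT_NUM": "001",
    "STREET": "MARKET",
    "ST_TYPE": "ST",
    "ODD_EVEN": "E"
  },
  "geometry": {
    "type": "Polygon",
    "coordinates": [ ... ]
  }
}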

In this case all 200K documents do follow a common format, but I should point out that this is not a requirement. JSON is schema-less :)

Example Queries

Query #1: Find a parcel of land on Market street, one of the main streets in San Francisco:
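
The query itself is embedded in the original post; a sketch of what it might look like against the assumed features table from the import step:

mysql> SELECT * FROM features WHERE feature->"$.properties.STREET" = 'MARKET' LIMIT 1\G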

Using the shorthand JSON_EXTRACT operator (->) I can query into a JSON column in a very natural way. The syntax "$.properties.STREET" is what we call a JSON path, and for those familiar with JavaScript I like to compare this to a CSS selector similar to what you would use with jQuery.

To learn more about the JSON path syntax, I recommend checking out our manual page, or this blog post by Roland Bouman.

Query #2: Find any parcels of land that do not specify a street:
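
Again, the exact query is embedded in the post; a sketch under the same assumed schema:

mysql> SELECT * FROM features WHERE feature->"$.properties.STREET" IS NULL LIMIT 1;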

With JSON being schemaless, this finds the documents which do not have the expected structure. In this example we can see that all documents have $.properties.STREET specified, and thus the query returns zero results.

Comparing the JSON type to TEXT

In this example I am running a query which deliberately needs to access all 200K JSON documents. This could be considered a micro-benchmark, as it does not quite reflect what you would experience in production, where you will often have indexes:
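
The embedded benchmark is not reproduced here; a sketch of the kind of comparison being described, assuming a second copy of the table (features_text) that stores the same documents in a TEXT column:

mysql> CREATE TABLE features_text (
         id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
         feature TEXT NOT NULL
       ) ENGINE=InnoDB;
mysql> INSERT INTO features_text SELECT id, feature FROM features;

-- the same full scan is run against both copies; only the column type differs
mysql> SELECT DISTINCT feature->"$.type" FROM features;
mysql> SELECT DISTINCT JSON_EXTRACT(feature, "$.type") FROM features_text;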

To explain what is happening here in more detail:

  • For simplicity, I’ve ensured that in both examples the dataset fits in memory.
  • The JSON functions, including the short-hand json_extract() operator (->), will work on both the native JSON data type as well as TEXT/BLOB/VARCHAR data types. This is very useful because it provides a nice upgrade path for users of versions prior to MySQL 5.7 who frequently already store JSON.
  • We can see that the native JSON datatype is indeed about 10x faster than TEXT – 1.25 seconds versus 12.85 seconds. This can be explained because the native type does not have to do any parsing or validation of the data, and it can retrieve elements of a JSON document very efficiently.

Conclusion

Hopefully this serves as a useful example of importing sample JSON data, and running a few sample queries. In my next post I take this a step further by showing how you can index JSON data by using virtual columns.

 

[Source:- Mysqlserverteam]

General Tablespaces in MySQL 5.7 – Details and Tips

 

 

InnoDB in MySQL 5.7 introduced for the first time the ability to create a general tablespace and assign multiple tables to it.  These tablespace datafiles can be placed anywhere on the file system.  They can even be given a smaller block size so that they can contain compressed tables that use that size as their key_block_size.

You can create a new tablespace with a command like this;
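
The original statement is not reproduced above; a minimal sketch (the tablespace and datafile names are placeholders):

mysql> CREATE TABLESPACE ts1 ADD DATAFILE 'ts1.ibd' FILE_BLOCK_SIZE=16384;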

If the current innodb-page-size is 16KB then the FILE_BLOCK_SIZE clause is optional.

A few comments about datafile names
Notice that the extension .ibd is added to the file name.  This is required. InnoDB will only accept a file name that ends with .ibd.  This helps to ensure that the filename is the one you want since not just anything can be put after ADD DATAFILE.  It also enforces the convention that all InnoDB datafiles other than the system tablespace end in .ibd, which helps them to be recognized.

Notice also that there is no path on the datafile above.  Relative paths like this will be relative to the datadir which is found in your configuration file. This is the same location as the system tablespace and log files.

You can also use an absolute path to create the file anywhere else on your system.  There are two restrictions concerning where a general tablespace can be located:

  1. It cannot be in the root directory.  Our design engineers thought it would be wise to prevent this. It comes mainly from the unix perspective but it is generally a good idea on Windows also.
  2. A general tablespace datafile cannot be located in a directory under the datadir.  This is where datafiles for file-per-table tablespaces are located.  In MySQL, traditionally, directories under the datadir are there to contain files related to a database or schema.  These datafiles and directories are created automatically when you create a table while innodb-file-per-table is ON.  The file name is the same as the tablename with an .ibd extension added.

It is possible to create a tablespace with the same datafile name as a file-per-table datafile.  For example:
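
The original example is not shown above; a sketch consistent with the description below, where the database, table and tablespace are all named new:

mysql> SET GLOBAL innodb_file_per_table=ON;
mysql> CREATE DATABASE `new`; USE `new`;
mysql> CREATE TABLE `new` (id INT PRIMARY KEY) ENGINE=InnoDB;      -- <datadir>/new/new.ibd
mysql> CREATE TABLESPACE `new` ADD DATAFILE 'new.ibd';             -- <datadir>/new.ibd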

The result will be two files named new.ibd.  The general tablespace datafile will be located in the datadir and the file-per-table datafile will be located in a directory called ‘new’ under the datadir.

A word of advice though… Try to give unique names to all your database objects.  A future version of InnoDB may prevent similar datafile names like the two above.  Or it might allow you to start associating a general tablespace with a database which could cause a conflict somehow.  It is much wiser to name different objects differently to avoid any possible conflicts.

General Tablespace Portability
You can move a file-per-table tablespace from one system to another by  following the directions here:  http://dev.mysql.com/doc/refman/5.7/en/tablespace-copying.html

This method uses the following commands:
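
Roughly, the commands involved are these (a sketch; see the linked manual page for the full procedure):

-- on the destination server, after creating an identical empty table:
mysql> ALTER TABLE t1 DISCARD TABLESPACE;

-- on the source server, to quiesce the table and write the .cfg metadata file:
mysql> FLUSH TABLES t1 FOR EXPORT;
-- (copy t1.ibd and t1.cfg into the destination's database directory)
mysql> UNLOCK TABLES;

-- back on the destination server:
mysql> ALTER TABLE t1 IMPORT TABLESPACE;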

Discard and Import have never been supported for tables in the system tablespace, and they are not supported in 5.7 for general tablespaces either.  Since a general tablespace can be shared by multiple tables, just like the system tablespace, it is not as easy to transport its datafile from one system to another.

Choosing a tablespace for your table
A table can be created or altered into a general tablespace, a file-per-table tablespace or even the system tablespace by using the TABLESPACE phrase on any CREATE TABLE or ALTER TABLE statement.

This gives you the ability to explicitly choose the tablespace you want for your table and even the ability to move your table around.  You can move any table from any tablespace into any other tablespace with the TABLESPACE phrase on the ALTER TABLE statement.  This means that for the first time, you can move a table into the system tablespace.  Also, you can choose to use file-per-table independent of the innodb-file-per-table setting.
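
For example, moving an existing table into a general tablespace is a single statement (a sketch; the table and tablespace names are placeholders):

mysql> ALTER TABLE t1 TABLESPACE ts1;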

So if you were to do this;
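
(The original statement is not shown; a sketch with a placeholder table name:)

mysql> CREATE TABLE t2 (id INT PRIMARY KEY) TABLESPACE innodb_file_per_table;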

the table would be created in its own file-per-table tablespace.

Likewise, this would create the table in the system tablespace;
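
(Again a sketch with a placeholder table name:)

mysql> CREATE TABLE t3 (id INT PRIMARY KEY) TABLESPACE innodb_system;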

The tablespace name is a SQL identifier
Notice that there are no quote marks around the tablespace name in these examples.  That is because the tablespace name is a SQL identifier.  The implications are that you can also use the backtick quote marks to enclose this name and that it is always evaluated in a CASE SENSITIVE way.

This means that you can create multiple tablespaces with the same name, but in different cases, like this;
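
(A sketch; the two names differ only in case:)

mysql> CREATE TABLESPACE mytbs ADD DATAFILE 'mytbs.ibd';
mysql> CREATE TABLESPACE MyTbs ADD DATAFILE 'MyTbs.ibd';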

Once again, a word to the wise, try not to name your tablespaces the same with only differences in case.

Reserved Tablespace Names
There are three ‘reserved’ tablespace names that have special meaning, two of which were mentioned earlier:

  1. innodb_file_per_table
  2. innodb_system
  3. innodb_temporary

In 5.7, you can use the first two as I have already shown.  The third one is not available to use.  You do not need to use TABLESPACE=innodb_temporary to put a table into the temporary tablespace.  Just use CREATE TEMPORARY TABLE ...;.

These reserved tablespace names are case sensitive so it is possible to do this;
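
(A sketch of the kind of statement being described – a differently-cased variant of a reserved name:)

mysql> CREATE TABLESPACE Innodb_System ADD DATAFILE 'Innodb_System.ibd';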

But once again, please don’t!  You are better off with unique tablespace names.

Conclusion
I hope this discussion has been useful for you to understand General Tablespaces in MySQL 5.7.  There are more tablespace features to come in future releases.

[Source:- Mysqlserverteam]

 

MySQL creator says 14,000 against Oracle-Sun deal

Michael Widenius, the creator of the MySQL database and a vocal opponent of Oracle Corp’s USD 7 billion takeover of Sun Microsystems Inc, has handed 14,000 signatures opposing the deal to regulators in Europe, China and Russia. Widenius, one of the most respected developers of open-source software, left Sun last year to set up database firm Monty Program Ab, which competes directly with MySQL.

The European Commission initially objected to Oracle’s acquisition of Sun, saying it was concerned Oracle’s takeover of the MySQL database could hurt competition in that market. But the Commission signaled in mid-December that it would likely clear the deal after some of Oracle’s largest customers said they believed the takeover would not hurt competition. Since then, Oracle has said it expects to win unconditional EU clearance to close the deal by the end of January.

Widenius, who delivered the signatures on Monday, said he would continue to gather signatures until the commission makes a final ruling, which is due by Jan. 27. “Our signatories don’t have faith that Oracle could be a good steward of MySQL,” Widenius said in a statement.

Still, Beau Buffier, a partner in the anti-trust practice of law firm Shearman & Sterling, said signature drives carry little weight with the commission. He said regulators generally want each person who weighs in on pending cases to provide specifics on how an acquisition might affect their particular business. “What you would need is detailed statements from significant developers,” he said.

More than 5,000 signatures are from self-employed developers and more than 3,000 from employees of companies and other organizations using MySQL, according to Widenius. He did not disclose the names of the people who signed the petition. The signatures were gathered during the first week of the campaign and were delivered to the European Commission and other European institutions, including the European Parliament and the competition authorities of the 27 member states, as well as to the Chinese Ministry of Commerce and the Russian Federal Antimonopoly Service.

Officials with Oracle, the world’s No. 3 software maker, and Sun, the No. 4 server maker, declined comment. The acquisition will transform Oracle from a maker of software into a technology powerhouse that sells computers and storage equipment preloaded with its programs. Oracle Chief Executive Larry Ellison is betting the combination will give his company an edge over rivals such as IBM Corp, Hewlett-Packard Co, Microsoft Corp and EMC Corp. Sun bought MySQL for USD 1 billion in 2008.

[Source:- Moneycontrol]