MySQL 5.6 offers improved partition maintenance through “partition exchange”. Even if not yet available, dbi services performed some tests on the MySQL Community Edition 5.6.2 m5 release (still under development).

Starting with release 5.1, MySQL offers table partitioning. As for any new feature, partitioning has been improved over the last years and now comes with quite a complete scope of functionalities:

  • Range, list, hash, composite partition types are supported
  • Advanced partitioning maintenance operations are also supported: merge, split (called reorganize), repair, optimize, etc.

Among several missing features, one of them was quite “critical”. Frequently used in many data warehouse environments, “partition exchange” allows to exchange an empty partition with an existing table. This was a pity since we know how easy it is to quickly load data from a external/legacy system on a MySQL server. This new feature has been introduced in MySQL 5.6 (still under development and not yet “GA”: generally available) and tested by dbi services.

Let’s illustrate the feature through an example. At the end of the year, we receive a csv file with the sales information to be loaded on the data warehouse. This sales information contains the salesman_id, the product_id, the amount sold and the date of the sale.

To load this file in a temporary (staging) table is very easy with MySQL. First of all let’s have a look at the file:

head -5 /tmp/sales_range_2012.lst

To load this data, there is nothing easier. Loading this data is a straightforward process, we first create the staging table, then load the data with one MySQL statement:

mysql> CREATE TABLE temp_sales ( salesman_id INT, product_id  INT, sales_amount INT,  sales_date date) ENGINE=MyISAM;
 Query OK, 0 rows affected (0.00 sec)
mysql> LOAD DATA INFILE '/tmp/sales_range_2012.lst' INTO TABLE temp_sales FIELDS TERMINATED BY ',' LINES TERMINATED BY 'n';
 Query OK, 37787 rows affected (0.16 sec)
 Records: 37787  Deleted: 0  Skipped: 0  Warnings: 0

Our sales table (partitioned by range) looks like this:

mysql> show create table sales_range G
 *************************** 1. row ***************************
 Table: sales_range
 Create Table: CREATE TABLE `sales_range` (
 `salesman_id` int(11) DEFAULT NULL,
 `product_id` int(11) DEFAULT NULL,
 `sales_amount` int(11) DEFAULT NULL,
 `sales_date` date DEFAULT NULL
 /*!50100 PARTITION BY RANGE (YEAR(sales_date))
 (PARTITION sales_year_2008 VALUES LESS THAN (2009) ENGINE = MyISAM,
 PARTITION sales_year_2011 VALUES LESS THAN (2012) ENGINE = MyISAM) */
 1 row in set (0.00 sec)

At this stage some data “transformation” might occur based on MySQL PL-SQL in the sales_temp table (if needed).

Once the data is ready, MySQL 5.6 “partition exchange” might be used to load the sales data from 2012 in the 2012 partition.

First of all, the partition “sales_year_2012” should be created within the sales_range table:

mysql> ALTER TABLE sales_range ADD PARTITION (PARTITION sales_year_2012 VALUES LESS THAN (2013) ENGINE = MyISAM);
 Query OK, 0 rows affected (0.01 sec)
 Records: 0  Duplicates: 0  Warnings: 0

Since the data is already loaded and transformed in sales_temp, we just have to exchange sales_year_2012 and temp_sales:

mysql> ALTER TABLE sales_range EXCHANGE PARTITION sales_year_2012 WITH TABLE temp_sales;
 Query OK, 0 rows affected (0.11 sec)

That’s it! All data contained in the sales_temp table has been anchored to the sales_range partitioned table. Let’s have a look at this with the “explain” feature, selecting the 2012 sales, only the sales_year_2012 partition is scanned:

mysql> explain partitions select count(*) from sales_range where sales_date>= '2012-01-01';
 | id | select_type | table       | partitions      | type | possible_keys | key  | key_len | ref  +----+-------------+-------------+-----------------+------+---------------+------+---------+------+
 |  1 | SIMPLE      | sales_range | sales_year_2012 | ALL  | NULL          | NULL | NULL    | NULL +----+-------------+-------------+-----------------+------+---------------+------+---------+------+
 1 row in set (0.00 sec)

What’s next?

As mentioned in the MySQL documentation the next important feature for partitioning usage will be the possibility to perform parallel queries while computing SQL aggregations – i.e sum() – over several partitions. Several threads in parallel should read the several partitions to provide the whole result faster. MySQL is working on this topic.