MySQL Partitioning and Its Uses

Page last updated on 2011 / 16 / 08

MySQL partitioning has a number of beneficial uses that are available in more recent MySQL releases.

Its main benefits are:

To quote the above link...

[MySQL Partitioning] enables you to distribute portions of individual tables across a file system according to rules which you can set largely as needed. In effect, different portions of a table are stored as separate tables in different locations.

Some queries can be greatly optimized in virtue of the fact that data satisfying a given WHERE clause can be stored only on one or more partitions, thereby excluding any remaining partitions from the search.

On the latter point, some MySQL users may have came across performance issues with larger tables, and felt the only solution was to duplicate the table structure across extra tables, to spread out the size of the table. If you have done this, partitioning is the answer to your problem and removes the need for you to consider duplicate tables. Let MySQL and its optimizer do the work for you.

To see if you can take advantage of MySQL partitioning, you must have a later version of MySQL 5.1; you can check by performing this query:

SHOW VARIABLES LIKE '%partition%'

MySQL only supports horizontal partitioning at the minute (you choose column(s)) to create your partition rule) and there are no plans for vertical partitioning (by rows) at this time for version 5.1.

I highly recommend checking out the documentation concerning partitioning, which is about 20 pages long and should take you an hour or two to digest.

Here are some working examples of partitioning in action:

Example 1: IP to Location Lookups

Many websites like to identify the country of origin for IP addresses, and have a MySQL table to lookup the IP and find its corresponding country/location. With the following table setup, partitioning can be used to optimize queries performed against the table(s).

CREATE TABLE IF NOT EXISTS `countries` (

`id` smallint( 5 ) unsigned NOT NULL AUTO_INCREMENT ,
`location` varchar( 255 ) NOT NULL PRIMARY KEY ( `id` )
) ENGINE = MYISAM ;

CREATE TABLE IF NOT EXISTS `ipranges` (

`lowerbound` int( 10 ) unsigned NOT NULL ,
`upperbound` int( 10 ) unsigned NOT NULL ,
`countryid` smallint( 5 ) unsigned NOT NULL
) ENGINE = MYISAM /*!50100 PARTITION BY HASH (MOD(lowerbound,64)) */;

The above tables are normalised IP range data, with a list of IP ranges in integer format in the ipranges table, that contains a countryid that corresponds to the id value in the countries table. lowerbound and upperbound are IP ranges in numeric format.

A typical query would be something like this:

SELECT country
FROM iplocation_ipranges
INNER JOIN iplocation_countries ON iplocation_ipranges.countryid = iplocation_countries.id
WHERE INET_ATON( '97.74.144.84' )
BETWEEN lowerbound
AND upperbound

With any given IP address, you can lookup which range the IP falls within (between lowerbound and upperbound) to get the corresponding country.

I have used hash partitioning on the ipranges table to create 64 partitions using a modulus function on the lowerbound column. This means when a query is performed against the ipranges table and a constant is used to query the lowerbound column, MySQL attempts to optimize the query.

You can see when MySQL is using partitions to optimize a query by using the EXPLAIN PARTITIONS syntax, e.g.:

EXPLAIN PARTITIONS SELECT country
FROM iplocation_ipranges
INNER JOIN iplocation_countries ON iplocation_ipranges.countryid = iplocation_countries.id
WHERE INET_ATON( '97.74.144.84' )
BETWEEN lowerbound
AND upperbound

Which outputs the following...

id select_type table partitions type possible_keys key key_len ref rows Extra
1 SIMPLE ipranges p0 ALL NULL NULL NULL NULL 5 Using where
1 SIMPLE countries NULL eq_ref PRIMARY PRIMARY 2 ipranges.countryid 1  

The above indicates that MySQL only needed to look up 1 of the 64 partitions in order to find results that match the query. When your partitions have data spread fairly equally across them, as is the case in this example, it means that 1/64th of the records in a table have to be queried in order to attain a result, therefore partitioning can give you a huge performance boost.

By further adding an index to the lowerbound and upperbound columns, you can further speed up your query in this example.

Example 2: URL Shortener

I have used partitioning in a URL shortener script available on this site.

Summary

Generally, partitioning is a new MySQL feature and is continually being improved upon. There are a number of practical uses for it:

I have only skimmed the surface of the subject here, I would highly recommend reading the MySQL manual and trying a couple of examples yourself.


Next Article
Tip: Storing MD5 Values




Tweet