| Revenue Source Veteran
Join Date: Oct 2005 Posts: 9,144 Jack of All Trades
CyberSpace
| Possible optimization for sort_merge and UNION ORDER BY LIMIT -
09-18-2007
Every so often you need to perform sort results retrieved from MySQL when your WHERE clause goes beyound col=const values which would allow MySQL to still use second portion of the index for the order by. Ranges as well as IN lists make this optimization impossible, not even speaking about index merge optimization. Lets look at this example: PLAIN TEXT
SQL: - [FONT='Courier New', Courier, monospace]CREATE TABLE `utest` ([/font]
- [FONT='Courier New', Courier, monospace] `c1` int(10) UNSIGNED NOT NULL,[/font]
- [FONT='Courier New', Courier, monospace] `c2` int(10) UNSIGNED NOT NULL,[/font]
- [FONT='Courier New', Courier, monospace] `ord` int(10) UNSIGNED NOT NULL,[/font]
- [FONT='Courier New', Courier, monospace] KEY `c1` (`c1`,`ord`),[/font]
- [FONT='Courier New', Courier, monospace] KEY `c2` (`c2`,`ord`)[/font]
- [FONT='Courier New', Courier, monospace]) ENGINE=MyISAM DEFAULT CHARSET=latin1[/font]
- [FONT='Courier New', Courier, monospace] [/font]
- [FONT='Courier New', Courier, monospace]mysql> EXPLAIN SELECT * FROM utest WHERE c1=5 OR c2=5 ORDER BY ord DESC LIMIT 10 \G[/font]
- [FONT='Courier New', Courier, monospace]*************************** 1. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 1[/font]
- [FONT='Courier New', Courier, monospace] select_type: SIMPLE[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: index_merge[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c1,c2[/font]
- [FONT='Courier New', Courier, monospace] KEY: c1,c2[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4,4[/font]
- [FONT='Courier New', Courier, monospace] ref: NULL[/font]
- [FONT='Courier New', Courier, monospace] rows: 162380[/font]
- [FONT='Courier New', Courier, monospace] Extra: USING sort_union(c1,c2); USING WHERE; USING filesort[/font]
- [FONT='Courier New', Courier, monospace]1 row IN SET (0.00 sec) [/font]
As you can see MySQL 5.1.21 uses sort_merge to access the rows but it can't use it together with order by efficiently.
I'd say this is limitation as for this case you well could be retrieving sorted streams by the indexes and doing merge sort to get resulted rows in sorted order. Once this is implemented similar approach could be used for optimizing ORDER BY with IN.
This original query (in memory data) takes 1sec. Let's see how classic pre MySQL 5.0 solution - using UNION instead of single query works in this case: PLAIN TEXT
SQL: - [FONT='Courier New', Courier, monospace]mysql> EXPLAIN (SELECT * FROM utest WHERE c1=5) union (SELECT * FROM utest WHERE c2=5) ORDER BY ord DESC LIMIT 10 \G[/font]
- [FONT='Courier New', Courier, monospace]*************************** 1. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 1[/font]
- [FONT='Courier New', Courier, monospace] select_type: PRIMARY[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: ref[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c1[/font]
- [FONT='Courier New', Courier, monospace] KEY: c1[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4[/font]
- [FONT='Courier New', Courier, monospace] ref: const[/font]
- [FONT='Courier New', Courier, monospace] rows: 108112[/font]
- [FONT='Courier New', Courier, monospace] Extra: [/font]
- [FONT='Courier New', Courier, monospace]*************************** 2. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 2[/font]
- [FONT='Courier New', Courier, monospace] select_type: UNION[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: ref[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c2[/font]
- [FONT='Courier New', Courier, monospace] KEY: c2[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4[/font]
- [FONT='Courier New', Courier, monospace] ref: const[/font]
- [FONT='Courier New', Courier, monospace] rows: 54268[/font]
- [FONT='Courier New', Courier, monospace] Extra: [/font]
- [FONT='Courier New', Courier, monospace]*************************** 3. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: NULL[/font]
- [FONT='Courier New', Courier, monospace] select_type: UNION RESULT[/font]
- [FONT='Courier New', Courier, monospace] TABLE: [/font]
- [FONT='Courier New', Courier, monospace] type: ALL[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: NULL[/font]
- [FONT='Courier New', Courier, monospace] KEY: NULL[/font]
- [FONT='Courier New', Courier, monospace] key_len: NULL[/font]
- [FONT='Courier New', Courier, monospace] ref: NULL[/font]
- [FONT='Courier New', Courier, monospace] rows: NULL[/font]
- [FONT='Courier New', Courier, monospace] Extra: USING filesort[/font]
- [FONT='Courier New', Courier, monospace]3 rows IN SET (0.00 sec) [/font]
The query looks scare but in fact in completes in 1.05 sec not significantly faster than sort index merge in this case.
As the query time implies MySQL is not smart enough in this case to "dive into" the union and add ORDER BY ORD LIMIT 10 to individual queries.
What if we do int manually ? PLAIN TEXT
SQL: - [FONT='Courier New', Courier, monospace]mysql> EXPLAIN (SELECT * FROM utest WHERE c1=5 ORDER BY ord DESC LIMIT 10) union (SELECT * FROM utest WHERE c2=5 ORDER BY ord DESC LIMIT 10) ORDER BY ord DESC LIMIT 10 \G[/font]
- [FONT='Courier New', Courier, monospace]*************************** 1. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 1[/font]
- [FONT='Courier New', Courier, monospace] select_type: PRIMARY[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: ref[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c1[/font]
- [FONT='Courier New', Courier, monospace] KEY: c1[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4[/font]
- [FONT='Courier New', Courier, monospace] ref: const[/font]
- [FONT='Courier New', Courier, monospace] rows: 108112[/font]
- [FONT='Courier New', Courier, monospace] Extra: USING WHERE[/font]
- [FONT='Courier New', Courier, monospace]*************************** 2. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 2[/font]
- [FONT='Courier New', Courier, monospace] select_type: UNION[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: ref[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c2[/font]
- [FONT='Courier New', Courier, monospace] KEY: c2[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4[/font]
- [FONT='Courier New', Courier, monospace] ref: const[/font]
- [FONT='Courier New', Courier, monospace] rows: 54268[/font]
- [FONT='Courier New', Courier, monospace] Extra: USING WHERE[/font]
- [FONT='Courier New', Courier, monospace]*************************** 3. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: NULL[/font]
- [FONT='Courier New', Courier, monospace] select_type: UNION RESULT[/font]
- [FONT='Courier New', Courier, monospace] TABLE: [/font]
- [FONT='Courier New', Courier, monospace] type: ALL[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: NULL[/font]
- [FONT='Courier New', Courier, monospace] KEY: NULL[/font]
- [FONT='Courier New', Courier, monospace] key_len: NULL[/font]
- [FONT='Courier New', Courier, monospace] ref: NULL[/font]
- [FONT='Courier New', Courier, monospace] rows: NULL[/font]
- [FONT='Courier New', Courier, monospace] Extra: USING filesort[/font]
- [FONT='Courier New', Courier, monospace]3 rows IN SET (0.00 sec) [/font]
As you can see explain does not really change (it does not show if index is used for finding rows or sorting as well) but query speed goes to less than 0.01 sec.
So it is great optimization which you need to do manuall for the time being.
What is also interesting is the fact MySQL is unable to handle even basic UNION with limit (without order by) optimally - in creates result set for the union fully and when only takes 10 rows from it: PLAIN TEXT
SQL: - [FONT='Courier New', Courier, monospace]mysql> EXPLAIN (SELECT * FROM utest WHERE c1=5 ) union (SELECT * FROM utest WHERE c2=5 ) LIMIT 10 \G[/font]
- [FONT='Courier New', Courier, monospace]*************************** 1. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 1[/font]
- [FONT='Courier New', Courier, monospace] select_type: PRIMARY[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: ref[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c1[/font]
- [FONT='Courier New', Courier, monospace] KEY: c1[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4[/font]
- [FONT='Courier New', Courier, monospace] ref: const[/font]
- [FONT='Courier New', Courier, monospace] rows: 108112[/font]
- [FONT='Courier New', Courier, monospace] Extra: [/font]
- [FONT='Courier New', Courier, monospace]*************************** 2. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: 2[/font]
- [FONT='Courier New', Courier, monospace] select_type: UNION[/font]
- [FONT='Courier New', Courier, monospace] TABLE: utest[/font]
- [FONT='Courier New', Courier, monospace] type: ref[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: c2[/font]
- [FONT='Courier New', Courier, monospace] KEY: c2[/font]
- [FONT='Courier New', Courier, monospace] key_len: 4[/font]
- [FONT='Courier New', Courier, monospace] ref: const[/font]
- [FONT='Courier New', Courier, monospace] rows: 54268[/font]
- [FONT='Courier New', Courier, monospace] Extra: [/font]
- [FONT='Courier New', Courier, monospace]*************************** 3. row ***************************[/font]
- [FONT='Courier New', Courier, monospace] id: NULL[/font]
- [FONT='Courier New', Courier, monospace] select_type: UNION RESULT[/font]
- [FONT='Courier New', Courier, monospace] TABLE: [/font]
- [FONT='Courier New', Courier, monospace] type: ALL[/font]
- [FONT='Courier New', Courier, monospace]possible_keys: NULL[/font]
- [FONT='Courier New', Courier, monospace] KEY: NULL[/font]
- [FONT='Courier New', Courier, monospace] key_len: NULL[/font]
- [FONT='Courier New', Courier, monospace] ref: NULL[/font]
- [FONT='Courier New', Courier, monospace] rows: NULL[/font]
- [FONT='Courier New', Courier, monospace] Extra: [/font]
- [FONT='Courier New', Courier, monospace]3 rows IN SET (0.00 sec) [/font]
Such query takes about 0.9 sec on the test data. Another possible optimizer improvement to do. P.S This post is inspired by Does MySQL Optimize UNION with LIMIT clause topic on our MySQL Forums. Possible optimization for sort_merge and UNION ORDER BY LIMIT - Read More... |