Revenue Source

Welcome to the Revenue Source affiliate marketing forums.

You are viewing our internet marketing and SEO forums as a guest which gives you limited access to most of our discussions.  By joining our free community, you will have access to post affiliate marketing topics, communicate privately with other members (PM), exchange SEO strategies, and access many other special features.  Registration is fast, simple and absolutely free so please, join our community today!

If you have any problems, please don't hesitate to contact us.

Go Back   Revenue Source > Site Design & Development > Databases
Reload this Page UNION vs UNION ALL Performance
Tags: ,

Reply
 
LinkBack Thread Tools Search this Thread
Old
  (#1 (permalink))
Affiliate Blogs is Offline
Revenue Source Veteran
Affiliate Blogs has a brilliant future here!
 
Affiliate Blogs's Avatar
 
Join Date: Oct 2005
Posts: 9,212
Jack of All Trades
CyberSpace United States
   
UNION vs UNION ALL Performance - 10-05-2007

When I was comparing performance of UNION vs MySQL 5.0 index merge algorithm Sinisa pointed out I should be using UNION ALL instead of simple UNION in my benchmarks, and he was right. Numbers would be different but it should not change general point of having optimization of moving LIMIT inside of union clause being cool thing.
But So is UNION ALL indeed faster than UNION DISTINCT (the UNION is shortcut for UNION DISTINCT) ?
Indeed it is. I did not have the same data as I used for the other test but I created similar test case - table with separate indexes on “a” and “b” columns with cardinality of 100, having about 40.000.000 of rows
[sql]
select * from test.abc where i=5 union select * from test.abc where j=5
[\sql]
This original query was taking about 22 seconds.
As I modified it:
[sql]
select * from test.abc where i=5 union all select * from test.abc where j=5 and i!=5
[\sql]
The query time dropped to about 6 seconds which is 3.5 times faster - quite considerable improvement.
As you can notice I added “i!=5″ clause - this is what allows us to ensure we do not have duplicate rows in result set matching both conditions and so result will be same as query with “i=5 or j=5″ where clause.
I also tried this original query (which uses index merge method in MySQL 5.0):
[sql]
select * from test.abc where i=5 or j=5
[\sql]
Such query takes 4 seconds so if you do not need to trick with order by and limit using index merge is faster than UNION as it indeed should be.
So why UNION ALL is faster than UNION DISTINCT ?
The first informed guess would be - because UNION ALL does not need to use temporary table to store result set, however this is not correct - both UNION ALL and UNION distinct use temporary table for result generation. Perhaps one more thing for Optimizer Team to look into.
Interesting enough the fact UNION and UNION ALL require temporary table can only be seen in SHOW STATUS - EXPLAIN does not want to tell you this shameful fact:
[sql]
mysql> explain (select * from test.abc where i=5) union all (select * from test.abc where j=5 and i!=5) \G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: abc
type: ref
possible_keys: i
key: i
key_len: 5
ref: const
rows: 348570
Extra: Using where
*************************** 2. row ***************************
id: 2
select_type: UNION
table: abc
type: ref
possible_keys: i,j
key: j
key_len: 5
ref: const
rows: 349169
Extra: Using where
*************************** 3. row ***************************
id: NULL
select_type: UNION RESULT
table:
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
Extra:
3 rows in set (0.00 sec)
[\sql]
In fact EXPLAIN output is the same for UNION and UNION ALL (which is too bad as execution for them is obviously different).
The difference in execution speed comes from the fact UNION requires internal temporary table with index (to skip duplicate rows) while UNION ALL will create table without such index.
This also explains why difference becomes larger when on disk table is required (as in this case) - Hash indexes used by MEMORY table are very efficient and do not give so much overhead.
Entry posted by peter | No comment
Add to: | | | |


UNION vs UNION ALL Performance - Read More...
  
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

Similar Threads for: UNION vs UNION ALL Performance
Thread Thread Starter Forum Replies Last Post
Possible optimization for sort_merge and UNION ORDER BY LIMIT Affiliate Blogs Databases 0 09-18-2007 03:32 PM
Facebook A Union Issue? Affiliate Marketing News Internet Marketing Articles 0 08-30-2007 10:26 PM
Using LoadAvg for Performance Optimization Affiliate Blogs Databases 0 12-04-2006 01:05 PM
DevShed: Debugging and Performance Affiliate Blogs Programming Help 0 11-23-2006 06:57 PM
Do You Have The Performance Marketing Standard? Affiliate Blogs Affiliate Marketing 0 11-15-2006 01:27 PM



© 2004-6 RevenueSource.com.  All rights reserved.  Do not duplicate or redistribute in any form.
This website and its logos/design are property of RevenueSource.com.  All rights reserved. vBSEO 3.2.0 RC7


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34