Revenue Source

Welcome to the Revenue Source affiliate marketing forums.

You are viewing our internet marketing and SEO forums as a guest which gives you limited access to most of our discussions.  By joining our free community, you will have access to post affiliate marketing topics, communicate privately with other members (PM), exchange SEO strategies, and access many other special features.  Registration is fast, simple and absolutely free so please, join our community today!

If you have any problems, please don't hesitate to contact us.

Go Back   Revenue Source > Site Design & Development > Databases
Reload this Page Using index for ORDER BY vs restricting number of rows.
Tags: , , , , ,

Reply
 
LinkBack Thread Tools Search this Thread
Old
  (#1 (permalink))
Affiliate Blogs is Offline
Revenue Source Veteran
Affiliate Blogs has a brilliant future here!
 
Affiliate Blogs's Avatar
 
Join Date: Oct 2005
Posts: 8,626
Jack of All Trades
CyberSpace United States
   
Using index for ORDER BY vs restricting number of rows. - 02-16-2007

One interesting problem with MySQL Optimizer I frequently run into is making poor decision when it comes to choosing between using index for ORDER BY or using index for restriction.
Consider we're running web site which sell goods, goods may be from different categories, different sellers different locations which can be filtered on, and there are also bunch of fields which sorting can be performed on such as seller, price, date added etc.
Such configuration often causes serious challenge choosing proper index configuration as it is hard to add all combinations of restrictions and order by to be fully indexed.
An extra problem comes from the fact MySQL prefers when it is possible to use index for further restriction and than using file sort, rather than using index for sorting and doing non-index based filtering for further restrictions. Here is example:

PLAIN TEXT
SQL:
  1. [FONT='Courier New', Courier, monospace]CREATE TABLE `goods` ([/font]
  2. [FONT='Courier New', Courier, monospace] `cat_id` int(10) UNSIGNED NOT NULL,[/font]
  3. [FONT='Courier New', Courier, monospace] `seller_id` int(10) UNSIGNED NOT NULL,[/font]
  4. [FONT='Courier New', Courier, monospace] `price` decimal(10,2) NOT NULL,[/font]
  5. [FONT='Courier New', Courier, monospace] KEY `cat_id` (`cat_id`,`price`),[/font]
  6. [FONT='Courier New', Courier, monospace] KEY `cat_id_2` (`cat_id`,`seller_id`[/font]
  7. [FONT='Courier New', Courier, monospace])[/font]
  8. [FONT='Courier New', Courier, monospace] [/font]
  9. [FONT='Courier New', Courier, monospace]mysql> EXPLAIN SELECT * FROM goods WHERE cat_id=5 AND seller_id=1 ORDER BY price DESC LIMIT 10 \G[/font]
  10. [FONT='Courier New', Courier, monospace]*************************** 1. row ***************************[/font]
  11. [FONT='Courier New', Courier, monospace] id: 1[/font]
  12. [FONT='Courier New', Courier, monospace] select_type: SIMPLE[/font]
  13. [FONT='Courier New', Courier, monospace] TABLE: goods[/font]
  14. [FONT='Courier New', Courier, monospace] type: ref[/font]
  15. [FONT='Courier New', Courier, monospace]possible_keys: cat_id,cat_id_2[/font]
  16. [FONT='Courier New', Courier, monospace] KEY: cat_id_2[/font]
  17. [FONT='Courier New', Courier, monospace] key_len: 8[/font]
  18. [FONT='Courier New', Courier, monospace] ref: const,const[/font]
  19. [FONT='Courier New', Courier, monospace] rows: 296338[/font]
  20. [FONT='Courier New', Courier, monospace] Extra: USING WHERE; USING filesort[/font]
  21. [FONT='Courier New', Courier, monospace]1 row IN SET (0.00 sec)[/font]
  22. [FONT='Courier New', Courier, monospace] [/font]
  23. [FONT='Courier New', Courier, monospace]mysql> EXPLAIN SELECT * FROM goods force INDEX(cat_id) WHERE cat_id=5 AND seller_id=1 ORDER BY price DESC LIMIT 10 \G[/font]
  24. [FONT='Courier New', Courier, monospace]*************************** 1. row ***************************[/font]
  25. [FONT='Courier New', Courier, monospace] id: 1[/font]
  26. [FONT='Courier New', Courier, monospace] select_type: SIMPLE[/font]
  27. [FONT='Courier New', Courier, monospace] TABLE: goods[/font]
  28. [FONT='Courier New', Courier, monospace] type: ref[/font]
  29. [FONT='Courier New', Courier, monospace]possible_keys: cat_id[/font]
  30. [FONT='Courier New', Courier, monospace] KEY: cat_id[/font]
  31. [FONT='Courier New', Courier, monospace] key_len: 4[/font]
  32. [FONT='Courier New', Courier, monospace] ref: const[/font]
  33. [FONT='Courier New', Courier, monospace] rows: 989171[/font]
  34. [FONT='Courier New', Courier, monospace] Extra: USING WHERE[/font]
  35. [FONT='Courier New', Courier, monospace]1 row IN SET (0.00 sec) [/font]



As you can see if given no hint MySQL will prefer to use index on (cat_id,seller_id) and sort all result set by price. This will be good choice if seller_id is selective, if it is not as in this case MySQL needs to sort a lot of rows to display only few.
If we force index as in second query explain will look scary with estimated million of rows to analyze but we got rid of filesort so MySQL can stop as soon as 10 rows are sent. In this case with seller_id being not really selective it is likely it will need to scan less than 100 rows to generate result.
The speed difference between these two example queries is about 100 times so it may be quite serious.
To fix this issue MySQL would need to better take into account column selectivity together with LIMIT range. If there are only few values for given seller_id (as it well can be skewed) using filesort is better as otherwise very large portion of index may need to be scanned to find 10 matching rows, if there are a lot of values of given seller_id, so it is badly selective using index scan is much better idea.
Until MySQL is able to handle this you will have to use force index hint.
The other problem you may have however is calculating count of matching rows which may be even trickier to slow for complex searches which generate a lot of rows.
Another interesting technique is to use sphinx search to accelerate sorting and retrieval which I should explain in details some time in the future.


Using index for ORDER BY vs restricting number of rows. - Read More...
  
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

Similar Threads for: Using index for ORDER BY vs restricting number of rows.
Thread Thread Starter Forum Replies Last Post
Is there an Amazon Associates support number? trafficbroker Affiliate Marketing Q & A 1 01-10-2007 02:13 PM
Guess the Number Affiliate Blogs Programming Help 0 01-08-2007 03:27 PM
Lowest Number Affiliate Blogs Programming Help 0 11-14-2006 09:57 PM
The Highest Number Affiliate Blogs Programming Help 0 11-14-2006 09:57 PM
Relationship Between PageRank And The Number Of Backlinks? ValiantMarketer Goofing Around & Program Discussion 0 11-16-2004 05:51 PM



© 2004-6 RevenueSource.com.  All rights reserved.  Do not duplicate or redistribute in any form.
This website and its logos/design are property of RevenueSource.com.  All rights reserved. vBSEO 3.2.0 RC7


1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34