Results 1 to 8 of 8

Thread: Multi-way QuickSort

  1. #1
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts

    Multi-way QuickSort

    I'm currently developing a Dual-Pivot 5-way QuickSort (to be used in my QSufSort reimplementation).

    While sorting it will maintain following structure:

    lower than lower pivot | equal to lower pivot | between pivots | unsorted data | between pivots | equal to higher pivot | higher than higher pivot

    There are 7 sections, but as you see two sections have same meaning and one section is unsorted data, so those three should be counted as one. Finally, the structure will be:

    lower than lower pivot | equal to lower pivot | between pivots | equal to higher pivot | higher than higher pivot

    So there are 5 sections actually after finishing partitioning (so it's 5-way partitioning).


    In MSufSort sources, more precisely in changelog, I've seen a remark that Michael switched to 7-way QuickSort (AFAIK from 3-way one) and that gave him an immediate 20 % (or so) performance boost. How could he make 7-way in-place QuickSort? Or did he in fact made 5-way one?

  2. #2
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    3 pivots?

  3. #3
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Well, one could extend the idea to use as many pivots as he wants, but the real problem is to actually improve sorting performance. The more pivots you have, the higher is the code complexity and more amount of work is needed to place one value in correct partition. I'm almost certain that adding pivots will degrade performance at some point - ie. there is an optimal number of pivots, yielding best average real performance. So I'm curious if he did use three pivots, how he partition data and if he takes unsorted values from both sides of unsorted section or just from one side (usually beginning). Anyway, I continue experimenting.

  4. #4
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    109
    Thanks
    7
    Thanked 80 Times in 25 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    In MSufSort sources, more precisely in changelog, I've seen a remark that Michael switched to 7-way QuickSort (AFAIK from 3-way one) and that gave him an immediate 20 % (or so) performance boost. How could he make 7-way in-place QuickSort? Or did he in fact made 5-way one?
    The quicksort in MSufSort is a 7-way. That is, there are three pivots. After the sort the array looks like as follows (A = pivot 1, B = pivot 2, C = pivot 3):

    less than A | equal to A | greater than A, less than B | equal to B | greater than B, less than C | equal to C | greater than C

    Consider the traditional three way sort which has three parts: less than pivot | equal to pivot | greater than pivot.
    All I did was to divide the partially sorted 'less than' and 'greater than' into three partitions containing a less than, equal to and greater than parts. The result is seven partitions with three of them equal to one of three pivots and four of them as partially sorted partitions relative to the pivots. This is really no different than a recursive 3-way which would be something like:

    void ThreeWayPartition(int * pData, int szData, int & szLessThan, int & szEqualTo, int & szGreaterThan)
    {
    // do traditional three way partition with one pivot. return size of three partitions.
    }


    int myData[N];
    int szLessThan, szGreaterThan;
    int szPartition[7];

    ThreeWayPartition(myData, N, szLessThan, szPartition[3], szGreaterThan);
    ThreeWayPartition(myData, szLessThan, szPartition[0], szPartition[1], szPartition[2]);
    ThreeWayPartition(myData + szLessThan + szEqualTo, szGreaterThan, szPartition[4], szPartition[5], szPartition[6]);


    I believe that the 7 way feature in MSufSort can be compiled in or out with the macro 'USE_SEVEN_WAY_QUICKSORT' if you want to see what the effect on performance is.

    Hope this helps.

    - Michael

  5. #5
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    So you're just doing a 3-way Partition, and then doing another 3-way Partitions on first and last partitions. Then why there are any improvements? I think it means the same amount of work. Maybe there's something nontrivial in MSufSort that this method helps, but in general, eg. when sorting array of integers I don't see the reason to improve.

    I am now experimenting heavily with dual-pivot QuickSorts. Generally, the results so far tells that there's no substantial improvement (if any) from 5-Way QuickSort over 3-Way QuickSort. But some code is still buggy, so I need to sort out some things before final judgement.

    BTW:
    Your site is down. When will it function again?

  6. #6
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    109
    Thanks
    7
    Thanked 80 Times in 25 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    So you're just doing a 3-way Partition, and then doing another 3-way Partitions on first and last partitions. Then why there are any improvements? I think it means the same amount of work. Maybe there's something nontrivial in MSufSort that this method helps, but in general, eg. when sorting array of integers I don't see the reason to improve.
    Basically, I think it was cache improvements. The machine I did that work on was quite old. Modern machines still show that the feature is an improvement over 3-way but not by much.

    quick tests:
    enwik8 3-way:26.237 7-way:25.491
    w3c2 3-way: 22.050 7-way:21.466

    Also, this is multikey quicksort. The speed up might not materialize for a simple quicksort. I'm not sure.

    Quote Originally Posted by Piotr Tarsa View Post
    BTW:
    Your site is down. When will it function again?
    I switched hosts a short while back and have not had time to post the site contents there. Never enough time.

    - Michael

  7. #7
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Thanks for tests. It seems that your optimization is not really as good as I thought.

    From my tests, 3-way Lomuto partition is the fastest in general when it comes to sort array of ints. But when doing suffix sorting (or rotations sorting as I want) we compare not only indexes but also follow the indexes. 5-way Lomuto partition (I call it that, it's a variation that keeps stucture: lower than pivot1 | equal to pivot1 | between pivots | unsorted data | equal to pivot2 | higher than pivot2) lags only slightly in general (especially on Oracle Solaris Studio compiler suite; GCC has very non-determinant optimizations, sometime adding more computations to algorithm, that are even unrelated, speeds it up, and that's very illogical). But such 5-way partitioning greatly reduces number of memory accesses (at the cost of additional complexity), and that could be a win when it comes to suffix/ rotations sorting. As I'm making my own QSufSort variation, I'll test it up.
    Last edited by Piotr Tarsa; 3rd May 2011 at 02:42.

  8. #8
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    By the way:
    Dual pivot QuickSort turned out to be faster than old one in Java, when sorting arrays of primitives. In QSufSort things are different as there are arrays of array indexes, so more factors are important for speed, but that dual pivot Java QuickSort looks pretty well optimized.

    If anyone is interested, here is the link to discussion:
    http://permalink.gmane.org/gmane.com...ibs.devel/2628

Similar Threads

  1. pxz: Multi threaded xz compressor
    By polemon in forum Data Compression
    Replies: 0
    Last Post: 21st March 2011, 02:55
  2. Multi-threaded compression
    By Cyan in forum Data Compression
    Replies: 34
    Last Post: 16th January 2011, 18:32
  3. QArchiver - A new multi platform graphical Archiver
    By Simon Berger in forum Data Compression
    Replies: 25
    Last Post: 7th April 2010, 23:00
  4. Multi-Volume Archives
    By osmanturan in forum Data Compression
    Replies: 12
    Last Post: 13th June 2009, 01:46
  5. Multi-threading motivation
    By Trixter in forum Data Compression
    Replies: 1
    Last Post: 10th September 2008, 05:18

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •