Results 1 to 17 of 17

Thread: new O(nlogn) sorting algorithm?

  1. #1
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts

    Talking new O(nlogn) sorting algorithm?

    I was reviewing sorting algorithms these days, and came up a new one which i have not seen on any paper before:

    1. split input data into S=logn blocks, each block size T= n/logn (expect the last one)
    2. create min-heap on each block1 .. blockS
    3. for each element in block0:
    if it is bigger, swap it with smallest element in heaps
    heapify the modified heap
    4. now block0 conains max n/logn elements of input
    5. recursively sort block0
    6. process block1..blockS in the same way as (3,4,5)

    i have write an implementation on https://github.com/richox/sorting_al...tiheap_sort.hh
    it's slower than the original heap sort, but a bit faster than dijkstra's smooth sort.

  2. #2
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    Well, I don't see any theoretical gain there.

    Dijkstra's smooth sort was always pretty slow. Even for partially sorted input it's still pretty slow. A more widely used adaptive sort is Timsort - it finds already sorted or reverse-sorted subsequences and then proceeds with merge sort that has special handling for unbalanced merges (i.e. where the parts to be merged differ substantially in size). Its disadvantage is that it's not in-place (uses additional memory proportional to input size), but on the other hand it's stable (doesn't change order of elements with equal sorting keys).

    Timsort is used in Python's and Java's standard libraries as a generic sort (for specialized sorts Java uses a dual-pivot quicksort).

  3. #3
    Member
    Join Date
    Mar 2015
    Location
    Germany
    Posts
    57
    Thanks
    34
    Thanked 38 Times in 15 Posts
    You might be interessted in quickheapsort: http://www.sciencedirect.com/science...04397501002882

    I wanted to write an implementation for some time but I haven't found the time by now.

  4. #4
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    Well, I don't see any theoretical gain there.

    Dijkstra's smooth sort was always pretty slow. Even for partially sorted input it's still pretty slow. A more widely used adaptive sort is Timsort - it finds already sorted or reverse-sorted subsequences and then proceeds with merge sort that has special handling for unbalanced merges (i.e. where the parts to be merged differ substantially in size). Its disadvantage is that it's not in-place (uses additional memory proportional to input size), but on the other hand it's stable (doesn't change order of elements with equal sorting keys).

    Timsort is used in Python's and Java's standard libraries as a generic sort (for specialized sorts Java uses a dual-pivot quicksort).
    there's a new merge sort variant called Block Sort which is in-place, stable and has O(nlogn) worst time complexity:
    https://en.wikipedia.org/wiki/Block_sort
    https://github.com/BonzaiThePenguin/WikiSort a very complicated implementation

  5. The Following User Says Thank You to RichSelian For This Useful Post:

    Bulat Ziganshin (20th March 2017)

  6. #5
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts
    Quote Originally Posted by Christoph Diegelmann View Post
    You might be interessted in quickheapsort: http://www.sciencedirect.com/science...04397501002882

    I wanted to write an implementation for some time but I haven't found the time by now.
    i took a quick look at the paper. it is a quick sort variant which use external heap sort for the smaller partition. it didn't solve quicksort's worst time problem. and i don't think it can beat the original quicksort because heaps are cache-unfriendly and always slow.

  7. #6
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    It would be interesting to try to count the number of O(n lg n) sorting algorithms that are possible. I'm not sure how to approach the problem. It would be a combinatorics problem if it's easy; it would be a complexity theory problem if it's not easy.

  8. #7
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    I'm not sure that the algorithm in the OP is actually O(n lg n). I'm not sure I understand it 100%, but it looks like it might be O(n lg2 n).

  9. #8
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts
    Quote Originally Posted by nburns View Post
    I'm not sure that the algorithm in the OP is actually O(n lg n). I'm not sure I understand it 100%, but it looks like it might be O(n lg2 n).
    here is my provement:


    T(1) = O(1)
    T(n) =
    O(n) {initialize logn heaps with each size=n/logn} +
    n/logn * (
    O(logn) {select min element in logn heaps} +
    O(1) {swap} +
    O(logn) {heapify the max heap}
    ) +
    logn * T(n/logn) {recursive sort}


    = O(n) + logn*T(n/logn)
    >= O(n) + 2*T(n/2)
    = O(nlogn)

  10. #9
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    I think you are right. I guess I was mistaken.
    Last edited by nburns; 23rd March 2017 at 06:39.

  11. #10
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    I can't understand what the algorithm is trying to achieve. Why have lgn heaps instead of 1 heap?

    The reason it's slower than heapsort might be because it's n*(lgn + lgn). Of course that doesn't matter asymptotically, but it matters in reality.

  12. #11
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    It could be good for already-sorted or reverse-sorted data if you add a couple of optimizations. The first is to only scan the heaps once if the entire block is smaller. The second is to stay on the same heap as long as it keeps yielding the smallest element.

    Another thing that is helpful in most sort algorithms is to stop recursing when the size falls below a threshold and then use insertion sort.

  13. #12
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts
    yes this algorithm sucks in practical use (like smoothsort or library sort), i just found it interesting to think and implement

  14. #13
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    In your analysis, it seems like you glossed over the recursion. Doesn't it have a logarithmic number of levels of recursion, and doesn't that add another log factor?
    Last edited by nburns; 24th March 2017 at 02:55. Reason: Fixed terminology

  15. #14
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts
    this algorithm splits the input array into logn disjoint partitions int O(n) time and recursively sort each partition. it's a little like quick sort, but does not have O(n^2) problem.

  16. #15
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    Doesn't it split the input array into logn disjoint partitions in O(nlogn) time, then recursively sort all of the partitions making the time O(n(logn)^2)?

  17. #16
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    That would not be a tight bound, because the recursion is really complicated.

  18. #17
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    156
    Thanks
    18
    Thanked 50 Times in 26 Posts
    Quote Originally Posted by nburns View Post
    Doesn't it split the input array into logn disjoint partitions in O(nlogn) time, then recursively sort all of the partitions making the time O(n(logn)^2)?
    make-heap is O(n), so split logn partitions which size is n/logn can be done in O(n) time?

Similar Threads

  1. Even (somewhat) faster Suffix Sorting
    By Christoph Diegelmann in forum Data Compression
    Replies: 1
    Last Post: 19th September 2016, 20:52
  2. Sorting Algorithms Benchmark
    By Piotr Tarsa in forum The Off-Topic Lounge
    Replies: 39
    Last Post: 1st June 2015, 09:38
  3. Archon sorting algorithm
    By just a worm in forum Data Compression
    Replies: 2
    Last Post: 21st August 2013, 13:50
  4. Segmenting after block-sorting
    By Piotr Tarsa in forum Data Compression
    Replies: 1
    Last Post: 2nd September 2011, 02:27
  5. Little question about Suffix Sorting techniques
    By Piotr Tarsa in forum Data Compression
    Replies: 0
    Last Post: 23rd May 2011, 20:01

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •