
Thread: libbsc vs pigz

  1. #1 by Vanns (Member, India)

    libbsc vs pigz

    In compression,

    one job running at a time:

    libbsc (GPU)

    real 0m6.387s
    user 1m1.776s
    sys 0m1.405s

    libbsc (multi-threaded, CPU)

    real 0m18.589s
    user 3m18.894s
    sys 0m1.049s

    pigz

    real 0m9.095s
    user 1m25.029s
    sys 0m0.640s

    two jobs running concurrently:

    libbsc (GPU)

    real 0m11.537s
    user 1m1.961s
    sys 0m1.483s

    libbsc (multi-threaded, CPU)

    real 0m35.325s
    user 3m20.601s
    sys 0m1.025s

    pigz

    real 0m14.286s
    user 1m24.494s
    sys 0m0.531s


    In decompression,

    libbsc (single-threaded)

    real 0m44.129s
    user 0m43.660s
    sys 0m0.464s

    libbsc (multi-threaded, OpenMP)

    real 0m10.170s
    user 1m0.690s
    sys 0m0.459s

    libbsc (GPU)

    real 0m5.741s
    user 0m12.318s
    sys 0m0.474s

    pigz (multi-threaded)

    real 0m4.724s
    user 0m6.507s
    sys 0m0.023s

    The test file is enwik9 (1 GB).
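
    For reference, here is a minimal sketch (an editorial illustration, not the harness used for the timings above) of how a single-threaded Deflate baseline can be timed with zlib's one-shot compress2(). The file path, compression level, and read-the-whole-file approach are assumptions; note compress2() emits the zlib wrapper rather than gzip headers, but the Deflate work being timed is the same.

    /* Time one single-threaded Deflate pass over a file such as enwik9. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <zlib.h>

    int main(void)
    {
        FILE *f = fopen("enwik9", "rb");             /* placeholder path */
        if (!f) { perror("enwik9"); return 1; }
        fseek(f, 0, SEEK_END);
        long n = ftell(f);
        rewind(f);
        unsigned char *in = malloc(n);
        if (fread(in, 1, n, f) != (size_t)n) { fclose(f); return 1; }
        fclose(f);

        uLongf out_len = compressBound(n);           /* worst-case output size */
        unsigned char *out = malloc(out_len);

        clock_t t0 = clock();
        int rc = compress2(out, &out_len, in, n, 6); /* 6 = gzip's default level */
        clock_t t1 = clock();
        if (rc != Z_OK) { fprintf(stderr, "compress2 failed: %d\n", rc); return 1; }

        printf("in %ld bytes, out %lu bytes, cpu %.3f s\n",
               n, (unsigned long)out_len, (double)(t1 - t0) / CLOCKS_PER_SEC);
        free(in); free(out);
        return 0;
    }

    Build with something like cc -O2 baseline.c -lz.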

    Even though libbsc and pigz use different algorithms, I still need to compare their running times. libbsc runs on the GPU, and compared with its own single-threaded and multi-threaded CPU versions it is faster. But I'm confused about choosing between libbsc (GPU) and pigz.

    pigz runs on the CPU: an Intel® Core™ i7-5600U @ 2.60 GHz × 4.

    libbsc runs on the GPU: a Quadro M6000 (PCIe/SSE2).

    I expected libbsc to run faster than pigz, since the GPU has many more cores than the CPU, even though the algorithms differ. But here the opposite happens, at least in decompression:

    the GPU implementation of one algorithm runs slower than the CPU implementation of another.

    The reason could be the algorithms themselves or their implementations.

    Now which one should I choose?

    libbsc or pigz?

    Are there other GPU-based compressors like libbsc that could beat pigz? pigz is a parallel, multi-threaded implementation of gzip. Are there GPU implementations of gzip? There are papers on the topic, but are the implementations actually available? Could a GPU implementation of pigz run within 4.724 s?

    I need help resolving this. Relevant answers would be much appreciated.

    Thanks in advance.
    Last edited by Vanns; 29th August 2016 at 11:42.
    Last edited by Vanns; 29th August 2016 at 11:42.

  2. #2 by Jyrki Alakuijala (Member, Switzerland)
    Quote Originally Posted by Vanns View Post
    I need help resolving this. Relevant answers would be much appreciated.
    If in doubt, go with the more conventional solution, i.e., pigz: fewer surprises from graphics-card driver updates causing differences in compression results.

  3. #3 by Vanns (Member, India)

    libbsc vs pigz

    Quote Originally Posted by Jyrki Alakuijala View Post
    If in doubt, go with the more conventional solution, i.e., pigz: fewer surprises from graphics-card driver updates causing differences in compression results.
    Let me clarify my question.

    Graphics-driver issues are not a concern for me; I will use the graphics drivers anyway.

    My point is that pigz (CPU) is outperforming libbsc (GPU). Alternatively, is there any gzip implementation available for the GPU?

  4. #4 by Bulat Ziganshin (Programmer, Uzbekistan)
    libbsc is the only practical compressor that uses the GPU for part of its work. Overall, bsc compresses text files better than gzip and performs more work to achieve that, so it's slower.

  5. #5 by Vanns (Member, India)

    libbsc vs pigz

    Quote Originally Posted by Bulat Ziganshin View Post
    libbsc is the only practical compressor that uses the GPU for part of its work. Overall, bsc compresses text files better than gzip and performs more work to achieve that, so it's slower.
    Yes, libbsc's compression ratio is better, but in terms of time pigz leads. Speed is my main requirement, and it should run on the GPU. Here pigz leads, but it runs only on the CPU. Is there any implementation that makes it work on the GPU, i.e., a GPU implementation of gzip?

  6. #6 by Bulat Ziganshin (Programmer, Uzbekistan)
    no

  7. #7 by Vanns (Member, India)
    Quote Originally Posted by Bulat Ziganshin View Post
    no
    Is there any other library like libbsc that runs on the GPU?

  8. #8 by Bulat Ziganshin (Programmer, Uzbekistan)
    Quote Originally Posted by Bulat Ziganshin View Post
    libbsc is the only practical compressor that uses the GPU for part of its work

  9. #9 by Vanns (Member, India)
    Does libbsc decompression also run on the GPU?

  10. #10 by Bulat Ziganshin (Programmer, Uzbekistan)
    no

  11. #11 by JamesB (Member, Cambridge, UK)
    There are FPGA implementations of Deflate, so perhaps that's a better avenue than buying GPUs if you're after speed.

    However, we're also comparing apples with oranges. BSC is far smaller than gzip, too.

  12. #12 by Vanns (Member, India)

    Deflate on GPU

    Quote Originally Posted by JamesB View Post
    There are FPGA implementations of Deflate, so perhaps that's a better avenue than buying GPUs if you're after speed.

    However, we're also comparing apples with oranges. BSC is far smaller than gzip, too.
    Yes, of course they are totally different, but anyway I have to choose one of them.

    Smaller in what sense? Compressed file size?

    Would it be feasible to implement Deflate on a GPU rather than an FPGA?

    I know there is a paper related to it, but I have no idea about its implementation. Can you help with a link to FPGA implementations of Deflate?


  14. #14 by Shelwien (Administrator, Kharkov, Ukraine)
    Well, I actually tried to compile my deflate encoder for an Intel GPU, using IntelC extensions.
    There were problems with code size and register allocation - I had to put noinline on everything and use size optimization, and in the end it only loaded on a newer GPU version which I don't have.
    But I don't see why it wouldn't work on, say, a full-size Radeon.
    Just, what would be the use case?
    Loading a huge file and compressing it with deflate -9?

  15. #15 by JamesB (Member, Cambridge, UK)
    The Wikipedia article has a range of links to hardware implementations of Deflate: https://en.wikipedia.org/wiki/DEFLATE

    Also, I recently came across IBM's patches to some code I work with (Samtools), which has an interface to their hardware zlib: https://github.com/ibm-genwqe/genwqe-user

    Google finds a bunch of hits about Deflate and GPUs, but I haven't checked the state of affairs. Also try searching for RFC 1951 and 1952, in case "deflate" or "gzip" doesn't find the right hits. I don't see why it's not feasible to implement, although it may be hard to get the higher-end compression out of it. Basically you want a parallel way to find all the text matches, as this is the slow part of gzip compression. A sketch of the simpler block-level parallelism follows below.

    Edit: In that regard, using a GPU suffix array or even the ST-8 construction used by BSC may work as an initial step of a deflate-on-GPU algorithm. (No one said you *have* to process the data as a stream.) I've no idea how much time this would save, though.
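
    To make the parallel-blocks idea concrete, below is a toy C sketch (an editorial illustration, not pigz's actual code) of pigz-style chunked Deflate. The chunk size and one-thread-per-chunk layout are assumptions. Each chunk is deflated as its own gzip member, and since RFC 1952 allows a gzip file to contain multiple members back to back, the concatenated output is still decodable with plain gzip -d. Real pigz goes further: by default it primes each block's dictionary with the previous 32 KB of input, so the ratio barely suffers.

    /* Toy pigz-style parallel Deflate: split the input into chunks, deflate
       each chunk in its own thread as an independent gzip member, then
       write the members out in order. Error handling is kept minimal. */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <zlib.h>

    #define CHUNK (16 * 1024 * 1024)     /* 16 MiB per chunk: an assumption */

    typedef struct {
        const unsigned char *in;         /* this thread's slice of the input */
        size_t in_len;
        unsigned char *out;              /* one complete gzip member */
        size_t out_len;
    } job_t;

    static void *deflate_chunk(void *arg)
    {
        job_t *j = arg;
        z_stream zs;
        memset(&zs, 0, sizeof zs);
        /* windowBits = 15 + 16 asks zlib for a gzip wrapper around the chunk */
        deflateInit2(&zs, Z_DEFAULT_COMPRESSION, Z_DEFLATED, 15 + 16, 8,
                     Z_DEFAULT_STRATEGY);
        size_t cap = deflateBound(&zs, j->in_len);
        j->out = malloc(cap);
        zs.next_in   = (unsigned char *)j->in;
        zs.avail_in  = (uInt)j->in_len;
        zs.next_out  = j->out;
        zs.avail_out = (uInt)cap;
        deflate(&zs, Z_FINISH);          /* whole chunk fits: one call suffices */
        j->out_len = cap - zs.avail_out;
        deflateEnd(&zs);
        return NULL;
    }

    int main(int argc, char **argv)
    {
        if (argc != 3) { fprintf(stderr, "usage: %s infile out.gz\n", argv[0]); return 1; }
        FILE *f = fopen(argv[1], "rb");
        if (!f) { perror(argv[1]); return 1; }
        fseek(f, 0, SEEK_END);
        size_t n = (size_t)ftell(f);
        rewind(f);
        unsigned char *buf = malloc(n);
        if (fread(buf, 1, n, f) != n) { fclose(f); return 1; }
        fclose(f);

        size_t nchunks = (n + CHUNK - 1) / CHUNK;
        job_t *jobs = calloc(nchunks, sizeof *jobs);
        pthread_t *tid = malloc(nchunks * sizeof *tid);

        for (size_t i = 0; i < nchunks; i++) {      /* one thread per chunk */
            jobs[i].in = buf + i * CHUNK;
            jobs[i].in_len = (i + 1 < nchunks) ? CHUNK : n - i * CHUNK;
            pthread_create(&tid[i], NULL, deflate_chunk, &jobs[i]);
        }

        FILE *out = fopen(argv[2], "wb");
        for (size_t i = 0; i < nchunks; i++) {      /* join and write in order */
            pthread_join(tid[i], NULL);
            fwrite(jobs[i].out, 1, jobs[i].out_len, out);
            free(jobs[i].out);
        }
        fclose(out);
        free(buf); free(jobs); free(tid);
        return 0;
    }

    Build with something like cc -O2 -pthread mini_pigz.c -lz. For a 1 GB input this launches 64 threads at once; a real implementation would use a fixed worker pool.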

  16. #16 by Vanns (Member, India)
    Quote Originally Posted by Shelwien View Post
    Well, I actually tried to compile my deflate encoder for an Intel GPU, using IntelC extensions.
    There were problems with code size and register allocation - I had to put noinline on everything and use size optimization, and in the end it only loaded on a newer GPU version which I don't have.
    But I don't see why it wouldn't work on, say, a full-size Radeon.
    Just, what would be the use case?
    Loading a huge file and compressing it with deflate -9?
    My use case is loading a ~1 GB file and compressing it with Deflate.

    A parallel, multi-threaded version of Deflate is already available as pigz.

    How about a GPU implementation of it?

  17. #17 by Vanns (Member, India)
    Quote Originally Posted by JamesB View Post
    The Wikipedia article has a range of links to hardware implementations of Deflate: https://en.wikipedia.org/wiki/DEFLATE

    Also, I recently came across IBM's patches to some code I work with (Samtools), which has an interface to their hardware zlib: https://github.com/ibm-genwqe/genwqe-user

    Google finds a bunch of hits about Deflate and GPUs, but I haven't checked the state of affairs. Also try searching for RFC 1951 and 1952, in case "deflate" or "gzip" doesn't find the right hits. I don't see why it's not feasible to implement, although it may be hard to get the higher-end compression out of it. Basically you want a parallel way to find all the text matches, as this is the slow part of gzip compression.

    Edit: In that regard, using a GPU suffix array or even the ST-8 construction used by BSC may work as an initial step of a deflate-on-GPU algorithm. (No one said you *have* to process the data as a stream.) I've no idea how much time this would save, though.
    Okay. So it is not feasible to implement gzip on a GPU, only in parallel on the CPU? A parallel CPU implementation of gzip already exists (pigz).

  18. #18 by Vanns (Member, India)
    I know and am familiar with Intel's Quick Sync technology; ffmpeg uses it on the image and video processing side.

    Similarly, Intel has QuickAssist technology for compression (https://en.wikipedia.org/wiki/DEFLATE). Which products use that technology (the way ffmpeg uses Quick Sync)?
