
Thread: Data Compression Tweets

  1. #1
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts

    Data Compression Tweets

    "On Undetected Redundancy in the Burrows-Wheeler Transform" including comparisons of different compressors bcm/zpaq/bzip2
    Software: tbwt

  2. The Following 5 Users Say Thank You to dnd For This Useful Post:

    algorithm (6th April 2018),encode (11th April 2018),hexagone (6th April 2018),pothos2 (9th April 2018),Stephan Busch (6th April 2018)

  3. #2
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    872
    Thanks
    457
    Thanked 175 Times in 85 Posts
    did you have success compiling those executables for windows?

  4. #3
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    437
    Thanks
    137
    Thanked 152 Times in 100 Posts
    I had issues with it finding a PRId32 definition in divsufsort.h, despite #include <inttypes.h>. I think this is the correct place to find this, but maybe it's a C vs C++ thing. (I'm a C programmer at heart.)

    I added a hideous workaround after the include of inttypes.h:

    Code:
    #ifndef PRId32
    #define PRId32 "d"
    #endif
    Ugly and *WRONG*, but it builds now on Linux.

    A check of enwik8:

    Code:
    bcm      20789671
    tbcm     20619312
    bwzip    24706581
    tbwzip   24641902
    wtzip    33235422
    twtzip   32618980
    Rather minimal on that data set.

    Edit: I can confirm their benchmark result on the nci file from the Silesia corpus:

    bcm:1227060 (.293 bits/sym), tbcm:1155451 (.275)

    Where it works it's a significant win.

  5. #4
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    I briefly tried with MinGW-w64, but it's not possible to compile it under Windows without further modifications beyond the PRId32 one.

  6. #5
    Member jibz's Avatar
    Join Date
    Jan 2015
    Location
    Denmark
    Posts
    114
    Thanks
    91
    Thanked 69 Times in 49 Posts
    To enable standard-compliant printf with MinGW, try defining __USE_MINGW_ANSI_STDIO to 1.

  7. #6
    Member
    Join Date
    Jan 2017
    Location
    Germany
    Posts
    48
    Thanks
    25
    Thanked 10 Times in 7 Posts
    Quote Originally Posted by JamesB View Post
    A check of enwik8:

    Code:
    bcm      20789671
    tbcm     20619312
    bwzip    24706581
    tbwzip   24641902
    wtzip    33235422
    twtzip   32618980
    Rather minimal on that data set.

    Edit: I can confirm their benchmark result on the nci file from the Silesia corpus:

    bcm:1227060 (.293 bits/sym), tbcm:1155451 (.275)

    Where it works it's a significant win.
    What about the speed of encoding and decoding? How much does the extension of the BWT method affect processing speed?

  8. #7
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    109
    Thanks
    7
    Thanked 80 Times in 25 Posts
    This appears to be very similar to M03 but, if I understand the paper correctly, this is an intermediate transform, whereas M03 encodes the BWT directly by exploiting the same observations during the encoding process rather than as a separate step.

    I believe that in the long run M03 will have the upper hand, however, since its encoding process is able to encode the information with knowledge of the actual context of each symbol.

    I don't see any mention of time or memory overhead for this process but I could simply be missing it as I'm trying to read the pdf on a phone.

    Anyhow, good job to Uwe!

  9. #8

  10. The Following User Says Thank You to dnd For This Useful Post:

    Gotty (9th May 2018)

  11. #9

  12. #10

  13. The Following User Says Thank You to dnd For This Useful Post:

    Jarek (10th June 2018)

  14. #11
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    Oh, I didn't know it would go online ... here are its slides: https://www.dropbox.com/s/axji416fo8cm4u6/sfi.pdf

  15. The Following 2 Users Say Thank You to Jarek For This Useful Post:

    Cyan (10th June 2018),schnaader (27th June 2018)

  16. #12
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    Compression method LZFSE

    Abstract:
    This thesis focuses on LZFSE compression method, which combines a dictionary compression scheme with a technique based on ANS (asymmetric numeral systems).
    It describes the principles on which the method works and analyses the reference implementation of LZFSE by Eric Bainville.

  17. #13
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    Data Compression Software Market Analysis

    According to Market Research Future, the global data compression software market is expected to reach USD ~864 billion by the end of 2023, with ~7% CAGR during the forecast period 2018-2023.

  18. #14

  19. #15
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    REPRESENTING IMAGES IN 200 BYTES: COMPRESSION VIA TRIANGULATION: https://arxiv.org/pdf/1809.02257.pdf

  20. The Following 3 Users Say Thank You to Jarek For This Useful Post:

    Cyan (13th September 2018),pothos2 (21st September 2018),sh0dan (28th April 2019)

  21. #16
    Member
    Join Date
    Nov 2011
    Location
    france
    Posts
    38
    Thanks
    2
    Thanked 26 Times in 18 Posts
    Quote Originally Posted by Jarek View Post
    REPRESENTING IMAGES IN 200 BYTES: COMPRESSION VIA TRIANGULATION: https://arxiv.org/pdf/1809.02257.pdf
    Will be presented in more detail at ICIP 2018 next month (https://2018.ieeeicip.org/Papers/Pub...Sessionid=1160).

  22. The Following 2 Users Say Thank You to skal For This Useful Post:

    Cyan (13th September 2018),Jyrki Alakuijala (13th September 2018)

  23. #17
    Member
    Join Date
    Mar 2016
    Location
    USA
    Posts
    47
    Thanks
    5
    Thanked 22 Times in 14 Posts
    I'm curious if this technique could be improved by allowing overlapping triangles (with alpha) a la http://home.iitk.ac.in/~aniketmt/Pro...gonization.pdf

  24. #18
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    Quote Originally Posted by skal View Post
    Will be presented in more detail at ICIP 2018 next month (https://2018.ieeeicip.org/Papers/Pub...Sessionid=1160).
    How does it compare with https://code.fb.com/android/the-tech...review-photos/ ?

  25. #19
    Member
    Join Date
    Nov 2011
    Location
    france
    Posts
    38
    Thanks
    2
    Thanked 26 Times in 18 Posts
    Quote Originally Posted by MegaByte View Post
    I'm curious if this technique could be improved by allowing overlapping triangles (with alpha) a la http://home.iitk.ac.in/~aniketmt/Pro...gonization.pdf
    You might be interested in the "Primitive" project: https://primitive.lol/ (github)
    (they have a funny twitter feed too).

  26. #20
    Member
    Join Date
    Nov 2011
    Location
    france
    Posts
    38
    Thanks
    2
    Thanked 26 Times in 18 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    We estimated that traditional block-based compression techniques start being competitive again around 400-500bytes, minus the up-scalability property of primitives. That's subjective, of course.

  27. #21
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    Quote Originally Posted by skal View Post
    We estimated that traditional block-based compression techniques start being competitive again around 400-500bytes, minus the up-scalability property of primitives. That's subjective, of course.
    With jpeg header or without? Did you apply smoothing like Facebook did?

  28. #22
    Member
    Join Date
    Nov 2011
    Location
    france
    Posts
    38
    Thanks
    2
    Thanked 26 Times in 18 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    With jpeg header or without? Did you apply smoothing like Facebook did?
    cf. paper, fig 5.

  29. #23
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    Quote Originally Posted by skal View Post
    cf. paper, fig 5.
    Thank you! Facebook's original effort stripped the header and used smoothing. With the header, JPEG is not going to be competitive. Also, you need to try with jpeg yuv444. Yuv420 does not work when images are upsampled a lot.

  30. #24
    Member
    Join Date
    Nov 2011
    Location
    france
    Posts
    38
    Thanks
    2
    Thanked 26 Times in 18 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    Also, you need to try with jpeg yuv444. Yuv420 does not work when images are upsampled a lot.
    That does not match my observations. Do you have an example?

  31. #25
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    Quote Originally Posted by skal View Post
    That does not match my observations. Do you have an example?
    On my monitor, and to my eyes, the reds in particular look dark and lifeless in 420, even more so if they are small and on a green background.

    420

    [Image: g-yuv420.jpg, 9.3 KB]

    444

    [Image: g-yuv444.jpg, 10.4 KB]

  32. #26
    Member
    Join Date
    Nov 2011
    Location
    france
    Posts
    38
    Thanks
    2
    Thanked 26 Times in 18 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    On my monitor, and to my eyes, the reds in particular look dark and lifeless in 420, even more so if they are small and on a green background.

    420

    [Image: g-yuv420.jpg, 9.3 KB]

    444

    [Image: g-yuv444.jpg, 10.4 KB]
    Bad RGB->YUV conversion. 'sjpeg -sharp ...' or 'cwebp -sharp_yuv ...' can handle these:


    [Image: g-yuv420-sharp.jpg, 6.1 KB (yuv-420-sharp)]

    Note also that the paper uses colormap and vertices. No plane down-sampling is involved.

  33. #27
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    Quote Originally Posted by skal View Post
    'cwebp -sharp_yuv ...'
    Yeah, that is a nice method that roughly halves the yuv420 error. I compared it against header-stripped jpeg yuv444. Even though I wanted sharp yuv to win (I invented that yuv sharpening method for webp lossy), jpeg yuv444 was the winner in my testing for small previews.

  34. #28
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    Very interesting paper submitted to ICLR2019: https://openreview.net/pdf?id=ryE98iR5tm https://github.com/bits-back/bits-back
    "PRACTICAL LOSSLESS COMPRESSION WITH LATENT VARIABLES USING BITS BACK CODING"

    Surprisingly, they require a LIFO entropy coder (changed to ANS from the AC used in 1997).
    The encoder goes backward and forward on the state:
    - first it decodes (!) the latent (hidden) variable, using an assumed probability distribution ("reversed entropy coding"),
    - then it encodes the value using the autoencoder, conditioned on the latent variable,
    - then it encodes the latent variable.

  35. The Following 2 Users Say Thank You to Jarek For This Useful Post:

    Cyan (8th October 2018),Shelwien (7th October 2018)

  36. #29

  37. The Following User Says Thank You to dnd For This Useful Post:

    algorithm (17th December 2018)

  38. #30
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    456
    Thanks
    46
    Thanked 164 Times in 118 Posts
    PROGRAM Data Compression Conference (DCC 2019)
