Results 1 to 27 of 27

Thread: JPEG 3000 Anyone ?

  1. #1
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts

    JPEG 3000 Anyone ?

    https://jpeg.org/downloads/htj2k/wg1..._final_cfp.pdf

    Anyone interested in working on a faster entropy coder for JPEG 2000 ?
    Deadline for registration of interest is October 1.

  2. #2
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    232
    Thanks
    38
    Thanked 80 Times in 43 Posts
    26 pages...

    "On a given platform and with identical decoded image, the HTJ2K codestream should be on average no more than 15% larger than the corresponding JPEG 2000 Part 1 codestream."
    "Over a range of bitrates and on a given software platform, the throughput of the HTJ2K block decoder should be on average no less than 10 times greater than the JPEG 2000 Part 1 block decoder of the reference specified in Annex D. Increase of throughput of the HTJ2K block decoder is also desirable on hardware and GPU platforms."
    However, in section B.6: "Assembly language or GPU code shall not be included". I bet JPEG 4000 will ask for GPU source code

    And btw, "The submission shall include source code to serve as a verification model, written in a high-level language, such as C or C++" -- I think nowadays C is a relatively low-level language.

    This newsgroup is dedicated to image compression:
    http://linkedin.com/groups/Image-Compression-3363256

  3. #3
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by boxerab View Post
    https://jpeg.org/downloads/htj2k/wg1..._final_cfp.pdf Anyone interested in working on a faster entropy coder for JPEG 2000 ? Deadline for registration of interest is October 1.
    This call resulted from the outcome of the JPEG XS CfP evaluation. We also had a faster JPEG 2000 entropy coder as one candidate (FBCOT) for XS, of course coming from the UNSW (David Taubman). The proposal did not enter the JPEG XS standardization basically because it seemed likely that an FPGA implementation of it would not fit into the target architecture, so we split this part off. It seems that it will become Part 15 of JPEG 2000.

    What is currently on the table is a combination of MelCode (as in JPEG LS) with RLE coding; there is an SPIE paper by David Taubman on this. The source of the speedup is that the updated entropy coder no longer operates on bitplanes or sub-bitplanes, but on groups of bitplanes, which should hopefully create a decent speedup. I also played with this idea approximately five years ago, where I combined a "horizontal" (over coefficients) Huffman code with a "vertical" (over bitplanes) code to create a code that was almost as good as JPEG 2000 (though not scalable, same as Taubman's FBCOT proposal today) and quite a bit faster. There was unfortunately not enough momentum in the committee to continue with this back then, but now there is.
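    To make the "groups of bitplanes" idea concrete, here is a toy sketch of my own (not the actual MelCode/FBCOT scheme): the significance pattern of a group of four coefficients is emitted as one symbol, and the magnitude bits follow raw, so the coder never loops over individual bitplanes.

    ```cpp
    #include <cstdint>
    #include <vector>

    // Toy illustration only: instead of visiting each bitplane of each
    // coefficient separately, gather a group of 4 coefficients, emit their
    // significance pattern as one 4-bit symbol, then the magnitude bits raw.
    // One symbol per group replaces many per-bitplane context decisions.
    struct BitWriter {
        std::vector<uint8_t> bytes;
        uint32_t acc = 0; int nbits = 0;
        void put(uint32_t v, int n) {            // append the n low bits of v
            acc = (acc << n) | (v & ((1u << n) - 1));
            nbits += n;
            while (nbits >= 8) { nbits -= 8; bytes.push_back(uint8_t(acc >> nbits)); }
        }
    };

    void encode_group(const uint32_t c[4], int planes, BitWriter& bw) {
        uint32_t sig = 0;                        // significance pattern, 1 bit per coeff
        for (int i = 0; i < 4; ++i) sig |= (c[i] != 0) << i;
        bw.put(sig, 4);                          // one symbol for the whole group
        for (int i = 0; i < 4; ++i)
            if (c[i]) bw.put(c[i], planes);      // all bitplanes emitted in one go
    }
    ```

    A real coder would of course entropy-code the significance symbols and signal the number of active planes per group; the point here is only the per-group (rather than per-bitplane) control flow.
    
    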

    I can check whether I still find my paper on this, David will certainly provide his work - you'll find him at the UNSW pages.

    Concerning the speedup: At least according to David's estimate, FBCOT can reach approximately 10 times the EBCOT speed, but note that this is *only* the speed of the bitplane coder (tier 1 of JPEG 2000), not the end-to-end speedup. How much that improves depends then of course on the rest of the code. With a pure CPU implementation in C++, this gives an approximate 2 to 3 times end-to-end speedup according to my estimate, but much more if you parallelize the wavelet, quantizer and multi-component decorrelation transformation.

    What we need by October 1st is people raising hands "hey, I would like to contribute to this", then have an optimized implementation (presumably in assembly) available in April next year which will be run in a testbench, i.e. only the entropy coder is needed, not the full JPEG 2000 pipeline.

    The call currently only addresses CPU as architecture. If you want to play with FPGA or GPU and faster coding, then JPEG XS is the right card to play. But there we already have an architecture and a test model (albeit slow, at this moment, as we play with a lot of ideas). Again, there were plenty of papers on this at this year's SPIE (Application of Digital Image Processing XL).

  4. #4
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    667
    Thanks
    204
    Thanked 241 Times in 146 Posts
    Quote Originally Posted by thorfdbg View Post
    ... faster JPEG 2000 entropy coder ...
    PIK is an option where fast lossy photographic decoding is needed. In software, pik-to-rgb24 decodes at 220 MB/s on a single core and 1.1 GB/s on a multi-core CPU, and it gives really good compression densities for photographs (very likely better than the latest video codecs). Pik-to-rgb48 is ~35 % slower. PIK is probably about 5-10 % of the complexity of a modern video codec on key-frame decoding. It is based on relatively simple concepts such as entropy clustering, context modeling and psychovisual modeling that we have used in our earlier work with zopfli, WebP lossless, guetzli and brotli.

  5. #5
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Quote Originally Posted by Jyrki Alakuijala View Post
    PIK is an option where fast lossy photographic decoding is needed.
    Sounds interesting. Do you have any links to get more information on PIK ? For this particular use case, it has to be lossless, so PIK wouldn't work.

  6. #6
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Thanks for all of this background information, Thomas. Links to papers would be greatly appreciated! From what you've described so far, this sort of implementation may not work on a GPU.
    As GPUs have very limited on-chip memory, processing groups of bit planes could in fact be slower than processing one bit plane at a time, unless the entropy LUT is small and could be put in constant memory, and/or the flags buffer for the different code passes is small enough to fit in local memory. Good for CPU-based codecs, though.

    So, the 10x speedup requirement seems to refer to CPU-based implementations.

  7. #7
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Quote Originally Posted by Alexander Rhatushnyak View Post
    26 pages...

    "On a given platform and with identical decoded image, the HTJ2K codestream should be on average no more than 15% larger than the corresponding JPEG 2000 Part 1 codestream."
    "Over a range of bitrates and on a given software platform, the throughput of the HTJ2K block decoder should be on average no less than 10 times greater than the JPEG 2000 Part 1 block decoder of the reference specified in Annex D. Increase of throughput of the HTJ2K block decoder is also desirable on hardware and GPU platforms."
    However, in section B.6: "Assembly language or GPU code shall not be included". I bet JPEG 4000 will ask for GPU source code

    And btw, "The submission shall include source code to serve as a verification model, written in a high-level language, such as C or C++" -- I think nowadays C is a relatively low-level language.
    Yes, I agree, there should be more focus on GPUs. What is interesting is that 8K and VR image sizes are going to make CPU implementations obsolete; they just cannot keep up with GPUs. So this proposal would give CPU codecs a few more years of relevance for this use case.

  8. #8
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Thomas, can you comment on licensing vs. royalty-free for JPEG XS ? Always an issue for open source hackers like myself.

  9. #9
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by boxerab View Post
    Thanks for all of this background information, Thomas. Links to papers would be greatly appreciated!
    I'm attaching my SPIE 2012 paper which somehow got the ball rolling. For David's work, please check with him directly:

    https://www.engineering.unsw.edu.au/.../david-taubman

    The paper is SPIE copyright, though I can certainly provide my papers for others under fair use conditions. David can probably do the same.

    Quote Originally Posted by boxerab View Post
    From what you've described so far, this sort of implementation may not work on a GPU.
    That's certainly correct. I asked David the same question - they are currently not targeting GPUs. It looks like a bit of a loss to me, but it seems that the market for JPEG 2000 applications does not require GPUs at this moment. JPEG XS is quite a bit different: there we have a very strong focus on GPU and FPGA, and less so on CPU.


    Quote Originally Posted by boxerab View Post
    As GPUs have very limited on-chip memory, processing groups of bit planes could be in fact slower than processing one bit plane at a time. Unless the entropy LUT table is small and could
    be put in constant memory, and/or the flags buffer for the different code passes is small enough to fit in local memory. Good for CPU-based codecs, though.
    JPEG XS uses only unary codes for entropy coding (if you even want to call it that). This can be parallelized on a GPU.
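    For illustration, a plain unary code looks like this (a hypothetical sketch, not the actual JPEG XS bitstream layout): decoding reduces to counting leading one-bits, which is branch-light and data-parallel.

    ```cpp
    #include <cstdint>
    #include <vector>

    // Unary code: n becomes n one-bits followed by a terminating zero.
    // Decoding is just "count leading ones", a single instruction on most
    // CPUs and trivially parallel across independent code groups on a GPU.
    void unary_put(std::vector<int>& bits, uint32_t n) {
        for (uint32_t i = 0; i < n; ++i) bits.push_back(1);
        bits.push_back(0);
    }

    uint32_t unary_get(const std::vector<int>& bits, size_t& pos) {
        uint32_t n = 0;
        while (bits[pos++] == 1) ++n;
        return n;
    }
    ```
    
    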


    Quote Originally Posted by boxerab View Post
    So, the 10x speedup requirement seems to refer to CPU-based implementations.
    Yes.
    Attached Files

  10. #10
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by boxerab View Post
    Thomas, can you comment on licensing vs. royalty-free for JPEG XS ? Always an issue for open source hackers like myself.
    Well, two aspects. First, the ISO aspect: We as working group cannot make statements on royalties and licensing, we are technical experts, not legal experts. In particular, we are not allowed to select technology from licensing conditions.

    This being said, we can express "desires" what we would like to do, but we have no power to enforce it. As I say, "the right way, the wrong way, the ISO way".

    For High-Throughput JPEG 2000 (that is the official name), we have a desire to make this a royalty-free standard. For JPEG XS, there is no such desire. The market - professional broadcasting applications - does not have problems with licensing, so it seems very likely that this standard will include patented IP. The license costs in broadcasting are minor compared to the hardware costs and the savings you get from a mezzanine codec.

  11. #11
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Quote Originally Posted by thorfdbg View Post
    Well, two aspects. First, the ISO aspect: We as working group cannot make statements on royalties and licensing, we are technical experts, not legal experts. In particular, we are not allowed to select technology from licensing conditions.

    This being said, we can express "desires" what we would like to do, but we have no power to enforce it. As I say, "the right way, the wrong way, the ISO way".

    For High-Throughput JPEG 2000 (that is the official name), we have a desire to make this a royalty free standard. For JPEG XS, there is no such desire. The market - professional broadcasting applications - does not have problems with licensing, so it seems very likely that this standard includes IPs. The license costs in broadcasting are minor compared to the hardware costs and the savings you get from a mezzanine codec.
    Thanks. For JPEG XS I was hoping for an open solution, for example VC-2.

  12. #12
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Thanks for the link. I will try to get a glimpse of David's SPIE paper, and see how it might work on the GPU, but I fear the tables for multiple bit planes will be too large. It would be interesting
    to try out various entropy coders, for example the ANS variants.

  13. #13
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by boxerab View Post
    Thanks for the link. I will try to get a glimpse of David's SPIE paper, and see how it might work on the GPU, but I fear the tables for multiple bit planes will be too large.
    The problem is decoding in parallel - this requires approaches that are particularly tuned towards the GPU architecture. However, GPUs are not in focus of HT-J2K.
    Quote Originally Posted by boxerab View Post
    It would be interesting to try out various entropy coders, for example the ANS variants.
    You are certainly more than welcome to contribute, though given the current understanding of complexity, ANS might be too much. We are talking about coding multiple bitplanes in less than a handful (probably 3-5) of CPU cycles. There isn't really much room: combine as many coefficients as possible and encode them in one single go. There isn't much more you can do per group than a table lookup or a single arithmetic operation plus a buffer write.
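    As a rough illustration of that budget (my own sketch, not any proposed codec), a group decode can be little more than reading a small symbol, one table lookup, and one buffer write per coefficient:

    ```cpp
    #include <cstdint>

    // Hypothetical shape of a "few cycles per group" decode step: a 4-bit
    // significance symbol tells which of 4 coefficients are nonzero, a table
    // lookup gives the count, and each coefficient costs one read plus one
    // write. No per-bit branching, no adaptive context updates.
    static const uint8_t POPCNT4[16] = {0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4};

    struct BitReader {
        const uint8_t* p; uint32_t acc = 0; int nbits = 0;
        explicit BitReader(const uint8_t* d) : p(d) {}
        uint32_t get(int n) {                    // read the next n bits
            while (nbits < n) { acc = (acc << 8) | *p++; nbits += 8; }
            nbits -= n;
            return (acc >> nbits) & ((1u << n) - 1);
        }
    };

    // Decodes one group of 4 coefficients; returns how many were significant.
    int decode_group(BitReader& br, int planes, uint32_t out[4]) {
        uint32_t sig = br.get(4);                // one small symbol per group
        for (int i = 0; i < 4; ++i)
            out[i] = ((sig >> i) & 1) ? br.get(planes) : 0;  // read + write each
        return POPCNT4[sig];
    }
    ```
    
    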

  14. #14
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Quote Originally Posted by thorfdbg View Post
    ... We are talking about coding multiple bitplanes in less than a handful (probably 3-5) CPU cycles. There isn't really much you can do. Combine as many coefficients as possible, encode them in one single go. There isn't much more you can do than a table lookup or a single arithmetic operation plus a buffer write.
    Thanks. How many KB were the tables that you used in your implementation?

  15. #15
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    437
    Thanks
    137
    Thanked 152 Times in 100 Posts
    Quote Originally Posted by thorfdbg View Post
    The problem is decoding in parallel - this requires approaches that are particularly tuned towards the GPU architecture. However, GPUs are not in focus of HT-J2K. You are certainly more than welcome to contribute, though given the current understanding of complexity, ANS might be too much. We are talking about coding multiple bitplanes in less than a handful (probably 3-5) CPU cycles. There isn't really much you can do. Combine as many coefficients as possible, encode them in one single go. There isn't much more you can do than a table lookup or a single arithmetic operation plus a buffer write.
    That depends on how much SIMD you're willing to accept. With SIMD you can get huge speed-ups if the algorithm permits it. With SIMD rANS decoding I think I was hitting a max of 1.4 Gb/sec (on a 3.2 GHz machine), and maybe 800-1000 Mb/s for order-1 depending on data complexity. That was just entropy coding though, with no other data manipulation. Encoding was probably half that speed. Hardening the software would add a bit more overhead - since then I spotted a few issues in the frequency table handling.
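    For readers unfamiliar with rANS, here is a minimal single-state round trip (a toy with a fixed 3-symbol table, not the SIMD-interleaved variant): the per-symbol step is a multiply, a lookup and a renormalization, which is why several independent states interleave well under SIMD.

    ```cpp
    #include <cstdint>
    #include <vector>

    // Minimal byte-renormalized rANS, in the style of the well-known ryg_rans
    // reference code. The toy alphabet {0,1,2} has frequencies 8/6/2 out of 16.
    static const uint32_t SCALE_BITS = 4;            // total frequency = 16
    static const uint32_t FREQ[3] = {8, 6, 2};
    static const uint32_t CUM[3]  = {0, 8, 14};
    static const uint32_t RANS_L  = 1u << 16;        // renormalization bound

    std::vector<uint8_t> rans_encode(const std::vector<int>& syms) {
        std::vector<uint8_t> out;
        uint32_t x = RANS_L;
        for (size_t i = syms.size(); i-- > 0; ) {    // encode in reverse order
            uint32_t f = FREQ[syms[i]];
            while (x >= ((RANS_L >> SCALE_BITS) << 8) * f) {
                out.push_back(x & 0xFF); x >>= 8;    // renormalize: shed a byte
            }
            x = ((x / f) << SCALE_BITS) + (x % f) + CUM[syms[i]];
        }
        for (int i = 0; i < 4; ++i) { out.push_back(x & 0xFF); x >>= 8; }  // flush
        return out;                                  // decoder reads from the end
    }

    std::vector<int> rans_decode(const std::vector<uint8_t>& in, size_t n) {
        size_t pos = in.size();
        uint32_t x = 0;
        for (int i = 0; i < 4; ++i) x = (x << 8) | in[--pos];  // restore state
        std::vector<int> syms(n);
        for (size_t i = 0; i < n; ++i) {
            uint32_t slot = x & ((1u << SCALE_BITS) - 1);
            int s = slot < 8 ? 0 : (slot < 14 ? 1 : 2);        // table lookup in real code
            syms[i] = s;
            x = FREQ[s] * (x >> SCALE_BITS) + slot - CUM[s];
            while (x < RANS_L && pos > 0) x = (x << 8) | in[--pos];  // refill
        }
        return syms;
    }
    ```

    A SIMD decoder runs, say, 8 or 16 such states side by side on interleaved byte streams, since each state's update is independent of the others.
    
    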

  16. #16
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by JamesB View Post
    That depends on how much SIMD you're willing to accept.
    "Willing to accept" is funny. I am (or rather, David is) willing to use as much SIMD as possible. (-: The question is "how much is possible". It is clearly necessary to use SIMD instructions to reach the envisioned speed-up. Everything else is rather simple to parallelize.

  17. #17
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Quote Originally Posted by thorfdbg View Post
    ... they are currently not targeting GPUs. Looks to me a bit of a loss, but it seems that the market for JPEG 2000 applications does not require GPUs at this moment.
    Vanilla JPEG 2000 already has a fast mode : RESTART + RESET + BYPASS, where each code pass is encoded independently, and non-skewed lower bit planes are encoded raw.

    Decoding such images would work really well on a GPU, as the on-chip memory requirements would be drastically reduced, so kernel occupancy would be very good.

  18. #18
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by boxerab View Post
    Vanilla JPEG 2000 already has a fast mode : RESTART + RESET + BYPASS, where each code pass is encoded independently, and non-skewed lower bit planes are encoded raw.
    Not really. You forget that even in the bypass mode, the cleanup pass is MQ-coded and context-dependent, so you cannot decode bitplanes independently of each other. What you really *need* to do is to code (or decode) as many bitplanes as possible in a single go, i.e. decode multiple bits with a single operation. A binary decoder (as demonstrated in the EBCOT) will necessarily be slower.

  19. #19
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Quote Originally Posted by thorfdbg View Post
    Not really. You forget that even in the bypass mode, the cleanup phase is MQ-coded, and context-dependent, so you cannot decode bitplanes independent from each other.
    Well, not to belabor the point, but how about RESTART + RESET + BYPASS + TERMALL? There you have each pass as a separate MQ segment, with no dependence on other passes.
    I am not sure how much coding efficiency suffers, though.

  20. #20
    Member
    Join Date
    Apr 2018
    Location
    Germany
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    So, as I understand it, FBCOT from David Taubman will be taken as the final solution for HTJ2K.
    Then what are the sources of information to implement it - David Taubman's patent application?
    Is there reference source code available anywhere? OpenJPEG and JasPer have nothing about it yet.

  21. #21
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    I don't think the new standard has been finalized yet. But presumably there will be a reference implementation from somebody.
    Also, I believe the committee wishes for a royalty-free standard, as with the existing standard, so a royalty-free license would be granted
    for any relevant IP.

  22. #22
    Member
    Join Date
    Apr 2018
    Location
    Germany
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by boxerab View Post
    I don't think the new standard has been finalized yet. But presumably, there will be a reference implementation from somebody.
    Also, I believe the committee wishes for a royalty-free standard, as with existing standard, so royalty-free license would be granted
    for any relevant IP.
    As I understand it, EBCOT is also patented by the same person, but original J2K is royalty-free; that is why I hope it will be the same in the case of FBCOT.

  23. #23
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    The link at the top of this thread, in section 1.6, mentions that the goal of HTJ2K is to make it royalty-free.
    Thomas @thorfdbg would have the latest info on this.

  24. #24
    Member
    Join Date
    Apr 2018
    Location
    Germany
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by boxerab View Post
    The link at the top of this thread, in section 1.6, mentions that the goal of HTJ2K is to make it royalty-free.
    Thomas @thorfdbg would have the latest info on this.
    But royalty-free does not mean patent-free. Someone may hold a patent but not charge money for using it? Or am I wrong?

  25. #25
    Member
    Join Date
    May 2014
    Location
    Canada
    Posts
    136
    Thanks
    61
    Thanked 21 Times in 12 Posts
    Image compression is a heavily patented field. Royalty-free means patent holders agree not to sue you if you use their algorithm, and they won't charge you any license fees.

  26. #26
    Member
    Join Date
    Apr 2018
    Location
    Germany
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by boxerab View Post
    Image compression is a heavily patented field. Royalty-free means patent holders agree not to sue you if you use their algorithm, and they won't charge you any license fees.
    Ok, thank you for the clarification.
    Do you know the current status of this standard? Were other algorithms than FBCOT submitted?
    Do you think the final standard will deviate significantly from FBCOT?

  27. #27
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by manifold View Post
    Ok, thank you for the clarification. Do you know the current status of this standard? Were other algorithms than FBCOT submitted? Do you think the final standard will deviate significantly from FBCOT?
    As far as I know, only two submissions have been made, both from David Taubman. Other than that, I did not follow the HTJ2K activity much, being occupied by XS at the moment. But I will know more next week, as this will be our next JPEG meeting.

