Page 2 of 4
Results 31 to 60 of 93

Thread: Open source Kraken/Mermaid/Selkie/LZNA/BitKnit decompression

  1. #31
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    68
    Thanks
    31
    Thanked 22 Times in 15 Posts
    The repository has been taken down: https://github.com/github/dmca/blob/...08-30-Oodle.md
    I don't think it was illegal, since it is not covered by patents (in my opinion it can't be, because it is not something new, just engineering), no actual code has been stolen, and the copy of Oodle was obtained legally.
    Maybe GitHub is just scared. Is there any mirror?

  2. #32
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    > No, it is restricted under a specific closed license. The user obtained an illegal copy of our product
    > and then directly decompiled it, and slapped a GPL license on his copy and made it available. There
    > is no public version of our product available, and the author admits he used an illegal copy of product.

    1) I did not obtain an illegal copy. One of the games I have (not pirated) contains the oo2core DLL file.

    2) Per European copyright laws, one is allowed to reverse engineer software that one legally has access to.
    More info here: http://www.bloomberg.com/news/articl...ineering-court
    “There is no copyright infringement” when a software company without access to a program’s source code “studied, observed and tested that program in order to reproduce its functionality in a second program,” the court said in a statement today.

    3) It was not "directly decompiled and made available". No code was copied from a decompiler to the GPL repository. Decompiler output is usually a mess and almost unreadable. Instead the code was carefully analyzed and C++ code was hand written to behave similarly. Just look at all the naming and structs and classes. None of that structure is available to a decompiler, it's all hand engineered.

    > No, the author has admitted in a public format that this is directly copied code:
    4) I have not admitted that this is copied code, because it is not.

  3. #33

  4. The Following User Says Thank You to powzix For This Useful Post:

    algorithm (1st September 2016)

  5. #34
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    437
    Thanks
    137
    Thanked 152 Times in 100 Posts
    I'm impressed with the analysis & reconstruction, and while it's legal within the EU, I can sort of understand why RAD tools asked for it to be removed and perhaps it is not legal in the location where the github servers reside. (RAD themselves are unlikely to have broken any laws by requesting it, even if it is incorrect in some points, so fair is fair in love and war! I half expected this too.)

    As an aside, I'm curious whether they have two versions of their library; one for reading and one for writing. Game designers may only have a need for decompression at customer side, with compression at the company end (for either client-server communication or just a one-off as part of distribution creation). Clearly some games will want compression at both ends, but not having the compressor dll around for most games would limit the opportunity for reverse engineering of that too.

    Of course the ideal response here is to improve on tools like ZSTD (which is already pretty awesome) and come up with a genuine open source implementation that uses some of the published ideas in Charles' and Ryg's blogs. They've both been remarkably open. I've been meaning to do a fully adaptive rANS codec rather than my static frequency ones for some time, but simply haven't found the time and it's not something I can justify doing in work time. That coupled to a decent LZ match finder that decodes multiple streams at once to avoid too many memory waits gets you most of the way there I think.

  6. #35
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    The same library contains code both for compression and decompression. However, I'm more interested in the decompressor / data format which is why I only looked at that. The compressor is substantially larger and it would take a lot of time to understand that whole code base and I don't care too much about it. Once you know the data format it's relatively easy to write a compressor yourself (although hard to make it optimal).

  7. #36
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Btw, did you find a way to use the dll with file i/o? Or is it memory-to-memory only?
    For example, there're some obscure network functions which maybe could be used for that.

  8. #37
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    I did not see anything related to File IO. Also didn't have a look at the networking stuff...

  9. #38
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    I'm not talking about file i/o in the library itself, but using the dll to (de)compress files to files.
    There has to be some method for returning control from the processing function to the caller, and then resuming.
    Or maybe a callback which can replace the buffers.
    Network functions certainly should be able to do something like that, but it seemed to me that they
    work directly with tcp/udp packets, instead of plain chunks of data.

    Also (I tried mailing you about this, but it probably went to spam), do you know how to use kraken decoder in MT mode?

  10. #39
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    In my code, Kraken_Decompress decompresses a block at a time. You can change the code a little there, and it should be possible to do file I/O (i.e. you don't need to keep the whole source buffer in RAM). However, since match distances are more or less unbounded, I think it's inevitable that the whole destination buffer is kept in RAM.
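
    To make the block-at-a-time point concrete, here is a minimal Python sketch. It is not the actual Kraken code or bitstream: the container layout (a 4-byte little-endian size in front of each block), the toy op format inside decode_block, and the helper names are all assumptions made up for illustration. It only shows that the source can be read from a file one block at a time, while the whole destination buffer stays in RAM because a match may reach back into any earlier block.

```python
import io
import struct


def decode_block(payload: bytes, dst: bytearray) -> None:
    """Decode one block of a made-up LZ format, appending the output to dst.

    Hypothetical op layout: 1 byte literal count, the literals, then a
    2-byte little-endian match offset and a 1-byte match length.
    """
    pos = 0
    while pos < len(payload):
        nlit = payload[pos]
        pos += 1
        dst += payload[pos:pos + nlit]
        pos += nlit
        offset, mlen = struct.unpack_from("<HB", payload, pos)
        pos += 3
        if mlen and not 0 < offset <= len(dst):
            raise ValueError("match offset outside decoded data")
        for _ in range(mlen):
            # Byte-by-byte copy so that overlapping matches work.
            dst.append(dst[-offset])


def decompress_stream(src) -> bytes:
    """Read size-prefixed blocks from a file-like object, one at a time."""
    dst = bytearray()  # the whole output stays in RAM: matches may point anywhere
    while True:
        header = src.read(4)
        if not header:
            break
        if len(header) != 4:
            raise ValueError("truncated block header")
        (size,) = struct.unpack("<I", header)
        payload = src.read(size)  # only this one source block is held in memory
        if len(payload) != size:
            raise ValueError("truncated block")
        decode_block(payload, dst)
    return bytes(dst)


def pack_block(payload: bytes) -> bytes:
    """Prefix a payload with its size (the assumed container layout)."""
    return struct.pack("<I", len(payload)) + payload


# Usage: the second block's match reaches back into data from the first block.
block1 = bytes([3]) + b"abc" + struct.pack("<HB", 3, 6)
block2 = bytes([1]) + b"X" + struct.pack("<HB", 10, 3)
stream = io.BytesIO(pack_block(block1) + pack_block(block2))
result = decompress_stream(stream)
```

    With a real file you would pass open(path, "rb") instead of the BytesIO object; the destination could only be flushed early if the format bounded the match distance.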

    Sorry, I did not see your email; it might have gone to spam. I know the basics of the MT mode, but I did not look at it in detail. From an overview point of view, it works as follows:

    1) Only two threads are supported.
    2) Thread A reads from the huffman stream and decodes everything into a KrakenLzTable (this happens in Kraken_ReadLzTable). As soon as it's done it's pushed to a queue and Thread B will use it. Thread A starts to decode another KrakenLzTable immediately while Thread B is processing the previous KrakenLzTable.
    3) Thread B waits for a KrakenLzTable to finish decoding. Then it calls Kraken_ProcessLzRuns to do all the match copying. Then it sleeps until Thread A has decoded another KrakenLzTable.
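
    The hand-off between the two threads described above can be sketched roughly like this. It is a toy model, not Oodle's or powzix's code: read_lz_table and process_lz_runs are stand-ins named after Kraken_ReadLzTable and Kraken_ProcessLzRuns, the "blocks" are plain Python lists instead of a Huffman-coded stream, and the queue size of 2 is an arbitrary assumption. Python threads also won't give a real speedup because of the GIL; the point is only the producer/consumer structure.

```python
import queue
import threading


def read_lz_table(block):
    """Stand-in for the entropy-decoding stage (thread A's work).

    A toy "block" is already a list of (literals, offset, match_len)
    tuples, so this just copies it into a fresh table.
    """
    return list(block)


def process_lz_runs(dst, table):
    """Stand-in for the match-copying stage (thread B's work)."""
    for literals, offset, match_len in table:
        dst += literals
        for _ in range(match_len):
            dst.append(dst[-offset])  # byte-wise, so overlaps are fine


def decode_two_threads(blocks):
    tables = queue.Queue(maxsize=2)  # assumed bound on tables in flight
    dst = bytearray()

    def thread_a():
        # Decode tables back to back; put() blocks only if B falls behind.
        for block in blocks:
            tables.put(read_lz_table(block))
        tables.put(None)  # sentinel: no more tables

    def thread_b():
        # Sleep in get() until A has finished decoding another table.
        while True:
            table = tables.get()
            if table is None:
                break
            process_lz_runs(dst, table)

    a = threading.Thread(target=thread_a)
    b = threading.Thread(target=thread_b)
    a.start()
    b.start()
    a.join()
    b.join()
    return bytes(dst)


# Usage: two blocks; the second one copies from data produced by the first.
blocks = [[(b"abc", 3, 6)], [(b"X", 10, 3)]]
output = decode_two_threads(blocks)
```

    Only thread B touches the destination buffer, so the queue is the only synchronization point between the two stages.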

  11. #40
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    If your question is how to use the actual Oodle DLL in various modes, then I don't know. The second to last argument to the OodleLZ_Decompress of the 2.2 DLL is a thread phase argument, that contains 1 when called from one thread and 2 when called from the other thread. But I'm not sure how to use the Oodle API to make the threads interact properly.

  12. #41
    Member
    Join Date
    Nov 2015
    Location
    ?l?nsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    The EU legality isn't so simple.
    It's one thing to reverse engineer and another to implement in a way that doesn't breach copyright.
    The standard way is to have one person do the reverse engineering and another the implementation. It is not the only possible way, but probably the only legally safe one.
    In this case:
    * Jarek and powzix marvelled about some SIMD details, suggesting they were preserved intact
    * cbloom says "I recognize a lot of the code fragments, and they replicate a bunch of random implementation details exactly, in one case appearing to mention instruction selection choices made by VC++ that are different from what GCC does for the same source on Linux." which suggests the same

    So it clearly looks like a derivative work and not an independent creation. Therefore the copyright is held by both powzix and RAD and powzix can't legally exercise the right w/out RAD's permission.

    I find it funny to 'protect' illegal code with GPLv3.

  13. The Following User Says Thank You to m^3 For This Useful Post:

    mhajicek (2nd September 2016)

  14. #42
    Member
    Join Date
    Mar 2009
    Location
    Prague, CZ
    Posts
    60
    Thanks
    27
    Thanked 6 Times in 6 Posts
    Also, not everything that can be done should be done, let alone publicly. Honestly, this thread's very existence has the whole encode.ru site losing some credit for me. I wonder why the moderators almost never do their job here... they are needed maybe once a year, and then there's just silence... not to mention all those thank-yous under the first post...

    Seriously, guys, this is a rather small community built around one specific topic, data compression. Wouldn't it be better to cooperate and respect each other?

  15. The Following User Says Thank You to mhajicek For This Useful Post:

    JamesB (2nd September 2016)

  16. #43
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    We'd also have to remove it, once we get DMCA'd (even though technically .ru has no relation to US' DMCA).

    But I also think that we need to understand previous works to make some improvement.
    For example, SSE (=Secondary Symbol Estimation), which appeared in paq2 and made things move again after a few years,
    is a direct result of my decompiling of Shkarin's ppmonstr coder - we were stuck with that for 2-3 years before, despite
    Shkarin's attempts to explain it in words.

    And lzma is basically cabarc's LZX with rangecoder from my aridemo (compare the design if you don't believe it - http://nishi.dreamhosters.com/u/lzx0a.png )
    And lzna is a derivative work from lzma and rANS.
    And then you tell us to stop, and don't post derived sources in public...

    Another problem, btw, is that most people here don't really have access to research papers, so there's really no way around reverse-engineering in most cases.
    As to "losing some credit for me" - please go make another forum, all legal. We don't have much of these, even forum.compression.ru died.

  17. #44
    Member
    Join Date
    Mar 2009
    Location
    Prague, CZ
    Posts
    60
    Thanks
    27
    Thanked 6 Times in 6 Posts
    ok, so to make your long post short, i guess it would be like:
    1. sorry we dont care
    2. result is justified by the means used to get it

    thank u for explanation of where u stand, it is good to know

    eh.......of course sharing information is necessary for progress, i agree with that, and this forum itself is one of the ways to do it. But willingly shared information.....

    dont u think this can be discouraging for some authors?

  18. #45
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Didn't you notice that fgiesen didn't really say anything against this thread?
    He did say not to post links to his dll before, so I did remove it.
    Why do you protect somebody else's copyright, I wonder? Nothing better to do?

    > dont u think this can be discouraging for some authors?

    There're easily available solutions for software protection (encryption + anti-debug).
    Also, my programs were reverse-engineered before too (reflate even multiple times) - do you think it's better
    if sources are not posted, but somebody uses reverse-engineered code anyway?
    With sources people at least make an attempt to improve something, while without them people tend to
    just use the copyrighted binaries, but hide them with encryption.

    Also authors frequently can't post the sources (or even talk about algorithms in detail) due to contracts they signed.
    So a 3rd-party reverse-engineering can be a convenient workaround to get their work really appreciated by the community.
    And imho the enforcement of copyright is the problem of the copyright owner's legal department, not actual developer.
    Unfortunately it is all too easy these days to take down a thread or a site or whatever they want.

  19. #46
    Member
    Join Date
    Aug 2016
    Location
    Seattle
    Posts
    4
    Thanks
    0
    Thanked 10 Times in 3 Posts
    I don't think business stuff like this really belongs on encode.ru, so I'll just leave this note here.

    Our problems with the reversed-engineered clone are:

    1) The library was obtained illegally. There had been no games that had shipped with the Oodle DLLs at the time of the cloning (I don't even think anything has shipped yet, actually). We know when the version that was cloned was downloaded, so it's clear to us how and who leaked it. The person called up, pretended to be a customer, got the code, and leaked it - I think everyone can agree that's a crappy thing to do.

    2) The library was cloned, and then a GPL license was stuck on it. If there was a "this code is just for educational purposes, and was cloned from copyrighted code" license, then we wouldn't have been as grumpy. But you can't just clone something and wrap it in a new license. Again, that's just a flat out shitty thing to do (and illegal in both the US and EU).

    3) The library is an attempt to copy the implementation itself, not just decode the data. There are ways to clean room reverse engineer an implementation, but that wasn't what this was - it's an attempt to get the same code sequences out of the compiler by staring at our assembly. There are an infinite number of ways to write the decoder, but this was a very specific and admitted attempt at just cloning the binary implementation. Again, that's just shitty - it serves no educational or instructional purpose other than trying to clone the code to wrap it in another license.

    Fabian is on a roadtrip, which is why he hasn't piped up, and Charles is just disappointed in encode.ru in general (first, this posting of the illegal DLL with a mocked up header and then this relicensed clone).

    This may be different for other people, but for me, encode.ru has always been a place for nitty-gritty technical discussion of compression topics - not warez. The entire reason no one from RAD even posted about these codecs here is that we just think they are very good implementations of existing stuff - that is, not anything new technically (as Fabian has said multiple times, it's just another lzhuff). There are a few minor interesting technical details that we plan on talking about, but even they are mostly interesting only in a programming sense, not a theoretical compression sense.

    In any case, I think it is worth asking what kind of place encode.ru intends on being and what it's going to tolerate in the future. There aren't a lot of us doing cool stuff in compression - driving some of us away can't be a good thing in the long run.

    Again, this will be my only post about this, because business stuff is gross and boring.

  20. The Following 4 Users Say Thank You to jeffatrad For This Useful Post:

    Bulat Ziganshin (3rd September 2016),Razor12911 (12th October 2016),schnaader (3rd September 2016),Turtle (3rd September 2016)

  21. #47
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Frankly, I was surprised that nobody from RAD asked to take it down immediately - but then fgiesen started posting in the thread without mentioning anything.
    My personal policy on reverse-engineering is like this: I respect authorship and would remove problematic content when explicitly asked to (and I still wasn't asked).
    But I don't like hypocrisy, and I do consider reverse-engineered code (in general) to be beneficial for the community - so I won't protect anyone's copyright "in advance".
    Also, I'm not the owner of the forum, just one of admins, so my opinion is not "official" or anything - the other two are just busy atm, I guess.

    Btw, I think you're wrong about "The library was obtained illegally" thing. I've got my own copy of oodle 2.20 from a certain free-to-play game on 22-06-16,
    and this thread was started on 16-08-2016, so there was clearly enough time, especially taking into account the availability of decompiling tools for C++.
    In fact, I've got 2.30 on 02-08-2016 too, so even that was possible (though powzix seemed to not have it at first).

    > and Charles is just disappointed in encode.ru in general

    In fact, I'm also disappointed in people posting on the forum only to advertise their work.

  22. The Following User Says Thank You to Shelwien For This Useful Post:

    schnaader (3rd September 2016)

  23. #48
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    Originally posted by jeffatrad:
    > I don't think business stuff like this really belongs on encode.ru, so I'll just leave this note here.
    >
    > Our problems with the reversed-engineered clone are:
    >
    > 1) The library was obtained illegally. There had been no games that had shipped with the Oodle DLLs at the time of the cloning (I don't even think anything has shipped yet, actually). We know when the version that was cloned was downloaded, so it's clear to us how and who leaked it. The person called up, pretended to be a customer, got the code, and leaked it - I think everyone can agree that's a crappy thing to do.

    Wrong.

    https://steamdb.info/depot/230411/

  24. #49
    Member
    Join Date
    Aug 2016
    Location
    Europe
    Posts
    19
    Thanks
    0
    Thanked 25 Times in 6 Posts
    Originally posted by jeffatrad:
    > I don't think business stuff like this really belongs on encode.ru, so I'll just leave this note here.
    >
    > Our problems with the reversed-engineered clone are:
    >
    > 3) The library is an attempt to copy the implementation itself, not just decode the data. There are ways to clean room reverse engineer an implementation, but that wasn't what this was - it's an attempt to get the same code sequences out of the compiler by staring at our assembly. There are an infinite number of ways to write the decoder, but this was a very specific and admitted attempt at just cloning the binary implementation. Again, that's just shitty - it serves no educational or instructional purpose other than trying to clone the code to wrap it in another license.

    Are you saying that if I modify the code to be less similar to oodle's implementation, you won't object? Actually, while I'm at it, I could write a compressor too, just for the fun of it. This discussion sparked my interest even more.

    Additionally, if you look at the code, you see that it differs quite a lot from Oodle's implementation. For example, I don't use SIMD in the Selkie/Mermaid decompressor's core loop, and I use a different layout of the Huffman decode table in Kraken. I don't bother with fuzz safety, something that you do, at least for Kraken. I don't use a SIMD/BMI Huffman decoder in Kraken. If my work were just a copied clone, and my goal were to clone your implementation, I would of course have cloned these details, since those are imo innovative things. My goal was just to spend a minimal amount of time to create a working decoder that's reasonably fast, to benefit the community. I don't have a financial interest in this, just research/educational purposes.

    Unlike yours, I wouldn't count my Kraken/Mermaid/Selkie code as production safe since it can overwrite random memory on invalid input, so what is it you're worried about - really?

    Stating that I spent time on getting VC to use a branchless core loop through CMOVs is not the same thing as admitting that I cloned your binary implementation.

    The reason why I picked GPL was actually to support you. GPL means that no game companies (i.e. your customers) would be able to use the code, as that would require their whole game to be open source. So you wouldn't lose any business. If I had picked BSD/MIT/Public Domain or whatever, it would have been worse for you.

  25. #50
    Member
    Join Date
    Nov 2015
    Location
    -
    Posts
    46
    Thanks
    202
    Thanked 10 Times in 9 Posts
    Originally posted by jeffatrad:
    > 1) The library was obtained illegally. There had been no games that had shipped with the Oodle DLLs at the time of the cloning (I don't even think anything has shipped yet, actually). We know when the version that was cloned was downloaded, so it's clear to us how and who leaked it. The person called up, pretended to be a customer, got the code, and leaked it - I think everyone can agree that's a crappy thing to do.

    https://warframe.com/landing put the DLL out for free!
    For free! Legally!

  26. #51
    Member
    Join Date
    Aug 2016
    Location
    Seattle
    Posts
    4
    Thanks
    0
    Thanked 10 Times in 3 Posts
    OK, if you obtained the DLL legally, then I have no complaint and I apologize.

  27. The Following User Says Thank You to jeffatrad For This Useful Post:

    schnaader (3rd September 2016)

  28. #52
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Honestly, I am opposed to reverse engineering software, and yet I think powzix has done nothing wrong. In fact, I think powzix has shown great skill as a programmer of data compression software. I invite him to collaborate with me on my programs, if he wants!

  29. #53
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    68
    Thanks
    31
    Thanked 22 Times in 15 Posts
    @powzix @Nania Francesco @anyone else interested
    What about a collaboration on a new open source fast lz huff compression algorithm, better than Kraken (and zstd)? I would like to contribute.

  30. #54
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    Finally, some interesting words. For several years I have been trying to convince a group of programmers to build a program together; you can decide its form (open or closed source).
    We can achieve results exceeding expectations.

  31. The Following User Says Thank You to Nania Francesco For This Useful Post:

    Gonzalo (5th September 2016)

  32. #55
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    68
    Thanks
    31
    Thanked 22 Times in 15 Posts
    I propose a program whose first versions will be closed source and which becomes open source in the end (so that we don't have to care about backwards compatibility the way zstd does). I have started developing a Huffman coder, lzturbo/Kraken style, which implements a fairly fast package-merge algorithm for optimal length-limited prefix codes.
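
    For readers who haven't met package-merge before, here is a small Python sketch of the plain textbook formulation. It is not the "fairly fast" variant mentioned above and says nothing about how lzturbo or Kraken build their codes; the function name and the list-based representation are just assumptions chosen for readability. Given symbol frequencies and a maximum code length, it returns code lengths that satisfy the Kraft inequality and minimize the total coded size under that limit.

```python
def length_limited_code_lengths(freqs, max_len):
    """Textbook package-merge: optimal prefix code lengths capped at max_len."""
    n = len(freqs)
    if n < 2:
        raise ValueError("need at least two symbols")
    if (1 << max_len) < n:
        raise ValueError("max_len too small for this many symbols")

    # Each item is (weight, tuple of symbols it contains).
    leaves = sorted(((f, (i,)) for i, f in enumerate(freqs)),
                    key=lambda item: item[0])
    items = list(leaves)

    for _ in range(max_len - 1):
        # Package: pair up adjacent items; an odd item at the end is dropped.
        packages = [
            (items[k][0] + items[k + 1][0], items[k][1] + items[k + 1][1])
            for k in range(0, len(items) - 1, 2)
        ]
        # Merge: packages compete with a fresh copy of the leaves.
        items = sorted(leaves + packages, key=lambda item: item[0])

    # The 2n - 2 cheapest items decide the lengths: a symbol's code
    # length is the number of times it occurs in the selected items.
    lengths = [0] * n
    for _, symbols in items[:2 * n - 2]:
        for s in symbols:
            lengths[s] += 1
    return lengths


# Usage: unrestricted Huffman would need 4 bits for the two rarest symbols.
limited = length_limited_code_lengths([1, 1, 2, 4, 8], 3)
unlimited = length_limited_code_lengths([1, 1, 2, 4, 8], 4)
```

    This version keeps full symbol lists in every package, so it costs far more memory and time than a tuned implementation; it is only meant to show what the algorithm computes.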

  33. #56
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    I can contribute my experience in data compression. As for multithreaded code, I am not very experienced. Since you already have an idea, you could sketch out the base of the program (you have to decide whether to create an archiver, a simple compressor, or a codec), to be implemented gradually. Various benchmarks will help us understand whether we're doing well or badly. You will have to get used to my bad English!

    You must choose a name for the program!

  34. #57
    Member
    Join Date
    Apr 2015
    Location
    Greece
    Posts
    68
    Thanks
    31
    Thanked 22 Times in 15 Posts
    This is why I am looking for collaboration. It is a big project (and I don't have much free time), and I have not thought about it very much (maybe Kraken style). Some other people may help too.

  35. #58
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    220
    Thanked 146 Times in 83 Posts
    In the first phase we need to find people who want to collaborate on this project. We can open a thread on how to set up the program, but I think fast Huffman coding alone will not bring us great compression results!

  36. #59
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    645
    Thanks
    205
    Thanked 196 Times in 119 Posts
    The suboptimality of Huffman depends strongly on the probability distribution, e.g. ZSTD uses Huffman for literals and FSE/tANS for everything else.
    The best approach would be to first look at the probability distribution, then decide individually whether to use Huffman or something accurate - it requires just a 1-bit flag in the decoder.
    Surprisingly, the fastest-decoding accurate static EC is currently probably James' rANS (~660MB/s on an i5-4570 CPU @ 3.20GHz):
    https://github.com/jkbonfield/rans_static
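
    One way to read the "look at the distribution, then decide" idea is sketched below in Python. This is an illustration under assumptions, not what ZSTD or any Oodle codec does: the function names, the flag values (0 = Huffman, 1 = accurate coder such as tANS/rANS) and the 0.05 bits/symbol threshold are all made up. The encoder measures how far the Huffman code's average length is from the Shannon entropy and sets the 1-bit flag only when the gap is worth paying for.

```python
import heapq
import math


def huffman_lengths(counts):
    """Code lengths of an ordinary (unlimited) Huffman code for the counts."""
    lengths = [0] * len(counts)
    heap = [(c, i, [i]) for i, c in enumerate(counts) if c > 0]
    if len(heap) == 1:
        lengths[heap[0][1]] = 1  # a lone symbol still needs one bit here
        return lengths
    heapq.heapify(heap)
    tie = len(counts)  # unique tie-breaker so the lists are never compared
    while len(heap) > 1:
        c1, _, s1 = heapq.heappop(heap)
        c2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # every merge adds one bit to its members
        heapq.heappush(heap, (c1 + c2, tie, s1 + s2))
        tie += 1
    return lengths


def choose_coder(counts, max_redundancy=0.05):
    """Return the 1-bit flag: 0 = Huffman is close enough, 1 = use the accurate coder.

    max_redundancy is a hypothetical threshold in bits per symbol.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    lengths = huffman_lengths(counts)
    average = sum(c * l for c, l in zip(counts, lengths)) / total
    return 1 if average - entropy > max_redundancy else 0


# Usage: a flat histogram suits Huffman, a heavily skewed one does not.
flat_flag = choose_coder([25, 25, 25, 25])
skewed_flag = choose_coder([97, 1, 1, 1])
```

    A real encoder would also weigh header cost and decode speed, not just the redundancy per symbol.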

  37. #60
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    > The best would first looking at the probability distribution, then individually decide to use Huffman or something accurate

    The reverse is also possible - like adjusting the decomposition of lens/distances to better fit the required probability distribution.

    Also, maybe something can be done with block headers - I haven't looked yet, so what does zstd do for storing the Huffman lens or whatever it uses?
    I mean, it's usually possible to apply a much stronger compression method for header compression (and for any EC flushes).
