Thread: Win32 binary compressor with smallest native (EXE) footprint [not best-ratio-question]

  1. #1
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Nice to meet you again, Encode.ru group!

    I have another question. So many compression and archiver programs and algorithms have been posted here over the years, and there are so many clever and helpful programmers and top coders, that I hope my question doesn't take up too much space...
    I'm still Googling, but I haven't found what I want yet.

    So the problem: which compressor, regardless of its ratio, has ever been published with the smallest EXE/COM size?

    I mean the COMPRESSOR APP'S SIZE, not the size of the compressed data.
    I think there should be some special compressor written in assembly somewhere... A long time ago I found an article that mentioned an assembly-written program of about 90-100 bytes that could find "zero blocks", meaning parts of the file with more than two zeros next to each other, and write them out in a shorter form.
    I don't want a compressor as small (and as inefficient) as that one, although it sounds interesting; an EXE size of a few KB (under ten) would be perfect.

    Does anyone know of (Win32 command-line) compressors like that?

  2. #2
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    260
    Thanks
    0
    Thanked 0 Times in 0 Posts
    * which compressor ... has ever been published with the smallest EXE/COM size?
    * an EXE size of a few KB (under ten) would be perfect.

    my 2 cents:

    1. have a look at TarsaLZP (Piotr Tarsa) from http://asembler.republika.pl/bin/TarsaLZP.zip

    see also http://encode.ru/threads/856-TarsaLZP
    and http://cs.fit.edu/~mmahoney/compression/text.html#2153

    - encode.exe is 6144 bytes

    2. maybe the program lzss001 from encode can be optimized to fit the limit of 10240 bytes

    3. there is the "exe-packer" program UPX, which can compress executable files

    best regards

  3. #3
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for the reply! Since I posted this question, I have also used Google to look for something. "Surprisingly", one of the hits was a BIG compression benchmark website by Matt Mahoney; it has been the best result so far.
    I also found one (or two) compressors of about 3 KB. I think these were the best results among native applications (not home-built/linked source code). But now I would also like to tell you about a big surprise:

    When Google took me to that compression benchmark site, the very first ZIP I downloaded was the one you linked here, 'TarsaLZP.zip'. I opened it and tried to unpack the encoder and the decoder. But then a strange thing happened, every time, even with your uploaded zip: the encoder simply disappeared, and when I tried to unzip Decoder.exe, my computer slowed down for a while and then showed that the file size was 0 bytes. When I tried to run it, it showed 'Access Denied!' (in Hungarian, naturally).
    At first I couldn't imagine why this happened, as it had never happened before. Then I thought it might be because of my strictly configured antivirus, but I immediately remembered that I had switched off all of my antivirus programs, because my machine's memory couldn't handle them very well.
    I also don't think a virus got onto my system because I switched off my antivirus programs: I do a manual virus scan every month, and my machine hasn't behaved strangely during any other process.

    Oh, and I haven't mentioned that I know about UPX and other EXE compressors. I once tried to collect all the executable compressors, because I wanted to know whether UPX is the best _in ratio_ or not. But then my computer sadly crashed, and all of the downloaded articles, programs, sources and search notes were destroyed. And several other things too....

  4. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    2,420
    Thanks
    99
    Thanked 311 Times in 193 Posts
    If you count ZPAQL as a programming language, then there are many decompressors that are around 100 bytes. For example, the 114 byte archive http://mattmahoney.net/dc/pi.txt.zpaq decompresses with zpaq to the 1 MB file pi.txt from the Canterbury corpus. It contains its own "decompression" program that calculates pi to 1,000,000 digits. (Be patient. It may take about a day to run).

    If you are looking for x86 code, then the compressing linker Crinkler, loosely based on PAQ6, produces self-decompressing 4KB demos with about 220 bytes of x86 code. http://www.geeks3d.com/20101229/demo...xe-compressor/

  5. #5
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    260
    Thanks
    0
    Thanked 0 Times in 0 Posts
    1. I have tested encode.exe and decode.exe from TarsaLZP.zip under Windows Vista SP2 32-bit,
    and they work:
    - it uses 50% of the CPU
    - sometimes the compressed file may be bigger than the original
    But
    if I compare the decompressed file with the original file, they are identical,
    and for me this means it works.

    2. The exe-packer Crinkler seems great, but the resulting file does not always work.

    best regards
    Last edited by joerg; 17th October 2012 at 20:13.

  6. #6
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    "ZPAQL": didn't you mean ARCHIVER instead of programming language? Anyway, I think the pi archive is partly a joke of yours, like barf, even if you had to work on it and it works well, doesn't it? (Sorry if you don't see it the same way.)

    About EXE compressors: I found documentation for a program called Dropper 2.0, which is said to be a compressor made especially for 4K intros. Its documentation also talks about techniques and options that sound interesting to me, but I'm not a professional in this topic.
    As far as I remember, FSG (Fast-Small-Good, maybe you know it) is the best in decompression-module footprint: its decompressor and memory packer, with everything else needed to unpack and run the original executable, is about 160 bytes (I think 158, but it may vary between versions 1.0 and 2.0).

  7. #7
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,419
    Thanks
    6
    Thanked 26 Times in 20 Posts
    Decoder only, and not EXE but COM, but:
    http://www.retroprogramming.com/2009...ay-with-1.html
    I believe cbloom wrote a much smaller LZ decoder; it was in some comment on his blog, but I can't find it now.

  8. #8
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks to you too, square-mass. I also forgot to mention that I'm searching among data-file compressors, not among EXE/COM compressors. The smallest one may well turn out to be an EXE compressor, but in that case I have no idea how I would cut out the decompressor and rewrite it as a file compressor. I'm not an assembler fan, if you understand what I mean.

  9. #9
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    2,420
    Thanks
    99
    Thanked 311 Times in 193 Posts
    Yes, but ZPAQL is also a general purpose programming language.

    Code:
    (to print "hello world" paste this file into hello.cfg, then:
     zpaq -method hello -run pcomp nul:
    )
    comp 0 0 0 8 0
    hcomp
    pcomp copy ;
      a> 255 if
        a= 104 out
        a= 101 out
        a= 108 out out
        a= 111 out
        a= 32 out
        a= 119 out
        a= 111 out
        a= 114 out
        a= 108 out
        a= 100 out
        a= 10 out out
      endif
      halt
    end

  10. #10
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Oh, I see!!! It looks like a moddable application that has its own scripting world and can interpret the scripts. Like a mod in a creative game (such as Stranded II).

  11. #11
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    309
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I have/had somewhere around here a compressor/decompressor in 256 bytes, pure asm obviously. Dunno where the files are, prolly lost to time, but I may be able to dig them up.

  12. #12
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Oh my God, Intrinsic, that would be nirvana!!!

  13. #13
    Member Jean-Marie Barone's Avatar
    Join Date
    Oct 2012
    Location
    Versailles, France
    Posts
    56
    Thanks
    14
    Thanked 1 Time in 1 Post
    LZ77 from Comrade may not be the best, but surely one of the smallest.
    LZMAT from Vitaly Evseenko is a bit longer, but more efficient.
    You may also have a look at aPLib from Jørgen Ibsen.

  14. #14
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Holy słt, I think I FOUND IT !!!

    It's called "Smile!", by "Fal / FALinc". As I thought, it was made as a DEMO, and it can be found at http://256bytes.untergrund.net/demo/
    Moreover, there is a version pack where the encoder and decoder are separate from each other...

    COOL!!

    Now I just wonder if anyone can show me a smaller one! wahwah!
    Last edited by paqfan; 19th October 2012 at 02:07. Reason: overthinking... :)

  15. #15
    Member
    Join Date
    May 2008
    Location
    Germany
    Posts
    260
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Very short programs!

    smile_e.com has 250 bytes
    smile_d.com has 207 bytes

    smile.com has 256 bytes

    Do these programs work under Windows Vista / Windows 7, or only under plain MS-DOS?

    best regards

  16. #16
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    2,420
    Thanks
    99
    Thanked 311 Times in 193 Posts
    It's a DOS program. I assume it won't work in a 64 bit OS. It does run in 32 bit Vista.

    Code:
    1,000,000 ENWIK6
      708,233 ENWIK6.256  (smile -e / smile -d)
      708,232 ENWIK6.SMI  (smile_e / smile_d)
    Can anyone figure out what algorithm it uses? There is ASM source code but it's not obvious.

  17. #17
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,027
    Thanks
    11
    Thanked 33 Times in 27 Posts
    I've studied 'smile 256' for a moment and it seems to me that it's a simple MTF followed by Elias gamma coding.

  18. #18
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Did you really study 'smile 256'?! You even know the type of compression algorithm!...

    Oh, please, dear Piotr, could you post or send me your notes on this smart asm code!!!
    This morning I printed out all three source files and tried to analyse them, or at least partly understand them, in the long breaks between my lessons. But I must say first that I'm a complete layman in assembly, and many questions came to mind about this interesting code.

  19. #19
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,027
    Thanks
    11
    Thanked 33 Times in 27 Posts
    MTF is described here: http://en.wikipedia.org/wiki/Move-to-front_transform
    Gamma coding is described here: http://en.wikipedia.org/wiki/Elias_gamma_coding

    The program uses a slightly different variant of gamma coding, though: instead of outputting N zeros, then a 1, then the N remaining bits of the number, it outputs a pair (0, bit) for every bit of the input number after its leading 1 and appends a 1 at the end. I.e., instead of coding 0001xxx it codes 0x0x0x1.
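
    To make the bit layout concrete, here is a minimal Python sketch of that interleaved variant. It is not a translation of the asm: the names put_value and get_value merely echo the routines in the source, the bit stream is modelled as a string of '0'/'1' characters, and how Smile maps its symbols onto positive integers is not covered here.

```python
def put_value(n: int) -> str:
    """Encode a positive integer as a string of '0'/'1' characters."""
    if n < 1:
        raise ValueError("only positive integers can be coded")
    payload = bin(n)[3:]  # binary digits after the leading 1
    # Each payload bit is preceded by a 0 flag; a lone 1 ends the code.
    return "".join("0" + bit for bit in payload) + "1"


def get_value(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one value starting at bits[pos]; return (value, next position)."""
    value = 1  # the implicit leading 1
    while bits[pos] == "0":
        value = (value << 1) | int(bits[pos + 1])
        pos += 2
    return value, pos + 1  # step over the terminating 1
```

    With this sketch, 8 (binary 1000) becomes 0000001, which matches the 0x0x0x1 pattern with every x equal to 0. The code length is the same as in standard gamma coding (2N+1 bits); only the order of the bits differs, and the decoder needs no counter.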

    Gamma coding happens in the getvalue and putvalue functions; MTF table initialization is in the TableIni loop (although a 64 KiB table is written, only a 256-byte chunk of it is used). Updating the MTF table (i.e. shifting part of an array) is in the update_table procedure.
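
    For readers who have not met MTF before, a rough Python sketch of the transform follows. It assumes the table starts in identity order 0..255, which is the textbook choice and not something verified against the TableIni loop; the function names are made up for this illustration.

```python
def mtf_encode(data: bytes) -> list[int]:
    """Replace each byte by its current position in the table."""
    table = list(range(256))  # assumed initial order: 0, 1, ..., 255
    ranks = []
    for byte in data:
        rank = table.index(byte)          # search for the symbol
        ranks.append(rank)
        table.insert(0, table.pop(rank))  # move it to the front
    return ranks


def mtf_decode(ranks: list[int]) -> bytes:
    """Invert mtf_encode by replaying the same table updates."""
    table = list(range(256))
    out = bytearray()
    for rank in ranks:
        byte = table.pop(rank)
        out.append(byte)
        table.insert(0, byte)
    return bytes(out)
```

    Recently seen bytes get small ranks (for example, b"aaa" gives 97, 0, 0), and small numbers are exactly what gamma coding stores cheaply.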

    There are lots of size optimizations, for example:
    Code:
    	; - search for symbol
    	xchg	ch,cl
    This effectively does cx := 256, because cx is always 1 before that instruction.
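
    The arithmetic behind that trick can be checked with a few lines of Python that model cx as a 16-bit value made of the two 8-bit halves ch (high) and cl (low). The helper name is invented for this illustration.

```python
def xchg_ch_cl(cx: int) -> int:
    """Swap the high byte (ch) and low byte (cl) of a 16-bit register value."""
    ch = (cx >> 8) & 0xFF
    cl = cx & 0xFF
    return (cl << 8) | ch


# cx == 1 means ch == 0 and cl == 1; after the swap ch == 1 and cl == 0.
print(xchg_ch_cl(1))
```

    In 16-bit code the saving is one byte: xchg ch,cl encodes in 2 bytes, while mov cx,256 needs 3.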

    End of file is marked by (gamma) encoding a value of 256. After that the encoder outputs the value 511, probably as a replacement for a state flush.
    Last edited by Piotr Tarsa; 19th October 2012 at 23:34.

  20. #20
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hmmm, thanks, really!!!
    Anyway, if you have perhaps written some comments on it, would you be able to show some of them? I... I would just like to ask, if it is not a problem or hard work, for help understanding some of the asm instructions, what exactly they do, etc.

    Or should I start from Google and search assembly-related websites for these instructions?

  21. #21
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Also, "m^2" posted a link to an article about RLE COM/EXE compressors, and the decoders mentioned there, with source shown, are only about 40-50 bytes.
    So why is an EXE/COM decompressor so much smaller than, for example, Smile256, or to be more precise smile_d, even with the memory-moving code? Is it because the algorithm is simply shorter for decoding than for encoding? Or is it because the compressed data, which is the original COM code and data, is already in place in memory, so no file opening or character reading is needed, and/or because the method is much simpler? Or WHY?
    Or are the two (the posted RLE decoder and the Smile decoder) radically different from each other?
    Last edited by paqfan; 20th October 2012 at 00:31. Reason: "Square Mass" as name is nondescript. =>

  22. #22
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,027
    Thanks
    11
    Thanked 33 Times in 27 Posts
    Well, I did everything in my head, i.e. I didn't need any comments.

    If you do not understand assembly at all, then it will be rather hard to explain what happens without writing a LOT of comments.

    And, as I said, there are a lot of size optimizations that heavily decrease readability.

    As for websites: I'm from Poland, so I learned a lot from Polish websites, but I started by downloading MASM32 from www.movsd.com and following Iczelion's tutorials. After that I learned 16-bit MS-DOS assembly.

    RLE encoding is, in its simplest form, much simpler than both the MTF transform and Elias gamma coding. The linked RLE decoder also doesn't handle files, and that makes a big difference.
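
    To show how little there is to RLE in its simplest form, here is a small Python sketch that stores every run as a (count, byte) pair with the count capped at 255. This is only an assumed textbook layout; the linked decoder may use a different format, and the function names are made up.

```python
def rle_encode(data: bytes) -> bytes:
    """Store each run of equal bytes as a (count, byte) pair, count <= 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(packed: bytes) -> bytes:
    """Expand (count, byte) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(packed), 2):
        out += bytes([packed[i + 1]]) * packed[i]
    return bytes(out)
```

    The decoder is a single loop with no table and no bit handling, and a COM unpacker working on data already in memory needs no file I/O either, which fits the size difference discussed above.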
    Last edited by Piotr Tarsa; 20th October 2012 at 00:40.

  23. #23
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    2,420
    Thanks
    99
    Thanked 311 Times in 193 Posts
    Of course I had to benchmark this. http://mattmahoney.net/dc/text.html#6955

  24. #24
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    309
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Nice, Smile was the one I was referring to; it even has my old comment there! ;p

  25. #25
    Member Karhunen's Avatar
    Join Date
    Dec 2011
    Location
    USA
    Posts
    86
    Thanks
    1
    Thanked 0 Times in 0 Posts
    A discussion of this wouldn't be complete without Fabrice BELLARD's notable LZEXE (DOS) program. Version 0.91 still works in a DOS window on Win7 32-bit, and it can itself be lzexe'd from 19104 bytes (I assume its original format) to 12208 bytes (< 12k), which is only three 4k clusters. Important on a 40 MB HD! (circa 1990)
    Here is the relevant wiki page; its "See Also" section has a link to Kolmogorov complexity. That is what describes the true size of Matt's ZPAQL program above.
    Incidentally, I think it's been mentioned here before: there are competitions for 1k, 4k and 64k demos. I do remember that one of the competition entries was an RLE encoder applied to a DOS exe.

  26. #26
    Member
    Join Date
    Jul 2011
    Location
    Spain
    Posts
    86
    Thanks
    3
    Thanked 2 Times in 1 Post
    Even though they were pure DOS, not Win32, the SFX modules for LHARC and LHA were quite small, in the range of 1.5 - 3 KB, including the text literals that were output on screen.

  27. #27
    Member
    Join Date
    Oct 2009
    Location
    usa
    Posts
    30
    Thanks
    0
    Thanked 0 Times in 0 Posts
    For pure DOS there is also ESP v. 1.92 archiver, which compresses to 12,270 bytes. Its compression is as good as pkzip, and often better because it has delta (8 bit and 16 bit signed and unsigned audio) and image compression options.

  28. #28
    Member
    Join Date
    Jan 2012
    Location
    Sopianae
    Posts
    28
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Dear nikkho, could you name some of these small SFX modules? I once searched for compressors with small SFXs, but the best result I found was WinRAR 2's console SFX at 20 KB.

  29. #29
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    2,420
    Thanks
    99
    Thanked 311 Times in 193 Posts
    Many decompression programs are around 10 KB zipped. UPX should be about the same. I list sizes here, although often source code is smaller if it is available. http://mattmahoney.net/dc/text.html

  30. #30
    Member
    Join Date
    Jul 2011
    Location
    Spain
    Posts
    86
    Thanks
    3
    Thanked 2 Times in 1 Post
    Originally posted by paqfan:
    Dear nikkho, could you name some of these small SFX modules? I once searched for compressors with small SFXs, but the best result I found was WinRAR 2's console SFX at 20 KB.
    SFX stands for SelF eXtracting archive (http://en.wikipedia.org/wiki/Self-extracting_archive) and is the generic name for an extractor bundled with the archive.
    LHA and LHARC (http://en.wikipedia.org/wiki/LHA_(file_format)) were archivers originally created by Yoshi during the 80s and 90s; sources are available in lots of places, as are the original binaries.
