Results 1 to 14 of 14

Thread: Small just-for-fun challenge at codegolf

  1. #1
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    539
    Thanks
    192
    Thanked 174 Times in 81 Posts

    Small just-for-fun challenge at codegolf

    Encode an image inside of the source (@codegolf)

    I wanted to post it earlier, but wasn't sure if it was interesting at all, the image to be compressed is quite small (32x32) and I thought there isn't much room for clever things. But as it turns out, I'm at 670 chars now with my huffman in C attempt and just passed a more generic answer (zlib in Python).

    The image is somehow random, but there are only 10 different colors and they are grouped to small regions with single black pixels sprayed over the image.

    Arithmetic coding should give much better results here, but I'm not sure about the code overhead involved. (Adaptive) LZW would be another strong candidate to test.
    http://schnaader.info
    Damn kids. They're all alike.

  2. #2
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,471
    Thanks
    26
    Thanked 120 Times in 94 Posts
    A little suggestion:

    Instead of transforming from 37-100 to 0-63 you can transform to 64-127. Then you can remove the variable holding the bit index. You would just shift the number to right by one bit at a time until the value is equal to 1 and then you would read next symbol.

    I don't know if that would decrease the size though.

    And convert:
    l=l*2+(j&k)/k
    to:
    l+=l+(j&k)/k

    I would also try performing MTF before Huffman.
    Last edited by Piotr Tarsa; 22nd March 2012 at 19:25.

  3. #3
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    How about this:

    #include <stdio.h>
    int main( void ) {
    int c;
    FILE* g = fopen( "2.cpp", "wb" );
    FILE* h = fopen( "sample", "wb" );
    fprintf(g, "#include <stdio.h>\n");
    fprintf(g, "char s[]=\"");
    for( c=0; c<256; c++ ) {
    putc( c, h );
    switch(c) {
    case 0: fprintf(g,"\\x0"); break;
    case 10: fprintf(g,"\\n"); break;
    case 13: fprintf(g,"\\r"); break;
    case 26: fprintf(g,"\\x1A"); break;
    case 34: fprintf(g,"\\\""); break;
    case 92: fprintf(g,"\\\\"); break;
    default: fprintf(g, "%c", c );
    }
    }
    fprintf(g, "\";\n" );
    fprintf(g, "int main() { fwrite(s,1,sizeof(s)-1,fopen(\"out\",\"wb\")); }");
    }


    Its a little worse than real 8-bit coding, but should be still much better than base64.
    Also it seems to work with my 3 compilers (gcc/IC/MSC).

  4. #4
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    As to the actual problem, it kinda looks like this - http://en.wikipedia.org/wiki/Four_color_theorem
    Maybe the original intention was something like encoding some boundary mask, then flood-filling the shapes.
    But the small size of the picture makes it much less interesting - likely a bit RLE with elias codes or something similar
    would win here too.

  5. #5
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    539
    Thanks
    192
    Thanked 174 Times in 81 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    And convert:
    l=l*2+(j&k)/k
    to:
    l+=l+(j&k)/k
    Thanks, although I already did exactly that change just minutes ago

    Quote Originally Posted by Shelwien View Post
    How about this:
    [...]
    Hm... there are some more control characters I'd convert (e.g. bell and TAB), but it should still be much better, yes, saving ~80-90 characters.
    EDIT: Just tried it using 7bit (37-164) and although it compiles and runs, copy/paste runs into problems and depends too much on the browser/editor used.
    Last edited by schnaader; 22nd March 2012 at 20:47.
    http://schnaader.info
    Damn kids. They're all alike.

  6. #6
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    437
    Thanks
    137
    Thanked 152 Times in 100 Posts
    A more standard version of k?:expr is k||expr. Similarly k?expr: (is that a gcc extension too?) is k&&expr.

  7. #7
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    539
    Thanks
    192
    Thanked 174 Times in 81 Posts
    Quote Originally Posted by Piotr Tarsa View Post
    Instead of transforming from 37-100 to 0-63 you can transform to 64-127. Then you can remove the variable holding the bit index. You would just shift the number to right by one bit at a time until the value is equal to 1 and then you would read next symbol.
    Thanks, saved 6 chars

    Quote Originally Posted by JamesB View Post
    A more standard version of k?:expr is k||expr. Similarly k?expr: (is that a gcc extension too?) is k&&expr.
    Thanks, works like a charm.
    http://schnaader.info
    Damn kids. They're all alike.

  8. #8
    Member caveman's Avatar
    Join Date
    Jul 2009
    Location
    Strasbourg, France
    Posts
    190
    Thanks
    8
    Thanked 62 Times in 33 Posts
    Quote Originally Posted by schnaader View Post
    The image is somehow random, but there are only 10 different colors and they are grouped to small regions with single black pixels sprayed over the image.
    You could perhaps reduce the colors to 9 (since brown appears only after purple).
    Locate the spot of the last purple pixel (31,1, once decoding has crossed it replace the purple color by brown).

    remove the brown color from:
    p[]={0,5390379,6379596,7783255,3435360,16354410,51318 94,5295994,4671418,9975021};
    Recompute Huffman codes for 9 colors + special repeat, using a picture similar to the one attached (brown replaced by purple -> 9 colors)

    for(m=0;m<10;m++){ // only 10 huffman codes
    if(l==h[m]){
    if (y>1 p[purple index]=brown color;
    Attached Images Attached Images  
    Last edited by caveman; 23rd March 2012 at 03:02.

  9. #9
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    My version, with a rangecoder - http://nishi.dreamhosters.com/u/rgb_00.rar, 860 chars
    base-96 version compresses the image to 310 bytes,
    base-256 version produces 256 bytes.
    bmf's result is 230 (w/o header), so 256 should be fairly ok.
    I actually tried tuning a linear counter for it and the improvement was too small (like maybe 5 bytes).

    Also another version, with base-92 unary mtf coding - http://nishi.dreamhosters.com/u/rgb_01.rar, 731 chars

    Python script winning it with built-in zlib and base64 is no fun though.

    And another one with 648 chars - http://nishi.dreamhosters.com/u/rgb_02.rar
    Maybe I should try switching to C, these automatic types are surely convenient.
    @schnaader: Look at my color table, there's stuff you can borrow too.

  10. #10
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    539
    Thanks
    192
    Thanked 174 Times in 81 Posts
    Well done, Shelwien! These are entries I had in mind when posting here, using more advanced algorithms.

    Quote Originally Posted by Shelwien View Post
    Python script winning it with built-in zlib and base64 is no fun though.
    Yes - the author didn't even follow my advice to use a special color (yet). Since the overhead is so small, he'll be at 400-500 characters total after it, which should be impossible to catch with even the best algorithms.

    Quote Originally Posted by Shelwien View Post
    @schnaader: Look at my color table, there's stuff you can borrow too.
    Just saw that. Very useful - this is a good alternative to hexadecimal representations that are pretty useless in C-like languages (except for big numbers) because the prefix is 2 bytes long.
    EDIT: Yeah, now I'm at 648, too By the way, if you can switch color order, GBR is 1 char shorter:
    Code:
    {0,9238890,4671418,'XaL',12809815,13586554,'@R+','k4`',3447021,'NNv'}
    {0,16354410,4671418,'aXL',7783255,5295994,'R@+','4k`',9975021,'NNv'}
    Quote Originally Posted by caveman View Post
    You could perhaps reduce the colors to 9 (since brown appears only after purple).
    I'll try this, nice catch!

    EDIT: Unfortunately, it only saves 9 chars in the base64 string and the changes cost much more (~17 chars for "if(y>1*p='NNv';").
    Last edited by schnaader; 24th March 2012 at 00:22.
    http://schnaader.info
    Damn kids. They're all alike.

  11. #11
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    I also tried factoring the numbers (the idea was to reuse common parts, and/or find more numbers which could be written as literals),
    but it failed, because most of the colors include very large prime factors, so nothing to reuse.

  12. #12
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Thanks for the GBR tip, I used it, current results are 631(C++) and 613(C).
    Also tried literals with masking (like 'ab\377'), but it wasn't of any help.

    Found an interesting trick like this:

    while(j--){XXX;
    for(;j--;){XXX; // same length
    for(XXX;j--;){ // 1 char less


    Also X,Y arguments appeared very useful as loop counters etc
    and X+=Y*32+1 is shorter than j=Y*32+X+1
    Its also interesting that C++ accepts ++X+=Y*32, but C doesn't.

  13. #13
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    539
    Thanks
    192
    Thanked 174 Times in 81 Posts
    The color palette could be seen as ~65 additional symbols (0-9,next color) that could be added to the compressed data, so the color palette would be decoded at runtime. It would save around 30 chars (with base64) and would cost a conditional to decode those first symbols different, something like:

    Code:
    (n<65)?{
      (symbol>9)?*pal++:*pal=*pal*10+symbol;
    }:{usual decoding}
    But at least for my code, I suppose it would skew the huffman distribution too much and I doubt it would save anything at all.
    http://schnaader.info
    Damn kids. They're all alike.

  14. #14
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    My first implementation (the one with real rangecoder) had colors encoded along with other data.
    Current version has about 80 bytes of redundancy comparing to that, but appears that the extra decoding branch
    was even longer.

Similar Threads

  1. a challenge for C experts
    By Alexander Rhatushnyak in forum The Off-Topic Lounge
    Replies: 10
    Last Post: 25th January 2012, 05:29
  2. Calgary challenge
    By Matt Mahoney in forum Data Compression
    Replies: 6
    Last Post: 12th September 2010, 01:45
  3. compressing a really small 1k .COM file
    By Rugxulo in forum Data Compression
    Replies: 3
    Last Post: 28th November 2009, 01:32
  4. A Small Warning
    By encode in forum The Off-Topic Lounge
    Replies: 1
    Last Post: 30th August 2008, 21:05
  5. Fun video
    By encode in forum Forum Archive
    Replies: 7
    Last Post: 22nd October 2007, 22:26

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •