Thread: Help to detect unpacker

  1. #1
    Member
    Join Date
    Jan 2016
    Location
    Moscow
    Posts
    2
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Help to detect unpacker

    Hi there,
    I'm Pehat and this is my first post in this forum. My hobby is reverse engineering abandonware DOS games. Recently I found some kind of Lempel-Ziv family unpacker in two completely different old games. Its code matches byte for byte in both files, so it seems to be a standard compression algorithm that was popular in the early 90s and should have a C or asm implementation somewhere. However, my knowledge of the compressors of that era is poor, so I've tried to find something similar. It looks similar to the LZRW1 decompressor, but uses a different system of control-bit commands. I've tried to tidy up my x86 asm listing and added some comments. Here it is:
    Code:
    ; This is an LZ-like decompressor. I wish I knew the author.
    ; It reads a stream of control words and data bytes and writes the output
    ; according to the commands in the control bits (read LSB first):
    ; 1  - copy single byte
    ; 00 - copy, small offset (one byte, always negative), 2..5 bytes to copy (two control bits)
    ; 01 - copy, big offset (13 bits), 3..9 bytes to copy (3 bits),
    ;      stored in packed word oooooccc oooooooo, where "o" is offset, "c" is count;
    ;      in case of ccc == 000, count is the next byte from stream plus 1
    ;      (a zero byte there means end of the stream).
    ;
    ; Registers usage:
    ; AX - general purpose (control words, data bytes)
    ; SI - source index, offset of packed data
    ; DI - destination index, offset of unpacked data
    ; BP - control word
    ; DX - number of bits in control word remaining
    ; CX - size of block to copy from behind
    ; BX - offset of block to copy from behind (signed int16, negative)
    
    
    proc    decompress near
            mov dx, 10h             ; DX is a bit counter for BP
            lodsw
            mov bp, ax              ; BP is a control word (contains bit commands)
    
    
    HandleBits:
            shr bp, 1               ; Get the least significant bit (LSB) of BP into CF and shift right BP.
            dec dx
            jnz short GetCopyMode
    ; Read new control word if the old one is exhausted. Below there are several checks like this.
            lodsw
            mov bp, ax
            mov dl, 10h
    
    
    GetCopyMode:
            jnb short GetBytesCount ; CF=0, need to count number of bytes to copy
            movsb                   ; CF=1, single byte needs to be copied
            jmp short HandleBits
    
    
    GetBytesCount:
            xor cx, cx              ; CX needs to contain number of bytes to copy
            shr bp, 1
            dec dx
            jnz short GetOffsetSize
            lodsw
            mov bp, ax
            mov dl, 10h
    
    
    GetOffsetSize:
            jb  short ReadBigOffset ; CF=1, offset is big
            shr bp, 1               ; CF=0, offset is small
            dec dx
            jnz short FewBytesToCopy; we need to copy from 2 up to 5 bytes
            lodsw
            mov bp, ax
            mov dl, 10h
    
    
    FewBytesToCopy:
            rcl cx, 1               ; get bit 1 of the count from CF into CX LSB
            shr bp, 1
            dec dx
            jnz short LookBehind
            lodsw
            mov bp, ax
            mov dl, 10h
    
    
    LookBehind:
            rcl cx, 1               ; get bit 0 of the count from CF into CX LSB
            inc cx                  ; add 2
            inc cx                  ; because a copy shorter than 2 bytes would be a waste
            lodsb                   ; time to read offset
            mov bh, 0FFh            ; it's gonna be negative 'cause we copy from our tail
            mov bl, al              ; now BX contains negative offset
            jmp short CopyDataBytes
            nop                     ; alignment instruction, never mind
    
    
    ReadBigOffset:
            lodsw                   ; this word is packed
            mov bx, ax              ; BL becomes lower part of the offset
            mov cl, 3
            shr bh, cl              ; get 5 higher bits of BH
            or  bh, 0E0h            ; and set 3 higher bits; full offset is constructed
            and ah, 7               ; get 3 lower bits of AH
            jz  short LotsBytesToCopy
            mov cl, ah
            inc cx                  ; in this part we're going to copy from 3 up to 9 bytes, as you see
            inc cx
    
    
    CopyDataBytes:
            mov al, [es:bx+di]
            stosb
            loop    CopyDataBytes
            jmp short HandleBits
    
    
    LotsBytesToCopy:
            lodsb                   ; in this part, we can copy up to 256 bytes
            or  al, al
            jz  short UnpackComplete; end of the stream
            mov cl, al              ; otherwise, copy some more bytes
            inc cx
            jmp short CopyDataBytes
    
    
    UnpackComplete:
            retn
    endp    decompress
    Can anyone help me identify the decompression algorithm?
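For readers who prefer a high-level view, the listing above can be transcribed into Python roughly as follows. This is only a sketch that mirrors the posted assembly instruction by instruction. The helper names (`read_byte`, `read_word`, `get_bit`) and the bounds check are my own additions and are not part of the original routine.

```python
def decompress(src: bytes) -> bytes:
    """Python transcription of the posted x86 routine (a sketch, not the original)."""
    pos = 0
    out = bytearray()

    def read_byte() -> int:          # lodsb
        nonlocal pos
        value = src[pos]
        pos += 1
        return value

    def read_word() -> int:          # lodsw (little-endian)
        nonlocal pos
        value = src[pos] | (src[pos + 1] << 8)
        pos += 2
        return value

    control = read_word()            # BP
    bits_left = 16                   # DX

    def get_bit() -> int:
        # shr bp,1 / dec dx; the control word is reloaded as soon as it is
        # exhausted, before any further data bytes are read.
        nonlocal control, bits_left
        bit = control & 1
        control >>= 1
        bits_left -= 1
        if bits_left == 0:
            control = read_word()
            bits_left = 16
        return bit

    while True:
        if get_bit():                            # 1: literal byte
            out.append(read_byte())
            continue
        if get_bit():                            # 01: big offset
            low = read_byte()
            high = read_byte()
            offset = (((high >> 3) << 8) | low) - 0x2000   # 13 bits, sign-extended
            count = high & 7
            if count:
                count += 2                       # 3..9 bytes
            else:
                extra = read_byte()
                if extra == 0:                   # end of the stream
                    break
                count = extra + 1                # 2..256 bytes
        else:                                    # 00: small offset
            count = ((get_bit() << 1) | get_bit()) + 2     # 2..5 bytes
            offset = read_byte() - 0x100         # one byte, always negative
        for _ in range(count):                   # byte-by-byte, so overlaps work
            index = len(out) + offset
            if index < 0:
                raise ValueError("match points before the start of the output")
            out.append(out[index])
    return bytes(out)
```

The byte-by-byte copy loop matters: an offset of -1 with a count of 5 repeats the last output byte five times, exactly as the `CopyDataBytes` loop does.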

  2. #2
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,475
    Thanks
    702
    Thanked 645 Times in 347 Posts
    Dmitry Bortoq said that it looks like LZEXE.

  4. #3
    Member
    Join Date
    Jan 2016
    Location
    Moscow
    Posts
    2
    Thanks
    1
    Thanked 0 Times in 0 Posts
    It really is! Great thanks!

  5. #4
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    81
    Thanks
    25
    Thanked 13 Times in 12 Posts
    LZEXE compression can be quickly recognized by the very specific "LZ91" text placed close to the beginning of the executable. This is the "hallmark" of LZEXE. LZEXE was used quite a lot during the DOS era to compress executables, mostly because there wasn't much choice anyway. I'm not aware of LZEXE being used for anything other than compressing EXEs though, so I guess you're looking at some old EXEs?
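A quick check for that hallmark could look like the sketch below. The function name `looks_like_lzexe` and the 64-byte search window are my own choices. As far as I know the string sits at offset 0x1C, right after the fixed MZ header fields, but that offset is not stated in this thread, so the code simply scans the start of the file instead of relying on it.

```python
def looks_like_lzexe(data: bytes, window: int = 0x40) -> bool:
    """Return True if an MZ executable carries the "LZ91" marker near its start.

    `window` (how many leading bytes to scan) is an assumed value, chosen to
    be generous enough to cover the DOS EXE header.
    """
    if data[:2] != b"MZ":            # not a DOS MZ executable at all
        return False
    return b"LZ91" in data[:window]
```

Typical use would be `looks_like_lzexe(open("GAME.EXE", "rb").read())`, where `GAME.EXE` is a placeholder file name.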

  6. #5
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,475
    Thanks
    702
    Thanked 645 Times in 347 Posts
    LZEXE was the first EXE packer, and for some time the only one.

  7. #6
    Member
    Join Date
    Oct 2009
    Location
    usa
    Posts
    54
    Thanks
    1
    Thanked 8 Times in 5 Posts
    Hi Pehat,
    I share your passion for decompressing old DOS games. There were also the MZ .exe compressors DIET and PKLITE, and then UPX came along and crushed them all.

    One I've never been able to decompress or identify is the program called Game Wizard 3.0. Its exe seems to be both compressed with some type of LZ1 variant and encrypted as well. I'd like to reverse engineer it. All the program data is similarly compressed... Take a look at it and tell me what you think.

    Also, the old Apogee games were LZEXE compressed AND had a checksum, so they wouldn't run if they had been decompressed. Could never figure out how to hexedit those .EXEs to disable the checksum, and so run them decompressed.

  8. #7
    Member
    Join Date
    Feb 2014
    Location
    Belgium
    Posts
    2
    Thanks
    0
    Thanked 1 Time in 1 Post

    Unp

    If you are trying to uncompress old DOS .exes, Unp by Ben Castricum is a good starting point: http://unp.bencastricum.nl It also has a generic tracer mode which should unpack files compressed with unknown packers.

  9. #8
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    81
    Thanks
    25
    Thanked 13 Times in 12 Posts
    Quote Originally Posted by zyzzle
    One I've never been able to decompress or identify is the program called Game Wizard 3.0. Its exe seems to be both compressed with some type of LZ1 variant and encrypted as well. I'd like to reverse engineer it.
    The basic idea is this: let it unpack, decrypt and run its CRC check by itself, then dump the memory image and research it freely. In most cases you don't really need to know the exact encryption, compression or checksum algorithms. Real-mode DOS had no paging, so there was little option but to decrunch the whole image and launch it. This means you can dump the "real" code image at some point, when it is no longer packed or encrypted. It can be much worse on modern systems because of paging. There were plenty of (semi)automatic tools that would try to remove packers for you. They may or may not work, though: in some cases you may need, e.g., to specify the entry point manually if autodetection fails. Nowadays one can probably launch the program in a VM like qemu and use the VM's memory access facilities to investigate what's going on; the obvious advantage is that you don't have to care whether the program and OS are really alive.

    As for checksums: if a checksum gets in the way, it can sometimes be easier than you think. Once a checksum got in my way. I had no idea where it was stored or what checked it. I just patched the file first and computed a linear 16-bit sum over the whole file. Mismatch. Then I patched some unimportant parts of the file to make the 16-bit sum match the original file. Whoa, it works! But it was a really lucky shot. It could be harder with more advanced checksums and is not going to work for cryptographic hashes, but those were not used in the DOS era. Most of the time it is just a linear 8/16/32-bit checksum, or maybe an XOR of all bytes (words, ...). In some odd cases it can be a CRC, but CRC is relatively slow, so it was rarely used.
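The 16-bit sum trick described above can be sketched as follows. Everything here is an assumption for illustration: the checksum is taken to be a plain sum of little-endian 16-bit words modulo 65536, and `filler_offset` stands for some word-aligned spot in the file that is known to be unimportant. The function names are hypothetical.

```python
import struct


def sum16(data: bytes) -> int:
    """Sum of little-endian 16-bit words modulo 2**16 (odd length is zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    words = struct.unpack("<%dH" % (len(data) // 2), data)
    return sum(words) & 0xFFFF


def compensate(original: bytes, patched: bytes, filler_offset: int) -> bytes:
    """Adjust one unimportant word in `patched` so its sum16 matches `original`."""
    if filler_offset % 2 or filler_offset + 2 > len(patched):
        raise ValueError("filler_offset must be word-aligned and inside the file")
    # How much the patch shifted the sum, modulo 2**16.
    delta = (sum16(original) - sum16(patched)) & 0xFFFF
    (old_word,) = struct.unpack_from("<H", patched, filler_offset)
    fixed = bytearray(patched)
    struct.pack_into("<H", fixed, filler_offset, (old_word + delta) & 0xFFFF)
    return bytes(fixed)
```

This only works because a linear sum lets any single word absorb the difference; as noted above, it would not help against a cryptographic hash.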
