Thread: Hexadecimal collisions

  1. #1
    Member
    Join Date
    Jun 2018
    Location
    Slovakia
    Posts
    80
    Thanks
    22
    Thanked 3 Times in 3 Posts

    Hexadecimal collisions

    I'd like to ask whether the hexadecimal interpretation of data is collision-free, i.e. whether two different inputs can end up with the same hex form.
    Last edited by CompressMaster; 4th June 2019 at 20:56. Reason: typo

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,134
    Thanks
    179
    Thanked 921 Times in 469 Posts
    Depends on interpretation.
    A nibble value (4 bits) can be bijectively mapped to a single hex digit [0-9A-F].
    So a byte (two nibbles) can be mapped to exactly two hex digits, and back.
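
    As a rough illustration of that bijection, here is a minimal Python sketch. The names `HEX_DIGITS`, `byte_to_hex` and `hex_to_byte` are made up for this example; it assumes a fixed two-digit, uppercase-only form and simply checks all 256 byte values.

```python
HEX_DIGITS = "0123456789ABCDEF"

def byte_to_hex(b: int) -> str:
    """Map one byte (0..255) to exactly two uppercase hex digits."""
    return HEX_DIGITS[b >> 4] + HEX_DIGITS[b & 0x0F]

def hex_to_byte(s: str) -> int:
    """Inverse mapping: two uppercase hex digits back to one byte."""
    return (HEX_DIGITS.index(s[0]) << 4) | HEX_DIGITS.index(s[1])

# Encode every possible byte value once.
codes = [byte_to_hex(b) for b in range(256)]

# No two bytes share a hex string, and decoding restores each byte.
all_distinct = len(set(codes)) == 256
round_trips = all(hex_to_byte(byte_to_hex(b)) == b for b in range(256))
print(all_distinct, round_trips)
```

    With the format fixed like this (always two digits, one letter case), every byte has exactly one hex form, so there is nothing to collide.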

    But some kinds of hex syntax can be redundant.
    For example, hex constants in C/C++ (0x...) allow any number of leading zeroes, accept both [a-f] and [A-F] for 10..15,
    and also have optional type suffixes.
    This kind of syntax is redundant: it provides multiple ways to encode the same binary data.
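
    The redundancy can be seen with a small Python sketch. Python's `int(s, 16)` is used here only as a stand-in for a C-like parser (it accepts the `0x` prefix, leading zeroes and both letter cases, but not C type suffixes), and the `spellings` list is just an example.

```python
# Several different spellings, in a C-like hex syntax, of two values.
spellings = ["0x41", "0X41", "0x0041", "0x4d", "0x4D"]

# Parsing is many-to-one: different strings collapse to the same number.
values = [int(s, 16) for s in spellings]

# Re-encoding in one fixed form (two uppercase digits) is one-to-one.
canonical = [format(v, "02X") for v in values]

print(values)     # the first three entries are equal, as are the last two
print(canonical)
```

    The redundancy is in the text-to-binary direction only: several strings decode to one value, but no string decodes to two different values.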

  3. The Following User Says Thank You to Shelwien For This Useful Post:

    CompressMaster (10th June 2019)

  4. #3
    Member
    I mean something like this: suppose I have 257 one-byte text files, each containing a single character, Latin or non-Latin (like "ф"), taken from charmap. There are more than 257 such characters, so a collision must happen in principle. So how does a binary compiler know that a given hex value stands for "A" rather than "ф"? I'm also afraid that some characters are encoded with more than two nibbles (3 or 4), so even if the collision probability is quite low, how does the compiler know to encode "AD" instead of "4D8A"?

    So, not every file can be expressed in hexadecimal and compiled back to binary?

    For larger files (10 bytes or so), the probability of a collision is very low, but it's still possible in general.

  5. #4
    Administrator Shelwien's Avatar
    1) You can't have 257 different 1-byte files, since 1 byte has only 256 different values.

    2) Currently there's no direct mapping of binary data (byte values) to characters.
    Some codes have a special meaning (end-of-line etc.), and some others might be ignored or replaced by a text editor.
    https://en.wikipedia.org/wiki/Character_encoding

    3) A single character in text is not always encoded as one data byte.
    In your example with "A"/"ф", "A" = 41 and "ф" = D1 84 (UTF-8), i.e. one byte versus two.

    4) You can disassemble any file to asm db 0x?? hex syntax and assemble it back losslessly.
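
    Points 3) and 4) can be sketched in Python as below. This is only an illustration under stated assumptions: `to_db_lines` and `from_db_lines` are hypothetical helper names, the `db 0x??` listing format is a simplified guess at what an assembler would accept, and no real assembler is invoked.

```python
def to_db_lines(data: bytes, per_line: int = 8) -> str:
    """Render bytes as 'db 0x??, 0x??, ...' lines (simplified asm-like syntax)."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("db " + ", ".join(f"0x{b:02X}" for b in chunk))
    return "\n".join(lines)

def from_db_lines(text: str) -> bytes:
    """Parse the listing produced by to_db_lines back into the original bytes."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith("db "):
            raise ValueError(f"unexpected line: {line!r}")
        for token in line[3:].split(","):
            out.append(int(token.strip(), 16))
    return bytes(out)

# "A" followed by the Cyrillic letter from the example (U+0444).
text = "A\u0444"
data = text.encode("utf-8")      # one byte for "A", two bytes for the second character
listing = to_db_lines(data)
print(listing)

# Any byte sequence survives the round trip, not just text.
every_byte = bytes(range(256))
lossless = from_db_lines(to_db_lines(every_byte)) == every_byte
print(lossless)
```

    The hex listing only ever describes bytes. Which character those bytes stand for is decided later by whatever encoding the reader applies, so the hex step itself cannot confuse "A" with another character.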

  6. The Following User Says Thank You to Shelwien For This Useful Post:

    CompressMaster (10th June 2019)

  7. #5
    Member
    Quote Originally Posted by Shelwien View Post
    4) You can disassemble any file to asm db 0x?? hex syntax and assemble it back losslessly.
    Is there a command-line (CMD) tool for that?

  8. #6
    Administrator Shelwien's Avatar

  9. The Following User Says Thank You to Shelwien For This Useful Post:

    CompressMaster (11th June 2019)

  10. #7
    Member
    Thanks Shelwien! Now it works as expected. I had previously used a binary-to-hexadecimal (and vice versa) CMD converter, but it didn't work properly because I had given it input in hexadecimal instead of binary. My mistake, sorry.
