Thread: questions about data correction

  1. #1
    Member just a worm's Avatar
    Join Date
    Aug 2013
    Location
    planet "earth"
    Posts
    96
    Thanks
    29
    Thanked 6 Times in 5 Posts

    questions about data correction

    Hello community,
    I would like to have a look into the topic "data correction".

    Let's say we have an archive in which some bits are set wrongly, and we want to correct the values of those bits.

    Can checksums be used to correct the data?

    How do you usually determine which values the bits and bytes should have?

    Does someone know/have some good materials to read?

    Do you think it makes sense to add extra data for correcting the payload to a general-purpose archive that is being distributed over the internet?

    Thank you in advance.

  2. #2
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    I was a bit involved in the development of the new RAR5 error correction scheme. Plank's research papers were almost everything we used in this work. You can start with the tutorial

  4. #3
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    Interesting research. His fast Galois field library would be good for hashing and probably other stuff.

  5. #4
    Member just a worm's Avatar
    Join Date
    Aug 2013
    Location
    planet "earth"
    Posts
    96
    Thanks
    29
    Thanked 6 Times in 5 Posts
    Thank you. I wasn't aware of the xor-trick. It seems pretty helpful.

  6. #5
    Member
    Join Date
    Jan 2014
    Location
    Russia
    Posts
    24
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Originally posted by just a worm:
    Can checksums be used to correct the data?
    You probably read that CRC can repair a single bit and Hamming Codes can repair a couple of bits…
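
    The single-bit case can be tried out with a brute-force sketch: flip each bit in turn until the stored checksum matches again. It assumes that a trusted CRC32 of the original data is available and that exactly one bit is wrong; the helper name repair_single_bit is made up for this illustration.

```python
import zlib

def repair_single_bit(data: bytes, expected_crc: int):
    """Return data with at most one bit flipped so that its CRC32 matches,
    or None if no single-bit flip produces the expected checksum."""
    if zlib.crc32(data) == expected_crc:
        return data                      # nothing to repair
    buf = bytearray(data)
    for i in range(len(buf) * 8):
        buf[i // 8] ^= 1 << (i % 8)      # try flipping bit i
        if zlib.crc32(buf) == expected_crc:
            return bytes(buf)
        buf[i // 8] ^= 1 << (i % 8)      # undo the flip and keep searching
    return None

original = b"data correction"
crc = zlib.crc32(original)               # checksum stored alongside the data

damaged = bytearray(original)
damaged[3] ^= 0x10                       # simulate one wrong bit
repaired = repair_single_bit(bytes(damaged), crc)
print(repaired == original)
```

    The search costs one CRC computation per bit, and it already fails once two bits are wrong, which is why archives rely on separate recovery data instead.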

    Protection of files and archives commonly works in a different way.

    1. All data is cut into blocks of the same size
    2. Vertical slices of the blocks (the symbols at the same offset in every block) are multiplied by a generator matrix
    3. The multiplication produces slices of parity
    4. The parity slices are combined into recovery blocks

    There are two common ways to build the generator matrix:

    1) Simple interleaved XOR-Matrix:

    100010001000
    010001000100
    001000100010
    000100010001

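    Simple XOR can repair one contiguous burst error (here up to 4 symbols long) and some random errors, but some error patterns cannot be repaired, because simple XOR is weak. In the matrix above, the 1st and 5th data symbols cannot be repaired together, because both are covered only by the first parity symbol;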

    2) Reed-Solomon codes use a matrix without zeros over a finite field:
    Vandermonde matrix: Aij=i^j
    Cauchy matrix: Aij=1/(xi-yj), where all xi and yj are distinct field elements

    With Reed-Solomon codes there are no unsolvable error patterns as long as the number of lost blocks does not exceed the number of recovery blocks: every parity symbol can repair any lost data symbol, and any recovery block can repair any data block, in any combination.
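
    For comparison, the weakness of the interleaved XOR matrix from 1) shows up in a small sketch. It assumes 12 one-byte data symbols and 4 parity symbols laid out exactly as in that matrix, and it marks lost symbols with None; the function names are made up for this illustration.

```python
from functools import reduce

K, P = 12, 4   # 12 data symbols, 4 parity symbols, as in the XOR matrix

def parity(data):
    """Parity symbol r is the XOR of data symbols r, r+4 and r+8."""
    return [reduce(lambda a, b: a ^ b, data[r::P]) for r in range(P)]

def repair(data, par):
    """Fill in lost symbols (None) wherever a parity group lost exactly one."""
    out = list(data)
    for r in range(P):
        group = range(r, K, P)
        lost = [i for i in group if out[i] is None]
        if len(lost) == 1:
            x = par[r]
            for i in group:
                if out[i] is not None:
                    x ^= out[i]          # XOR out the surviving symbols
            out[lost[0]] = x
    return out

data = list(b"hello world!")
par = parity(data)

burst = list(data)
for i in range(4, 8):                    # contiguous burst of 4 lost symbols
    burst[i] = None
print(repair(burst, par) == data)

pair = list(data)
pair[0] = pair[4] = None                 # 1st and 5th symbols share parity 0
print(repair(pair, par))
```

    The burst touches each parity group once, so it is repaired; the 1st and 5th symbols fall into the same group and stay lost.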

    The integrity of blocks is verified by CRC32 or MD5.
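
    Locating the damaged blocks can be sketched like this. It assumes CRC32 from Python's zlib module, zero padding of the last block, and an arbitrary block size of 8 bytes; split_blocks and damaged_blocks are hypothetical helper names.

```python
import zlib

def split_blocks(data: bytes, size: int):
    """Cut data into equal-size blocks, zero-padding the last one."""
    padded = data + b"\0" * (-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def damaged_blocks(blocks, crcs):
    """Return the indices of blocks whose CRC32 no longer matches."""
    return [i for i, (b, c) in enumerate(zip(blocks, crcs))
            if zlib.crc32(b) != c]

blocks = split_blocks(b"an archive distributed over the internet", 8)
crcs = [zlib.crc32(b) for b in blocks]   # stored next to the data

# Simulate corruption of one byte in block 2.
bad = bytearray(blocks[2])
bad[0] ^= 0xFF
blocks[2] = bytes(bad)

print(damaged_blocks(blocks, crcs))
```

    The checksums only say which blocks are bad; the recovery blocks do the actual repair.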

    So for data protection you should cut your data into blocks and calculate some recovery blocks. Then, when the CRCs of some data blocks no longer match, you can repair them with an equal number of recovery blocks.
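
    Repairing lost symbols with an equal number of recovery symbols can be sketched as follows. This is only an illustration, not the code of RAR5, ICEECC or RSC32: it assumes GF(2^8) with the primitive polynomial 0x11d, a Cauchy matrix with xi = i and yj = m + j, one-byte symbols, and lost symbols marked with None. All function names are made up.

```python
# GF(2^8) arithmetic via log/exp tables (polynomial 0x11d is an assumed choice).
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d
for i in range(255, 512):
    EXP[i] = EXP[i - 255]

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def inv(a):                               # a must be non-zero
    return EXP[255 - LOG[a]]

def cauchy(m, k):
    """m x k Cauchy matrix: A[i][j] = 1/(x_i - y_j) with x_i = i, y_j = m + j.
    Subtraction in GF(2^8) is XOR; needs m + k <= 256."""
    return [[inv(i ^ (m + j)) for j in range(k)] for i in range(m)]

def encode(data, m):
    """Compute m parity symbols for the given data symbols."""
    out = []
    for row in cauchy(m, len(data)):
        s = 0
        for a, d in zip(row, data):
            s ^= mul(a, d)
        out.append(s)
    return out

def solve(M, rhs):
    """Solve M x = rhs over GF(2^8) by Gauss-Jordan elimination."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c])    # pivot row
        M[c], M[p] = M[p], M[c]
        f = inv(M[c][c])
        M[c] = [mul(f, v) for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                g = M[r][c]
                M[r] = [v ^ mul(g, w) for v, w in zip(M[r], M[c])]
    return [row[n] for row in M]

def recover(data, par):
    """Rebuild lost data symbols (None) from surviving parity symbols."""
    k, m = len(data), len(par)
    A = cauchy(m, k)
    lost = [j for j in range(k) if data[j] is None]
    rows = [i for i in range(m) if par[i] is not None][:len(lost)]
    if len(rows) < len(lost):
        raise ValueError("more lost data symbols than surviving parity symbols")
    rhs = []
    for i in rows:
        s = par[i]
        for j in range(k):
            if data[j] is not None:
                s ^= mul(A[i][j], data[j])   # move known terms to the right side
        rhs.append(s)
    fixed = solve([[A[i][j] for j in lost] for i in rows], rhs)
    out = list(data)
    for j, v in zip(lost, fixed):
        out[j] = v
    return out

data = list(b"hello world!")
par = encode(data, 4)
damaged = list(data)
for j in (0, 4, 5, 11):                   # any 4 losses, including 1st and 5th
    damaged[j] = None
print(bytes(recover(damaged, par)))
```

    In a real archive the same calculation is repeated for every vertical slice, so each symbol position in a recovery block protects the same position in all data blocks.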

    You can download ICEECC from ice-graphics or my RSC32 from livebusinesschat to play with. My RSC32 supports matrices up to 2000000x2000000 under Win32 (and a 64-bit version does not exist -))) )

    So you can, say, split your data into 1000000 blocks and calculate 100000 recovery blocks, i.e. 10% redundancy. Every recovery block can repair any data block.
