Thread: 60% of disk storage is used for copies

  1. #1
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts

    60% of disk storage is used for copies

    According to http://www.storagenewsletter.com/new...y-data-problem

    Out of 100 EB of disk storage systems expected to be shipped worldwide in 2013, 61 EB will be used to store copies of data, based on current rates.
    This backup storage will cost $34 billion, or $560 per TB.

    In addition, almost all of 90 EB of tape storage will be used to store backups at a cost of $1.9 billion, or $21 per TB.
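    The per-TB figures above can be checked with a few lines of arithmetic. This is only a minimal sketch: it assumes decimal units (1 EB = 1,000,000 TB), and the helper name `cost_per_tb` is made up for illustration.

```python
# Assumption: decimal units, 1 EB = 1,000,000 TB.
TB_PER_EB = 1_000_000

def cost_per_tb(total_dollars, exabytes):
    """Return the average cost in dollars per terabyte."""
    return total_dollars / (exabytes * TB_PER_EB)

disk = cost_per_tb(34e9, 61)    # disk used to store copies
tape = cost_per_tb(1.9e9, 90)   # tape used to store backups

print(f"disk: ${disk:.0f}/TB, tape: ${tape:.0f}/TB")
# prints "disk: $557/TB, tape: $21/TB"
```

    The disk figure comes out at about $557 per TB, which matches the quoted $560 after rounding.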

  2. #2
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,497
    Thanks
    733
    Thanked 659 Times in 354 Posts
    i'm pretty sure that 100% of my video archive are copies, the same is true for any home user today. anyway, i can't type more than a few bytes per second, the rest are either copies or derivatives

  3. #3
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,612
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    i'm pretty sure that 100% of my video archive are copies, the same is true for any home user today. anyway, i can't type more than a few bytes per second, the rest are either copies or derivatives
    I can type a program that would generate some gigabytes/second.

  4. #4
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    Of course, we compression nuts know that data repeats itself more often than not, at any scale you care to look at. Like a fractal.

    Check out "long memory stochastic process," "long range correlations," "1/f noise"... it's mysterious.

  5. #5
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Of course, copying isn't waste, like the article seems to imply. Having 2.5 copies of your data (on average) is a good strategy to prevent loss. And the rate is actually much higher if you include data that you can re-create or re-install. I can only type about 1 MB of text per year. Back when computers had floppy disks, I would back up only what I created myself and it would take a year to fill up a disk.

    I suppose the trend is going to be toward more replication as storage gets cheaper. It is not unusual for a cluster of a million CPUs to each have a copy of the operating system. Your body has a trillion identical copies of your DNA.
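    One possible reading of the "2.5 copies" figure is that it follows from the 61% number in the article. The sketch below assumes every stored byte is either an original or a copy of one, and that 1 MB means 10^6 bytes; both are assumptions, not something the article states.

```python
# Figures from the article cited at the top of the thread.
total_eb = 100
copies_eb = 61

# Assumption: whatever is not a copy is original data.
original_eb = total_eb - copies_eb            # 39 EB
copies_per_original = total_eb / original_eb  # total stored per original byte

# Typing rate implied by "about 1 MB of text per year" (1 MB = 10**6 bytes assumed).
SECONDS_PER_YEAR = 365 * 24 * 3600
typing_rate = 1_000_000 / SECONDS_PER_YEAR    # bytes per second

print(f"{copies_per_original:.2f} copies per original")  # prints "2.56 copies per original"
print(f"{typing_rate:.3f} bytes/second typed")           # prints "0.032 bytes/second typed"
```

    Under those assumptions each original byte is stored about 2.56 times in total, and hand-typed text averages well under one byte per second over a year.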

  6. #6
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Of course, copying isn't waste, like the article seems to imply. Having 2.5 copies of your data (on average) is a good strategy to prevent loss. And the rate is actually much higher if you include data that you can re-create or re-install. I can only type about 1 MB of text per year. Back when computers had floppy disks, I would back up only what I created myself and it would take a year to fill up a disk.

    I suppose the trend is going to be toward more replication as storage gets cheaper. It is not unusual for a cluster of a million CPUs to each have a copy of the operating system. Your body has a trillion identical copies of your DNA.

  7. #7
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    437
    Thanks
    1
    Thanked 96 Times in 57 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Of course, copying isn't waste, like the article seems to imply. Having 2.5 copies of your data (on average) is a good strategy to prevent loss. And the rate is actually much higher if you include data that you can re-create or re-install. I can only type about 1 MB of text per year. Back when computers had floppy disks, I would back up only what I created myself and it would take a year to fill up a disk.
    Creating copies for backup is one thing, but duplicates also exist in storage and computing centers because similar installations exist on multiple instances of virtual or real machines. However, this is a known issue, and most modern storage solutions provide data de-duplication algorithms to address this, i.e. the data is only stored once, but looks like multiple copies to the user. As soon as one copy is modified, the data is replicated again.

    Of course, this is not a backup strategy - if you need backups, you need something else. That's typically also part of professional storage solutions.
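    The idea of storing data once while presenting several copies to the user can be sketched with a toy content-addressed block store. This is not how any particular storage product works: the class `DedupStore`, its method names, the 4096-byte block size and the use of SHA-256 are all illustrative assumptions.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes (physical storage)
        self.files = {}   # file name -> list of digests (logical view)

    def write(self, name, data):
        """Split data into blocks and store each distinct block once."""
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # no-op if already stored
            digests.append(digest)
        self.files[name] = digests

    def read(self, name):
        """Reassemble a file from its block digests."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def physical_bytes(self):
        """Bytes actually held, as opposed to the logical file sizes."""
        return sum(len(b) for b in self.blocks.values())

# Four distinct 4096-byte blocks, standing in for a machine image.
image = b"".join(bytes([i]) * 4096 for i in range(4))

store = DedupStore()
store.write("vm1.img", image)
store.write("vm2.img", image)           # second copy adds no new blocks
size_after_copy = store.physical_bytes()

# Modify one block of the second copy; only that block is stored anew.
changed = image[:4096] + b"\xff" * 4096 + image[8192:]
store.write("vm2.img", changed)
size_after_edit = store.physical_bytes()
```

    Two logical copies of 16 KiB occupy 16 KiB physically, and changing one block of one copy grows the store by a single block. As the post says, this is not a backup: every logical copy depends on the same physical blocks. The toy store also never frees unreferenced blocks.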

  8. #8
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    How ironic. 2 copies of my post saying copying is not a waste.
    (I got a server error when I posted and I tried again)

    But yes, I agree that copying rate is much higher than 60%. Their results were based on a survey, but most people don't know what is on their computers. At least I don't.

  9. #9
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 71 Times in 55 Posts
    Quote Originally Posted by Matt Mahoney View Post
    But yes, I agree that copying rate is much higher than 60%. Their results were based on a survey, but most people don't know what is on their computers. At least I don't.
    Yeah... there are probably a lot of unintentional copies that people don't know about.

    The problem of too many copies sounds pretty easy to solve, since computers can deduplicate automatically.

    Quote Originally Posted by Matt Mahoney View Post
    Of course, copying isn't waste, like the article seems to imply. Having 2.5 copies of your data (on average) is a good strategy to prevent loss. And the rate is actually much higher if you include data that you can re-create or re-install. I can only type about 1 MB of text per year. Back when computers had floppy disks, I would back up only what I created myself and it would take a year to fill up a disk.
    I think it's arguing that backup should be handled in a methodical way, rather than through haphazard copies.

    If you were making videos or audio recordings, you'd probably be generating more data.

  10. #10
    Member
    Join Date
    Mar 2010
    Location
    Germany
    Posts
    116
    Thanks
    18
    Thanked 32 Times in 11 Posts
    Quote Originally Posted by Matt Mahoney View Post
    60% of disk storage is used for copies
    60% of my disk space is free, because of Bulat's srep
