Four years ago, researchers at Harvard University successfully stored 700 terabytes of data in a gram of DNA. It wasn’t the first time DNA had been used to store data, but it was the first time such a large volume of data was stored. The work done at Harvard’s Wyss Institute demonstrated a 1,000-fold increase over what had been stored previously.
We all understand the binary system used to store data on magnetic media. To store data on DNA, the binary data is translated into the four DNA bases, T, G, A and C (the same letters used in genome mapping). The Ts and Gs represent ones and the As and Cs represent zeros, or more appropriately, pluses and minuses. And as we know, a usable storage system has to have an effective addressing scheme. Each DNA strand has a 19-bit address block at its start. To access the data, the DNA is sequenced, the same thing you would do to read the human genome. Sounds simple, right? For the same reason that I don’t really care how my Power System does it, I don’t care how the DNA experts do it, as long as I know they can access it and access it quickly.
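For readers who do want a peek under the hood, here is a minimal Python sketch of the mapping described above: As and Cs for zeros, Gs and Ts for ones, with a 19-bit address at the start of each strand. The function names, the choice to alternate between the two candidate bases, and the payload size are my own assumptions for illustration, not the Harvard team's actual method.

```python
ADDRESS_BITS = 19  # address block length mentioned in the article

ZERO_BASES = ("A", "C")  # either base stands for a 0
ONE_BASES = ("G", "T")   # either base stands for a 1


def bits_to_bases(bits):
    """Turn a string of '0'/'1' characters into a string of DNA bases."""
    strand = []
    for bit in bits:
        first, second = ZERO_BASES if bit == "0" else ONE_BASES
        # Assumption: switch to the alternate base whenever the preferred one
        # would repeat the previous base, so no base appears twice in a row.
        base = second if strand and strand[-1] == first else first
        strand.append(base)
    return "".join(strand)


def bases_to_bits(strand):
    """Turn a string of DNA bases back into '0'/'1' characters."""
    return "".join("0" if base in ZERO_BASES else "1" for base in strand)


def encode_strand(address, payload):
    """Build one strand: 19-bit address followed by the payload bytes."""
    if not 0 <= address < 2 ** ADDRESS_BITS:
        raise ValueError("address does not fit in 19 bits")
    address_bits = format(address, f"0{ADDRESS_BITS}b")
    payload_bits = "".join(format(byte, "08b") for byte in payload)
    return bits_to_bases(address_bits + payload_bits)


def decode_strand(strand):
    """Recover (address, payload bytes) from a sequenced strand."""
    bits = bases_to_bits(strand)
    address = int(bits[:ADDRESS_BITS], 2)
    payload_bits = bits[ADDRESS_BITS:]
    payload = bytes(
        int(payload_bits[i:i + 8], 2) for i in range(0, len(payload_bits), 8)
    )
    return address, payload


if __name__ == "__main__":
    strand = encode_strand(5, b"Hi")
    print(strand)
    print(decode_strand(strand))
```

Because two bases are available for every bit, the encoder has some freedom in which one to write; this sketch uses that freedom only to avoid repeating a base, and decoding does not care which of the two was chosen.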
So what does this have to do with real life and all the things you have to worry about on a daily basis? Well, Microsoft recently purchased 10 million strands of synthetic DNA to start exploring the idea of storing massive amounts of data on it. According to people who study this stuff, last year there were 4.4 trillion gigabytes of data stored around the globe. Within the next 4 years that number is expected to increase by a factor of 10. Extend that out to 2025 or 2030: where are we going to put all that data? Some smart people at places like Harvard and Microsoft think the answer will be DNA. One gram of DNA (0.035274 of an ounce) can store up to 1 billion terabytes of data (yes, that’s a billion terabytes). Based on these figures, last year’s 4.4 trillion gigabytes (4.4 billion terabytes) could be stored on about 4.4 grams of DNA (roughly 0.155 ounce), less than a fifth of an ounce of synthetic DNA.
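As a sanity check on the arithmetic, here is a small Python sketch that takes the two figures quoted above at face value: 4.4 trillion gigabytes of data worldwide and up to 1 billion terabytes per gram of DNA. The function name and the use of decimal units (1,000 gigabytes per terabyte) are my own assumptions. Under those inputs the result is about 4.4 grams; a different density figure would change it proportionally.

```python
GB_PER_TB = 1_000            # assumption: decimal units
OUNCES_PER_GRAM = 0.035274

GLOBAL_DATA_GB = 4.4e12      # 4.4 trillion gigabytes (figure from the article)
DENSITY_TB_PER_GRAM = 1e9    # 1 billion terabytes per gram (figure from the article)
GROWTH_FACTOR = 10           # expected growth over the next 4 years (from the article)


def grams_needed(data_gb, tb_per_gram):
    """Grams of DNA needed to hold data_gb gigabytes at the given density."""
    return (data_gb / GB_PER_TB) / tb_per_gram


if __name__ == "__main__":
    today = grams_needed(GLOBAL_DATA_GB, DENSITY_TB_PER_GRAM)
    later = grams_needed(GLOBAL_DATA_GB * GROWTH_FACTOR, DENSITY_TB_PER_GRAM)
    print(f"Last year's data: {today:.1f} g ({today * OUNCES_PER_GRAM:.3f} oz)")
    print(f"After 10x growth: {later:.1f} g ({later * OUNCES_PER_GRAM:.3f} oz)")
```

Even after the projected tenfold growth, the same calculation gives a figure measured in tens of grams, which is the point of the comparison.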
So one of the first questions that should be asked is: what about the life expectancy of the data? Let’s look at magnetic media. Manufacturers will tell you that the lifespan of data stored on a magnetic surface is 10 to 20 years. Interestingly, the lifespan of data stored on CD or DVD media isn’t much different. Industry claims range from as little as 2 to 5 years up to 10 to 25 years. The expected lifespan of data stored on DNA is anywhere from 1,000 to 10,000 years (did you watch Jurassic Park?).
So where is this going? I’m not smart enough to tell you that. I wouldn’t expect to see a DNA drive in your Power System server or DNA flash in your SAN any time soon. However, everything has to start somewhere and if Moore’s Law was meant to be broken, this could do it.
Michael Miller, President, Arbor Solutions, Inc.
616-451-2500
mmiller@arbsol.com