Big data management: from hard drives to DNA drives
https://doi.org/10.56093/ijans.v90i2.98761
Keywords:
Big data, Coding, DNA drives, Storage
Abstract
Information and communication technology is transforming all aspects of modern life, and in this digital era the amount of data generated every day is increasing tremendously. Conventional storage devices are unable to keep pace with this rapidly growing data, so there is a need to look for alternative storage media. DNA, which is exceptional at storing biological information, offers a promising storage capacity. With its dense storage and reliability, it may prove better than all conventional storage devices in the near future. The nucleotide bases occur in DNA in a particular sequence that represents the coded information; they are the equivalent of the binary digits (0 and 1). To store data in DNA, binary data is first converted to ternary or quaternary form, which is then translated into a nucleotide code comprising the 4 nucleotide bases (A, C, G, T). A DNA strand is then synthesized according to this code. The strand may either be stored in pools or sequenced back. On reading, the nucleotide code is converted back into ternary or quaternary form and subsequently into binary code, which is read just like digital data. DNA drives may have a wide variety of applications in information storage and DNA steganography.
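The conversion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' scheme: the digit-to-base mapping (0=A, 1=C, 2=G, 3=T), the fixed width of six ternary digits per byte, the starting base "A", and all function names are assumptions made for illustration. The quaternary variant stores two bits per base. The ternary variant uses a rotating rule in which each ternary digit selects one of the three bases that differ from the previous base, so no base is repeated consecutively.

```python
# Assumed mapping of quaternary digits to bases; the abstract does not fix one.
DIGIT_TO_BASE = "ACGT"
BASE_TO_DIGIT = {base: digit for digit, base in enumerate(DIGIT_TO_BASE)}
TRITS_PER_BYTE = 6  # assumption: 3**6 = 729 >= 256, so six trits hold one byte


def encode_quaternary(data: bytes) -> str:
    """Translate bytes into bases, two bits (one quaternary digit) per base."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # most significant digit first
            bases.append(DIGIT_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def decode_quaternary(strand: str) -> bytes:
    """Invert encode_quaternary: every four bases give back one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASE_TO_DIGIT[base]
        out.append(byte)
    return bytes(out)


def encode_ternary(data: bytes, start: str = "A") -> str:
    """Convert bytes to ternary digits, then to bases with a rotating rule."""
    prev, bases = start, []
    for byte in data:
        trits = []
        for _ in range(TRITS_PER_BYTE):
            trits.append(byte % 3)
            byte //= 3
        for trit in reversed(trits):  # most significant trit first
            # Each trit picks one of the three bases other than the previous one.
            choices = [b for b in DIGIT_TO_BASE if b != prev]
            prev = choices[trit]
            bases.append(prev)
    return "".join(bases)


def decode_ternary(strand: str, start: str = "A") -> bytes:
    """Invert encode_ternary: bases back to trits, then trits back to bytes."""
    if len(strand) % TRITS_PER_BYTE != 0:
        raise ValueError("strand length must be a multiple of 6")
    prev, trits = start, []
    for base in strand:
        choices = [b for b in DIGIT_TO_BASE if b != prev]
        trits.append(choices.index(base))
        prev = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)
```

Under these assumptions, `encode_quaternary(b"Hi")` yields `CAGACGGC`, and both decoders recover the original bytes from their own encoder's output. The quaternary form is denser (four bases per byte against six), while the ternary form avoids runs of identical bases.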
License
The copyright of the articles published in The Indian Journal of Animal Sciences is vested with the Indian Council of Agricultural Research, which reserves the right to enter into any agreement with any organization in India or abroad, for reprography, photocopying, storage and dissemination of information. The Council has no objection to using the material, provided the information is not being utilized for commercial purposes and wherever the information is being used, proper credit is given to ICAR.