Big data management: from hard drives to DNA drives


Authors

  • AMBREEN HAMADANI PhD Scholar, Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Srinagar, Jammu and Kashmir 190 006 India
  • NAZIR A GANAI Director Planning and Monitoring, Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Shalimar, Srinagar, J&K
  • SHAH F FAROOQ Research Scholar, Central University of Kashmir, Jammu and Kashmir.
  • BASHARAT A BHAT Researcher, University of Otago, New Zealand

https://doi.org/10.56093/ijans.v90i2.98761

Keywords:

Big data, Coding, DNA drives, Storage

Abstract

Information and communication technology is transforming all aspects of modern life, and in this digital era the amount of data generated every day is increasing tremendously. Current conventional storage devices are unable to keep pace with this rapidly growing volume of data, so alternative storage media are needed. DNA, which is exceptional at storing biological information, offers a promising storage capacity. With its dense storage and reliability, it may prove better than all conventional storage devices in the near future. The nucleotide bases occur in DNA in a particular sequence that represents the coded information; they are the equivalent of the binary digits (0 and 1). To store data in DNA, binary data is first converted to ternary or quaternary code, which is then translated into a nucleotide code comprising the four nucleotide bases (A, C, G, T). A DNA strand is then synthesized according to this code. The strand may either be stored in pools or sequenced back. On reading, the nucleotide code is converted back into ternary or quaternary code and subsequently into binary code, which is read just like digital data. DNA drives may have a wide variety of applications in information storage and DNA steganography.
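
The binary-to-nucleotide translation described above can be sketched in a few lines of Python. This is a minimal illustration of the quaternary route only, assuming a fixed two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T). The mapping and the function names encode and decode are assumptions made for this example and are not taken from the article; practical schemes typically add error correction and avoid long runs of the same base.

```python
# Hypothetical mapping: each pair of bits (one quaternary digit) becomes one base.
# The specific assignment below is an assumption chosen for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a nucleotide string, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate a nucleotide string back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"DNA")
print(strand)          # CACACATGCAAC
print(decode(strand))  # b'DNA'
```

In this sketch, the string returned by encode stands in for the sequence that would be synthesized, and decode stands in for the step that follows sequencing.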

Submitted

2020-03-05

Published

2020-03-06

Issue

Section

Review Article

How to Cite

HAMADANI, A., GANAI, N. A., FAROOQ, S. F., & BHAT, B. A. (2020). Big data management: from hard drives to DNA drives. The Indian Journal of Animal Sciences, 90(2), 134-140. https://doi.org/10.56093/ijans.v90i2.98761