DNA Data Analysis
DNA Computational Analysis History
To understand why DNA would even be considered for data storage, consider a few of its characteristics:
- DNA is abundantly available and can be synthesized from its constituent nucleotides;
- DNA is highly reproducible: while not perfect, it can be replicated to a desired design with high accuracy;
- DNA base pairing follows specific rules (A pairs with T, and C pairs with G);
- The use and manipulation of DNA continue to become cheaper (much like the trend described by Moore’s law for transistors);
- Models for using DNA in calculations, and for mimicking the behavior of circuit components, are conceivable given such predictable behavior.
As these characteristics came to be recognized, they led to the consideration of DNA for data encoding, retrieval, and even problem solving. While the semiconductor, the basis for a transistor, relies upon a p-n junction model (see www.RavishOnElectronics.com), it is conceivable that DNA, once its behavior and limitations are understood and encoded, could serve a similar role. Mikhail Neiman proposed this sort of idea in 1964 in the Soviet journal Radiotekhnika. In the years since, work has been done both to encode information in DNA and to retrieve it afterward. Leonard Adleman of the University of Southern California showed in 1994 that DNA could be used to solve the Hamiltonian path problem, and that a properly constructed DNA system could act as a type of Turing machine (i.e. producing an output after being given a fixed algorithm). In 2002, researchers in Israel successfully demonstrated a programmable molecular computing machine built from DNA rather than silicon microchips. Circuit design elements have since been constructed from DNA, including Boolean logic operations such as OR, AND, and NOT.
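To make the Hamiltonian path idea concrete, the following is a minimal software sketch of a "generate every candidate, then filter" search, which is loosely the spirit of the DNA approach: many candidate paths form at once, and the invalid ones are discarded step by step. It is not the laboratory procedure itself. The graph, the vertex names, and the function name hamiltonian_paths are all hypothetical and chosen only for illustration.

```python
from itertools import product


def hamiltonian_paths(vertices, edges, start, end):
    """Return every path from start to end that visits each vertex exactly once.

    This brute-force version enumerates candidates one at a time; the appeal
    of a DNA-based approach is that candidates would form in parallel.
    """
    n = len(vertices)
    edge_set = set(edges)
    found = []

    # Step 1: generate every possible sequence of n vertices.
    for candidate in product(vertices, repeat=n):
        # Step 2: keep only sequences with the right start and end.
        if candidate[0] != start or candidate[-1] != end:
            continue
        # Step 3: every consecutive pair must be a directed edge of the graph.
        if any((a, b) not in edge_set for a, b in zip(candidate, candidate[1:])):
            continue
        # Step 4: every vertex must appear exactly once.
        if len(set(candidate)) != n:
            continue
        found.append(candidate)
    return found


# Hypothetical 4-vertex directed graph, for illustration only.
VERTICES = ["A", "B", "C", "D"]
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("B", "D")]

if __name__ == "__main__":
    print(hamiltonian_paths(VERTICES, EDGES, "A", "D"))
    # prints [('A', 'B', 'C', 'D')]
```

Even for this tiny graph the sketch examines 4^4 = 256 candidates, and the count grows quickly with the number of vertices. That growth is why a medium able to form enormous numbers of candidates simultaneously is attractive for this kind of problem.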
Present-day Applications of DNA Computing
Given new tools for faster sequencing at a fraction of previous costs, Church et al. (Harvard, 2012) were able to encode in DNA a book of over 53,000 words, several images, and even a computer program. Further efficiencies in encoding have been realized since, as in the 2013 European work in which 5 million bits of data were represented in a quantity of DNA likened to a “speck of dust.” With accuracy of 99.99% or higher, and with newer methods improving the error rate, DNA may well be heading in the direction of transistors and Moore’s law: the 2013 encoding price of $12,400 (and retrieval price of $220) could likely fall to a small fraction of that in the near future. An interesting property of DNA storage is that data may remain recoverable, with accuracy maintained, for 2,000 years if stored at 10 degrees C, and for up to 1 million years if stored at -18 degrees C. The difference illustrates how strongly cooling helps preserve DNA structure, and hence accuracy.
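As a rough illustration of what "encoding data in DNA" means, here is a minimal sketch that maps every two bits to one base and back again. The mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names encode and decode are assumptions made for this example; the published schemes mentioned above use their own, more careful encodings and add protection against errors.

```python
# Assumed mapping for illustration: base index 0..3 is the 2-bit value.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    strand = []
    for byte in data:
        # Read the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            strand.append(BASES[(byte >> shift) & 0b11])
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Turn a string of bases produced by encode() back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for any character outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # prints CAGACGGC
    print(decode(strand))  # prints b'Hi'
```

At two bits per base, the 5 million bits mentioned above would need roughly 2.5 million bases under this naive mapping, before any redundancy is added for error correction.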
The Immediately Imaginable DNA Future
The most readily usable application of DNA computation and analytics may be in the body itself. Here the future may build on work such as that of Shapiro et al. at the Weizmann Institute, in which a DNA-based system may, intracellularly and in vivo, not only diagnose cancer activity within a cell but also release treatment in response. This has yet to be fully realized as a cure for cancer, but one can imagine an if-then statement built from DNA’s capacity to bind (or not), tailored to specific types of cancer. Further work by others has led to the construction of entire systems of computation using DNA and associated RNA and enzymes. Advantages stressed for “genomic vs. electronic” computers include a changeable architecture, components that are constructed as needed (and changed thereafter), malleable software and hardware, and the use of molecules and ions rather than “hardwires.”
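The if-then idea can be sketched in a few lines of software. In this toy model, a probe strand "binds" when the cell's sequence contains the probe's exact reverse complement, and treatment is released only if every probe binds. The probe sequences, the cell sequences, and the function names are hypothetical. Real binding tolerates mismatches and depends on chemistry that this sketch ignores, and RNA would use U in place of T.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the sequence that would base-pair with the given strand."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def binds(probe: str, target: str) -> bool:
    """Simplified binding test: a perfect complementary match somewhere in target."""
    return reverse_complement(probe) in target


def diagnose_and_treat(probes, cell_sequence: str) -> str:
    """IF every marker probe binds THEN release treatment, ELSE do nothing."""
    if all(binds(probe, cell_sequence) for probe in probes):
        return "release treatment"
    return "do nothing"


# Hypothetical marker probes and cell sequences, for illustration only.
PROBES = ["TTGCA", "GGATC"]
DISEASED_CELL = "AATGCAACCGATCCTT"  # contains both marker sites
HEALTHY_CELL = "AATGCAACCTTTT"      # contains only the first marker site

if __name__ == "__main__":
    print(diagnose_and_treat(PROBES, DISEASED_CELL))  # prints release treatment
    print(diagnose_and_treat(PROBES, HEALTHY_CELL))   # prints do nothing
```

Requiring all probes to bind makes the condition a logical AND over the markers; swapping all for any would turn it into an OR.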
A “DNA abacus” in development by Reif, LaBean, and Winfree (at MIT and Duke) uses multi-tile structures in a binary fashion (with bases standing in for binary digits) and could potentially perform 10 trillion calculations per second (which would be a million times faster than a present-day electronic computer, as of 2015).
An entire world of DNA data analysis is easy to imagine, one that could be effective not only for its small size but also for its malleability (or, if one may call it so, plasticity). In effect, one could envision the latest version of a software program (or even a hardware component) being produced by applying a set of solutions in a prescribed order to yield the desired final product. There may still be a 55-page iTunes upgrade agreement, but it too may be stored using DNA, RNA, and/or enzymes (only to be replaced by the next version using similar components).