For sixty years now, scientists have studied the role of DNA as a vehicle for the storage and transmission of genetic information from generation to generation. We have marveled at the capacity of DNA to store all the information required to describe a human being using only a 4-letter code, and to pack that information into a space the size of the nucleus of a single cell. A letter published last week in Nature exploits this phenomenal storage capacity of DNA to archive a quite different kind of information. Forget CDs, hard drives and chips: the sum of human knowledge can now be stored in synthetic DNA strands. The Nature letter, authored by scientists from the European Bioinformatics Institute in Cambridge, UK, and Agilent Technologies in California, describes a proof-of-concept experiment in which synthetic DNA was used to encode Shakespeare’s Sonnets, Martin Luther King’s “I Have a Dream” speech, a picture of the European Bioinformatics Institute, and the original Watson and Crick paper on the double-helical nature of DNA.
The authors chose DNA because of its high coding capacity, its ease of storage, and its “proven track record as an information bearer”. So how can DNA store text and images? Basically, the authors devised a code that turned the 1s and 0s used to store information in digital files into strings of As, Ts, Cs and Gs. They then synthesized DNA strands of the appropriate sequence to spell out the required information, creating many overlapping copies of each strand. In addition to having numerous copies of each sequence, the authors took care to avoid creating a DNA code with long strings of repeats, which are prone to sequencing errors. They report that they were able to sequence the synthetic DNA and decode the information accurately.
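To make the idea of a repeat-free code concrete, here is a minimal Python sketch of one way to turn bytes into a DNA string in which no letter ever appears twice in a row. It is a simplified illustration, not the authors' actual scheme: it assumes "repeats" means runs of the same letter, it writes every byte as six base-3 digits, and it leaves out the overlapping fragments and redundant copies described above. The function names (`bytes_to_trits`, `encode`, `decode`) and the choice of "A" as the starting reference letter are hypothetical.

```python
BASES = "ACGT"

def bytes_to_trits(data: bytes) -> list[int]:
    """Write each byte as six base-3 digits (3**6 = 729, enough for 256 values)."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(6):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))  # most significant digit first
    return trits

def trits_to_bytes(trits: list[int]) -> bytes:
    """Inverse of bytes_to_trits: read six base-3 digits per byte."""
    out = bytearray()
    for i in range(0, len(trits), 6):
        value = 0
        for t in trits[i:i + 6]:
            value = value * 3 + t
        out.append(value)
    return bytes(out)

def encode(data: bytes, start: str = "A") -> str:
    """Map each base-3 digit to one of the three letters that differ from
    the previous letter, so the same letter never occurs twice in a row."""
    prev = start  # assumed reference letter; it is not part of the output
    strand = []
    for trit in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]
        prev = choices[trit]
        strand.append(prev)
    return "".join(strand)

def decode(strand: str, start: str = "A") -> bytes:
    """Recover the original bytes from a strand produced by encode."""
    prev = start
    trits = []
    for base in strand:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))
        prev = base
    return trits_to_bytes(trits)

if __name__ == "__main__":
    message = b"My love shall in my verse ever live young."
    strand = encode(message)
    print(strand)
    print(decode(strand).decode("ascii"))
```

Because each letter is chosen relative to the one before it, there are only three options at every position, which is why the sketch works in base 3 rather than mapping two bits directly to each letter.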
The paper includes a discussion of the feasibility of DNA as a long-term data repository, including its barriers, pros and cons. The idea of Shakespeare’s sonnets or Martin Luther King’s speech encapsulated in DNA is (to me) a surprising one. But the exquisite simplicity of the DNA code is itself a kind of poetry, so it is perhaps fitting that it has now been used to encode something like this:
... Yet, do thy worst, old Time: despite thy wrong,
My love shall in my verse ever live young.
Here’s the paper:
Goldman, N., et al. (2013) Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature doi:10.1038/nature11875.