Considerable effort has recently gone into whole-genome sequencing of humans and other species. The results open new frontiers in data analysis that hold promise for groundbreaking scientific discoveries.
One objective of human genome sequencing has been to identify sources of disease and new therapeutic targets. This movement has opened the door to personalized medicine for cancer, in which the genetic makeup of an individual’s tumors is used to determine the most effective drug intervention.
Interest in studying the characteristics unique to individual cells seems obvious when considering the function of healthy cells versus tumor cells, or brain cells compared to heart cells. What has surprised scientists is the realization that two cells in the same tissue can be more different from each other, genetically, than from a cell in another organ.
For example, a small number of brain cells with a specific mutation can lead to some forms of epilepsy. Healthy people may also carry cells with these mutations, but too few to cause disease. The lineage of a cell, meaning where it came from and what events shaped its development, ultimately determines what diseases can arise.
This progression toward examining DNA at a more local level was highlighted in two recent articles in Nature (“The trickiest family tree in biology” and “Single-cell sequencing made simple”). Each article discussed an approach to learning about organisms by examining the origin of function or disease at the level of individual cells. This shift from organism-level biology to single-cell biology holds a lot of promise, but it presents a host of methodological challenges.
One approach is to construct a family tree of sorts that captures the origin of every single cell in an organism, starting from the first cell division. So far, this type of embryonic cell lineage has only been completed for one organism, Caenorhabditis elegans. The small scale and simplicity of this minuscule roundworm make it possible to construct the cell-lineage tree by simply observing and recording cell divisions through a light microscope. While C. elegans serves as a very worthy experimental model, its development doesn’t involve the chance events and biological ad-libbing that influence embryonic development in more complex organisms.
The variation between embryonic cell lineages and those found in later developmental stages of an organism (or even those resulting from repair or regeneration later in life, after development) limits the amount of detail that can be obtained. Although this makes it difficult to produce complete cell-lineage maps, advances in gene editing and sequencing have yielded results that led to important discoveries about programmed cell death, the role of stem cells against cancer or in tissue regeneration, regulatory RNAs, and the origins of tumor cells and their mutations.
Mutations serve as an invisible record of lineage, being passed from a cell to its progeny. One method that holds a lot of promise for mapping cell lineages based on mutations employs the molecular tool du jour, CRISPR-Cas9. In one example, researchers engineered zebrafish embryo genomes to contain a barcode sequence and then designed a CRISPR-Cas9 system to edit this sequence over several generations of cell division. The lineage of each cell could then be determined by tracing the shared mutations present in the barcode sequence of the cells.
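To make the idea of tracing shared mutations concrete, here is a minimal Python sketch. It is only a toy illustration, not the method used in the zebrafish study: the cell names and edit labels (`m1`, `m2`, and so on) are invented, and it assumes that each cell simply inherits all of its ancestors' barcode edits. Under that assumption, the cells that share the most edits are the most closely related.

```python
from itertools import combinations

# Hypothetical barcode edits observed in each cell (all labels are invented).
cells = {
    "cell_A": {"m1", "m2", "m4"},
    "cell_B": {"m1", "m2", "m5"},
    "cell_C": {"m1", "m3"},
    "cell_D": {"m1", "m3", "m6"},
}


def shared_edits(cells):
    """Return the set of barcode edits shared by every pair of cells."""
    return {
        (a, b): cells[a] & cells[b]
        for a, b in combinations(sorted(cells), 2)
    }


def closest_relative(name, cells):
    """Return the other cell that shares the most barcode edits with `name`."""
    others = [c for c in sorted(cells) if c != name]
    return max(others, key=lambda c: len(cells[name] & cells[c]))


pairs = shared_edits(cells)
print(sorted(pairs[("cell_A", "cell_B")]))  # edits inherited from a common ancestor
print(closest_relative("cell_C", cells))
```

In this toy data set, every cell carries `m1`, so that edit would have happened earliest, while `m2` and `m3` mark two later branches. Real lineage reconstruction has to cope with missing data and with identical edits that arise independently, which this sketch ignores.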
Another approach for single-cell analysis is single-cell RNA sequencing (scRNA-seq). Typical “bulk” RNA-seq involves sequencing the RNA of thousands of cells to produce data that represent an average of all the cells in the sample. Utilizing scRNA-seq provides a more detailed look at the cells by parsing out what subtle variations exist in the genes expressed within each individual cell.
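A small Python sketch shows why an average can hide what individual cells are doing. The numbers are made up for illustration and stand in for expression counts of a single gene; the function names are my own, not part of any sequencing toolkit.

```python
from statistics import mean

# Hypothetical expression counts for one gene in two samples of six cells each.
sample_uniform = [10, 10, 10, 10, 10, 10]  # every cell expresses the gene moderately
sample_mixed = [0, 0, 0, 20, 20, 20]       # half the cells are off, half are high


def bulk_signal(counts):
    """What a bulk measurement would report: the average over all cells."""
    return mean(counts)


def fraction_expressing(counts, threshold=1):
    """What per-cell data can reveal: the share of cells expressing the gene."""
    return sum(c >= threshold for c in counts) / len(counts)


for label, counts in [("uniform", sample_uniform), ("mixed", sample_mixed)]:
    print(label, bulk_signal(counts), fraction_expressing(counts))
```

Both samples give the same bulk average, yet only the per-cell view shows that the second sample contains two distinct populations.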
Bulk RNA-seq is complicated, but its procedures are well established. In contrast, scRNA-seq techniques are still in their infancy, and determining how to analyze the data and extract meaning takes a lot of trial and error. Because the amount of RNA in a single cell is so small, scRNA-seq leaves much less room for error; slight variations in amplification can create apparent differences between cells that do not actually exist, such as expressed genes that never get sequenced or “identical” cells that appear to differ in expression.
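One simplified way to picture this problem is to treat sequencing as random sampling. The Python sketch below assumes, purely for illustration, that each RNA molecule in a cell is detected independently with a fixed probability; the gene names, counts, and the 10% detection rate are invented, and real amplification bias is more complicated than this.

```python
import random


def capture(true_counts, capture_prob, rng):
    """Simulate detection: each molecule is kept independently with capture_prob."""
    return {
        gene: sum(rng.random() < capture_prob for _ in range(n))
        for gene, n in true_counts.items()
    }


# Two hypothetical cells with exactly the same true RNA content.
true_counts = {"gene_x": 3, "gene_y": 50}

rng = random.Random(0)  # fixed seed so the run is repeatable
cell_1 = capture(true_counts, 0.1, rng)
cell_2 = capture(true_counts, 0.1, rng)

print(cell_1)
print(cell_2)
```

Because so few molecules are sampled, a lowly expressed gene such as `gene_x` can easily come back as zero in a cell that does express it, and two cells with identical contents can end up with different observed counts.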
Even though a lot of work remains to refine the techniques used to gather and analyze single-cell data, I’m excited by the possibilities. The outlook is probably very similar to when scientists first gained the ability to generate genomic sequencing data; they aren’t exactly sure what they are looking for or what they will find, but whatever they discover is likely to change the landscape of molecular biology.