Coaxing ancient DNA to reveal its history delivers surprises and even improves ways of working with badly damaged present-day DNA.

As a species, we may be 100,000 years older than previously thought. A research team believes it has found the earliest known _Homo sapiens_ remains in Jebel Irhoud,
Morocco, which are around 300,000 years old¹. Fossils help scientists piece together the ancient history of our and other species. Now that DNA can be extracted from fossils, analyzed with
high-throughput sequencing and reconstructed with computational tools, DNA is becoming a kind of molecular fossil². “It is the most amazing data in the world right now,” says David Reich, a
population geneticist and evolutionary biologist at Harvard Medical School. In many cases paleogenomics is the only way to truly look at what happened in the past, as opposed to making
highly educated guesses, says Tom Gilbert, an evolutionary biologist at the Centre for GeoGenetics of the Natural History Museum of Denmark, which is affiliated with the University of
Copenhagen. “It's really like going back in time,” he says.

Neanderthal DNA was found in the soil at an archaeological site in El Sidrón, Spain. Credit: El Sidrón research team

Because
it has been preserved for a long time, ancient DNA (aDNA) can shed new light on genomes, epigenomes and Earth history. Ancient human DNA from between 7,000 and 45,000 years ago has helped
researchers discover striking aspects of European population history³. DNA can be extracted from ancient bones, teeth, hair, eggshells, paleofeces and even soil. In the absence of bones,
researchers sifted through soil in multiple Eurasian Pleistocene-era caves and found ancient human DNA⁴. When analyzing ancient human DNA, researchers use the human reference genome. With
many organisms, however, even a close living relative may still be evolutionarily distant, says Beth Shapiro, evolutionary biologist at the University of California at Santa Cruz. _Toxodon_,
a hoofed, extinct mammal, is an example where “we don't know what their closest living relative is,” she says. Whether human or _Toxodon_, what's challenging about aDNA is that
it's a mess.

INTRIGUING, MESSY ADNA

It is hard to extract aDNA efficiently, says Gilbert. Researchers need enough material but must also responsibly minimize the amount of precious
sample they use to generate complete genomes. “It's not an easy balance,” he says. Fossils can contain a low absolute amount of endogenous aDNA, and some fossils have no endogenous DNA
at all, says Qiaomei Fu, a researcher at the Institute of Vertebrate Paleontology and Paleoanthropology in Beijing, where she and her team are currently working on the human prehistory of
Asia. As DNA ages, chemical changes occur that are read as sequence changes, says Matthias Meyer, evolutionary biologist and methods developer at the Max Planck Institute for Evolutionary
Anthropology in Leipzig, Germany. These changes were not well understood until high-throughput sequencing established a pattern: a cytosine that has undergone deamination into uracil is read
as thymine. Over time, ancient DNA breaks into pieces, not with blunt-ended breaks but with sticky, single-stranded overhangs. “And in these short single-stranded overhangs, you have a very
high deamination rate, and so your uracils accumulate particularly at the ends,” says Meyer. It can be helpful to treat aDNA with uracil-DNA glycosylase or similar enzymes
to cut the DNA where the uracils appear, says Gilbert. Uracils accumulate only slowly over time, although the rate depends on temperature and water. The cytosine-to-uracil conversion is
a hydrolytic reaction, so the warmer and wetter it is, the faster uracils will appear, he says.

A reconstruction of a 300,000-year-old face based on _H. sapiens_ fossils in Jebel Irhoud, Morocco. Credit: S. Freidline, MPI EVA

“We used to complain about the damage at the ends of the molecules,” says Shapiro, but uracil occurrence is now being used to authenticate aDNA. Reich
calls this pattern of DNA damage “a sanity check.” In a batch of molecules, he and his colleagues restrict analyses to molecules that show such damage.

FIGHTING CONTAMINATION

Contamination
is a constant threat. “In the past, especially when working with modern human remains, there's always been doubts,” says Meyer. As Gilbert explains, there have been multiple relatively
high-profile examples of problems related to contamination. People contaminated samples when handling a human skeletal element and then performing PCR amplification. This was before the
advent of high-throughput sequencing. Researchers use sterile excavation techniques, careful lab protocols and clean rooms, and perform PCR in rooms separate from where DNA extraction is
performed. “This isn't to say that contamination isn't a problem, but one can at least incorporate the uncertainty associated with it into analyses,” says Gilbert. Typical aDNA
damage patterns help researchers detect contamination, but doing so is not always easy. The patterns have been established mainly with aDNA from temperate zones and permafrost. Discoveries in
other regions can prompt debate. In 2014, researchers reported the analysis of a skeleton found in a cave on Mexico's Yucatán Peninsula that was estimated to be between 12,000 and
13,000 years old⁵. After analyzing mitochondrial DNA (mtDNA) from a tooth and bones, they stated that their findings support a link between Paleoamericans and modern Native Americans.

Scientists need to be vigilant about aDNA contamination, says Tom Gilbert. Credit: T.C.G. Bosch

Meyer and his colleagues raised doubts about this interpretation because high-throughput
sequence evidence to confirm the presence of aDNA was lacking. The bone or the sampling equipment might have been contaminated with modern Native American DNA that was then amplified with
PCR. In a published exchange, the study authors countered that DNA damage caused by contaminants can take on forms expected of aDNA. In their view, independent replication matters for
authentication in human aDNA studies, and damage patterns across samples from different environments might show an underappreciated variance. Viewpoint differences about these and other
damage patterns remain, but there is agreement about the fact that the majority of aDNA in a bone, for example, is not from the 'owner' of that bone, says Shapiro. Most fossils are
massively contaminated with microbial DNA. “This is irritating, and makes analyses much more costly than they would otherwise be,” says Gilbert. Unless a lab is specifically studying
metagenomes, this issue can be addressed through the use of enrichment techniques pre-sequencing and data-filtering after sequencing.

HUMAN DNA ENRICHMENT

Microbial contamination led Fu to
work on her “home-brew solution method,” a hybridization enrichment strategy for mitochondrial and nuclear DNA, which she co-developed with Meyer as a PhD student at the Max Planck Institute
in Leipzig⁶. Enriching highly fragmented aDNA calls for overlapping probes, which limits the size of genomic regions that can be targeted. To address this limitation, the team used
oligonucleotides made on arrays to construct probe libraries and combined them into one 'superprobe library'. This was the main strategy used in their work on Ice Age Europe, which
she co-authored as a postdoctoral fellow in Reich's lab³. The fossil record had revealed that Europe was first populated 45,000 years ago. By assembling and analyzing genome-wide data
from 51 ancient humans between 7,000 and around 45,000 years old, the researchers tracked genetic changes over time. It appears a founding population of ancient humans was dispersed
throughout Europe. This branch disappeared and was replaced through migration. Around 19,000 years ago, at the end of the Ice Age, there was population migration from the area of current
Spain; 14,000 years ago another migrating group arrived, likely from the east.

Extraction of aDNA happens in clean rooms. Credit: S. Tüpke, MPI EVA

For this work, the teams extracted DNA in
dedicated clean rooms and set up sequencing libraries. Fu brought her 'home-brew' method again to bear: in-solution hybrid capture to enrich for many single-nucleotide
polymorphisms (SNPs)—between 390,000 and 3.7 million of them. The team synthesized oligonucleotides 52 base pairs long that targeted these SNPs and hybridized aDNA to these probes. Without
this strategy, she says, the team would have been unable to obtain DNA from that many individuals from a pool so contaminated with microbial DNA. Such human population studies are
remarkable, says Gilbert about this and other research. Yet, he says, one must bear in mind that labs draw conclusions from imperfect sample sets and the information is restricted by the
available samples. “The more things we look at, the more we can refine our ideas of the past.” Almost on an annual basis, findings and data add to the story of Europe's and
Australia's peopling. “This is natural—it's how science works—but it does mean researchers have a responsibility to be very careful how they word their discoveries, bearing in mind
that their results may not always be set in stone,” he says. Fu says that when working with aDNA, it is also advisable to “always think you are so dirty” and to “always feel you will
contaminate your samples and will easily introduce cross-contamination.” For computational data analysis, she recommends constant speculation about what might be wrong and advises
researchers to check results from different angles and with different methods. Enrichment for mtDNA was also at the heart of the work of a multinational group of scientists that identified
both Neanderthal and Denisovan DNA in soil samples at seven archaeological sites across Eurasia⁴. The technique presents the possibility of detecting hominin groups at sites that lack
skeletal remains. A typical excavation delivers thousands of animal bones “but you find very, very few human fossils,” says Meyer. The study's first pass was to check the state of DNA
preservation. The team looked for mammalian DNA, expecting and finding much DNA from mammoth, bison and deer bones. “What really shocked us was how much DNA there is in sediment,” he says.
They found trillions of DNA fragments in 50 mg of soil—an amount that fits on the tip of a steak knife. In the analysis, the team used mtDNA as 'bait' to pull out DNA fragments
with similar mtDNA, says Meyer. DNA from primates or great apes but also Neanderthal, Denisovan or archaic human sequence will hybridize to this 'bait'. Sequence analysis then
helped filter the data down to hominin DNA. Previous work has shown that aDNA can be isolated from soil, says Gilbert⁷, and the limits have mainly been financial. He hopes that as sequencing
costs drop, other groups, too, will be able to try such analyses. Research with aDNA brings ever-new aspects to light about _H. sapiens_. There were Neanderthals, other ancient humans
called Denisovans and early modern _H. sapiens_. Our species diverged from the other two more than 500,000 years ago, but despite the separate lineages, there was dating and mating among the
groups—so-called admixture events. Our current-day DNA, with its traces from our past, shows that we have both Neanderthal and Denisovan DNA from those encounters. Sometimes aDNA analysis
leads to a reassessment of fossils.

METHODS SURPRISES

Fossils from around 28 hominin individuals dated to be more than 400,000 years old have been excavated in caves in Spain's Sierra
de Atapuerca, where one site is called Sima de los Huesos, the 'pit of bones'. The fossils appear Neanderthal-like, but mtDNA analysis of the highly degraded DNA indicated that the
bones belong to relatives of Denisovans, eastern Eurasian relatives of Neanderthals. In a later study, nuclear DNA analysis showed that these hominin individuals are more closely related to
Neanderthals than to Denisovans⁸. These fossils probably belong to early Neanderthal ancestors or their close relatives, says Meyer, who adds that the odd mtDNA findings indicate that these
groups' population history is more complex than can be picked up from currently available data. A key technique in this and other work, says Meyer, was single-stranded sequencing
library preparation, in which the two strands of DNA are unzipped and converted into separate libraries. It was developed for an analysis of DNA from a tiny well-preserved piece of bone that
turned out to be from a Denisovan individual. The approach increased the sequencing library yield by an order of magnitude, he says, which was needed because all they had to work with to
reconstruct a high-quality sequence was the tip of a juvenile's finger. This library prep method has changed the way the group works and the types of projects the team can take on, he
says. And it's a method he and his group continue to develop and use to generate reference genomes for Neanderthals, Denisovans and early humans.

A single-stranded sequencing library prep makes the most out of precious little sample. Credit: MPI EVA; E. Dewalt, Nature Research

Each cell is likely to have several hundred copies of mtDNA, but the bones in the Spanish caves
showed only traces of DNA. The mtDNA enrichment combined with single-stranded DNA sequencing library prep helped this project, says Meyer. Even though sequencing costs have dropped, it
would be impossible to generate sequence from so many fragments without mtDNA enrichment, says Meyer. Most humans did not die in caves. They left only small traces of organic matter, which
meant that researchers had to isolate DNA from hundreds of samples. Using mtDNA rather than nuclear DNA makes it easier to discern human from the plentiful deer and hyena DNA, he says. The
fragments can then be assembled and compared to reference sequence to determine what is human-like DNA, as opposed to, for example, cave-bear-like. Meyer and colleagues recently applied this
single-stranded library prep method to present-day formalin-fixed samples with quite damaged DNA⁹. “We see a ridiculous improvement in yield,” he says. They obtained 3,100 times more
molecules than with double-stranded library prep. The gains are due to a number of factors. Classic DNA library prep, with its enzymatic manipulations and purification steps, always leads to
molecule loss. The single-stranded method uses the material more efficiently, essentially doubling the amount of DNA. Gain is also due to the fact that most of the library prep takes place
on solid support. “All the subsequent reactions are taking place on beads,” says Meyer. Buffer and enzymes can be changed without any substantial DNA loss, and DNA-purification steps can be
eliminated. The results with damaged present-day DNA show the role niche methods can play in other fields after originating in an area that some might consider “very peculiar,” he says. The
method is also more efficient at retaining short DNA fragments, which might be as short as 17–20 base pairs with aDNA. That's too short to align unambiguously, but that does not stop
Meyer and others from pushing methods forward and dreaming about possibilities.

WISH LIST

With fragments less than 30 base pairs long “it really gets tricky,” says Meyer, to discern an
ancient human DNA fragment from, say, microbial DNA, but it makes for an interesting computational challenge awaiting a solution that would greatly help the field (see Box 1, “Computing
aDNA”). Another item on his wish list is aDNA repair, such as ways to fix aDNA breaks. “If you could seal them, you could make your molecules longer again, you could perhaps make more
molecules for analysis,” he says. As scientists learn more about how aDNA decays, new ideas and methods can help them get more DNA and information out of the bones. Shapiro worked on an aDNA
project with results indicating that the team might have found a different form of aDNA damage. They used a sequencing platform by a now-defunct company called Helicos Biosciences.
It's hard to know whether more could have been learned, she says, and “I think there's certainly room to see different forms of decay as the technology for generating the data
improves.” Reich sees multiple frontiers in aDNA research, one of which is the development of higher-sensitivity methods to get more out of very difficult or older samples. “Another frontier
is to automate ancient DNA analysis, to make it more efficient and cheaper,” he says. Over the past two years, Meyer and his colleagues have been using more automation, which has increased
the throughput “at least by a factor of ten, probably more,” he says. The lab's liquid-handling systems handle most of the sample prep pipetting steps. Gilbert sees some labs turn to
robots and automation, but, he says, “whether they are the best solution or not is something I haven't decided on yet.”

Qiaomei Fu developed a 'home-brew' hybridization enrichment strategy for mitochondrial and nuclear aDNA. Credit: Institute of Vertebrate Paleontology and Paleoanthropology, Beijing, China

As the costs associated with high-throughput
sequencing drop “we can increasingly afford to consider generating high-coverage ancient genomes,” says Gilbert. That high coverage requires plenty of material, but teams battle the low
efficiency of DNA extraction. And both extraction and sequencing library prep are not only hard to perform but also difficult to scale up to the population level. Much DNA has been extracted from fossils,
“but still it's frustrating that there are so many interesting fossils out there where we just can't manage to get any DNA from,” says Meyer. Getting DNA sequences from _Homo
floresiensis_ or _Homo naledi_, for example, is still a dream. Some labs work to reconstruct epigenomes, for example, by analyzing DNA extracted from ancient bison. The process requires a large amount of
DNA and also damages it, which makes it a poor choice for aDNA. There are methylation maps of both Neanderthal and Denisovan genomes based on inferences from 'naturally'
occurring base damage and existing data. Ancient human DNA methylation analysis, a kind of ancient gene expression analysis, is possible but for now only with bones, says Meyer. One could
compare gene expression in current-day bone to that in ancient bone. “It would be great to do this from ancient brains,” he says, but unfortunately there is no ancient soft tissue.

BOX 1: COMPUTING ADNA

Labs can work on aDNA data with many computational tools and pipelines. “I'd say the biggest computational challenge is for people who want to study ancient
metagenomes—making sense of a mix of true ancient metagenome and modern bacterial contaminants is a very hard problem,” says Gilbert. Classic mapping approaches work well for mapping short
fragments, says Alexander Seitz, an aDNA computational methods developer who is wrapping up his PhD at the University of Tübingen in the bioinformatics lab of Kay Nieselt. When mapping human
DNA, researchers can use tools such as BWA or Bowtie. For the specific traits of aDNA, labs can use EAGER and PALEOMIX, which are computational pipelines that map reads against a known
reference in order to reconstruct aDNA, he says. Tools such as mapDamage and PMDtools can be used to verify that the samples are aDNA and not modern DNA. Identifying genomic rearrangements
or gene loss in aDNA requires _de novo_ assembly methods, which is where the software developed by Seitz and Nieselt comes in. It's called MADAM, for iMproving Ancient DNA AsseMbly.
“When using enrichment, missing genes cannot be identified, as the corresponding DNA fragments were not enriched,” says Seitz. _De novo_ assembly tools for modern DNA samples for which there
is no closely related reference include SOAPdenovo2 and SPAdes. But given how short the aDNA fragments are, the team wanted to improve the options by applying de Bruijn graphs in a specific
way. De Bruijn assemblers represent the input reads as a graph with nodes and edges. Each node stands for a _k_-mer, a string of fixed length _k_ that is shorter than a read. An edge is added
between two nodes when the end of one _k_-mer overlaps the start of the other. For example, if _k_-mer A is ATGCG and _k_-mer B is TGCGA, the last four bases of A match the first four of B, so there will be an edge between _k_-mer A and _k_-mer B. But with aDNA, the choice of fixed
_k_-mer length is challenging because a sample has many different read lengths, says Seitz. MADAM takes a two-layer approach. In a first computational pass, the contiguous sequences are
assembled from short reads in a de Bruijn graph that uses multiple _k_-mers at various lengths. In a second computational step these 'contigs' are combined.

REFERENCES

1. Richter, D. et al. _Nature_ 546, 293–296 (2017).
2. Orlando, L., Gilbert, M.T. & Willerslev, E. _Nat. Rev. Genet._ 16, 395–408 (2015).
3. Fu, Q. et al. _Nature_ 534, 200–205 (2016).
4. Slon, V. et al. _Science_ http://dx.doi.org/10.1126/science.aam9695 (2017).
5. Chatters, J.C. et al. _Science_ 344, 750–754 (2014).
6. Fu, Q. et al. _Proc. Natl. Acad. Sci. USA_ 110, 2223–2227 (2013).
7. Willerslev, E. et al. _Science_ 300, 791–795 (2003).
8. Meyer, M. et al. _Nature_ 531, 504–507 (2016).
9. Gansauge, M.-T. et al. _Nucleic Acids Res._ 45, e79 (2017).

AUTHOR INFORMATION

Vivien Marx is technology editor for _Nature Methods_. Correspondence to Vivien Marx.

CITE THIS ARTICLE

Marx, V. Genetics: new tales from ancient DNA. _Nat Methods_ 14, 771–774 (2017). https://doi.org/10.1038/nmeth.4367

Published: 01 August 2017
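
As a rough illustration of the de Bruijn graph idea described in Box 1, the Python sketch below builds a graph whose nodes are _k_-mers and whose edges link _k_-mers that follow one another in a read. It is a minimal sketch under stated assumptions, not MADAM or any published assembler: the function names `kmers` and `build_graph` and the toy reads are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def kmers(read, k):
    """Return the k-mers of a read in order (empty if the read is shorter than k)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_graph(reads, k):
    """Build a graph in which each node is a k-mer.

    An edge A -> B is added when B directly follows A in some read,
    which means the last k-1 bases of A equal the first k-1 bases of B.
    """
    graph = defaultdict(set)
    for read in reads:
        ks = kmers(read, k)
        for a, b in zip(ks, ks[1:]):
            graph[a].add(b)
        for node in ks:
            graph.setdefault(node, set())  # keep nodes that have no outgoing edge
    return dict(graph)


# Hypothetical toy reads of unequal length, loosely mimicking fragmented aDNA.
reads = ["ATGCGA", "TGCGAT", "GCGA"]
graph = build_graph(reads, k=5)

# Reads shorter than k contribute no k-mers and are effectively lost.
usable = [r for r in reads if len(r) >= 5]

for node, successors in sorted(graph.items()):
    print(node, "->", sorted(successors))
```

With _k_ = 5, the four-base read yields no _k_-mers at all, which hints at why choosing one fixed _k_ is awkward when read lengths vary as widely as they do in aDNA.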