Preprint on demographic history inference

February 15, 2017

Population genetic inference based on single-locus statistics, as summarized in the allele frequency spectrum, is known to be powerful for inferring demographic history. But two-locus statistics carry substantial additional information, because they also capture patterns of linkage disequilibrium (LD). In this work, we developed a novel framework for demographic inference from two-locus statistics, characterized its power, and applied it to Drosophila melanogaster.
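To make "two-locus statistics" concrete, here is a minimal sketch of two standard LD summaries, D and r^2, computed from the four haplotype counts at a pair of biallelic loci. The function name `ld_stats` and its argument names are hypothetical, chosen for illustration; this is not code from the preprint, whose framework works with the full two-locus sampling distribution rather than these summaries alone.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) for two biallelic loci from haplotype counts.

    Hypothetical helper for illustration: n_AB is the number of sampled
    haplotypes carrying allele A at locus 1 and allele B at locus 2, etc.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("at least one haplotype is required")

    f_AB = n_AB / n              # frequency of the AB haplotype
    p_A = (n_AB + n_Ab) / n      # allele frequency of A at locus 1
    p_B = (n_AB + n_aB) / n      # allele frequency of B at locus 2

    # D measures the departure of the haplotype frequency from independence.
    D = f_AB - p_A * p_B

    # r^2 normalizes D^2 by the allele-frequency variances;
    # it is undefined when either locus is monomorphic.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = D * D / denom if denom > 0 else float("nan")
    return D, r_squared
```

For example, `ld_stats(5, 0, 0, 5)` describes two perfectly associated loci, while `ld_stats(25, 25, 25, 25)` describes two loci with no association. The single-locus allele frequencies are identical in both cases (0.5 at each locus), which is exactly the kind of information the allele frequency spectrum alone cannot see.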

We first developed a novel solution of the two-locus diffusion equation, upon which we built a composite-likelihood inference framework. Applying this framework to simulated data, we found that two-locus statistics are substantially more powerful than single-locus statistics. In particular, the depth and duration of a bottleneck are strongly confounded when single-locus statistics are used in inference, but not when two-locus statistics are used. Moreover, two-locus statistics allow diversity-independent estimation of effective population size. We then applied our approach to a Zambian population of D. melanogaster. In this application, we found that two-locus statistics indeed yield much more precise inference of demographic parameters and much more accurate recapitulation of the observed LD decay. Interestingly, we also inferred a substantially lower ancestral effective population size for D. melanogaster than previous studies did.
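As a rough illustration of what a composite likelihood looks like in this setting, the sketch below sums multinomial log-likelihood terms over pairs of loci, treating the pairs as if they were independent even though linked pairs are not. It is a generic form under stated assumptions, not the preprint's implementation: the function name `composite_log_likelihood`, the data layout, and the expected probabilities are all hypothetical, and in practice the expected probabilities would come from a model such as a solution of the two-locus diffusion equation.

```python
import math


def composite_log_likelihood(observed, expected):
    """Sum multinomial log-likelihood terms over pairs of loci.

    observed: list of count vectors, one per locus pair (hypothetical layout);
              each vector counts how often each two-locus configuration was seen.
    expected: list of probability vectors of matching shape, assumed to be
              supplied by some demographic model.

    The multinomial coefficient is omitted because it does not depend on
    the model parameters and so does not affect the maximization.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")

    total = 0.0
    for counts, probs in zip(observed, expected):
        if len(counts) != len(probs):
            raise ValueError("count and probability vectors must match")
        for c, p in zip(counts, probs):
            if c > 0:
                if p <= 0:
                    # The model assigns zero probability to observed data.
                    return float("-inf")
                total += c * math.log(p)
    return total
```

Maximizing this quantity over demographic parameters gives point estimates. Because the terms are not truly independent, the curvature of the composite likelihood generally understates uncertainty, so confidence intervals need a separate adjustment or resampling step.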