Treemix phylogeny including both subsamples of Palmar Chico.
Another potential explanation for lowered sweep sharing between replicates is that sweeps vary in their detectability based on their characteristics. Namely, sweeps that were weakly selected, incomplete, and/or ones that started at a high initial frequency prior to the onset of selection (soft sweeps) may vary in their detectability using the methods we employed. We conducted a simulation experiment to better understand the potential causes of the low shared proportion, and to measure performance to detect different kinds of sweeps more generally. We used discoal (Kern and Schrider 2016) to simulate sweeps in a 400Kb region using the average genome-wide maize mutation and recombination rates under the inferred demographic history of the maize population from Palmar Chico (Figure 2). We simulated four distinct scenarios: classical hard sweeps, where selection acts to fix an an adaptive mutation; soft sweeps, where selection is initiated after the adaptive allele reaches a specified frequency; and incomplete sweeps, where a hard sweep simulation is stopped at a specified frequency, and neutral simulations without selection. For soft sweeps, we varied the initial by drawing from a beta distribution with shape parameters 1 and 20. Incomplete sweeps finished when the adaptive allele reached a frequency of 0.5. For all three types of sweeps, we also varied the strength of selection using the parameter α = 4N0s to be 10, 50, or 100, where N0 is the present day effective population size and s is the selection coefficient. In addition to matching demography and other parameters, we used the same sampling scheme, simulation 50 individuals, than randomly choosing two non-overlapping subsets of 10 individuals (https://github.com/silastittes/ms_sub). From the simulations we assessed the True/False Positive/Negative Rates for each combination of sweep type and strength of selection (α), as well as the distribution of base pair overlap between sweep regions inferred in the the two random subsamples. The same sweep inference methods and parameters were used for these simulations and the empirical samples (see methods). Overall, we found that sweep characteristics we explore indeed impacted our overall power to detect them, and the about of overlap between the sweep regions. Namely, weakly selected sweeps had consistently lower True Positive Rates (Table S3).