Comment on 'Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi'

  1. Benjamin Auxier (corresponding author)
  2. Anna Bazzicalupo
  1. Wageningen University, Netherlands
  2. Montana State University, United States
Cite this article as: eLife 2019;8:e47301 doi: 10.7554/eLife.47301

Abstract

Chen et al. recently reported evidence for inter-nucleus recombination in arbuscular mycorrhizal fungi (Chen et al., 2018a). Here, we report a reanalysis of their data. After filtering the data by excluding heterozygous sites in haploid nuclei, duplicated regions of the genome, and low-coverage base calls, we find the evidence for recombination to be very sparse.

https://doi.org/10.7554/eLife.47301.001

Introduction

For many years arbuscular mycorrhizal fungi (AMF) were presumed to be asexual as no one had witnessed sexual structures in these fungi. This was puzzling because AMF retain core meiosis genes (Halary et al., 2011), indicating that a meiosis-like process most likely occurs in this lineage. Previous evidence for recombination later turned out to be based on duplicated gene copies (Croll and Sanders, 2009), or ribosomal RNA sequences that were paralogs (Pawlowska and Taylor, 2004; Maeda et al., 2018). Recently, based on work comparing single nuclei whole-genome sequences to bulk sequencing data, new evidence for recombination in these fungi was reported (Chen et al., 2018a). The isolates were dikaryotic, containing nuclei of two classes defined by their mating type (MAT) locus (Ropars et al., 2016). For each sequenced nucleus, PCR-amplification was attempted to assign a mating type class (MAT-1 up to MAT-5). Recombination was then inferred based on: (i) base-pair calls classed as one mating type found in the alternate mating type; (ii) nuclei of the same mating type showing variation in consecutive blocks of single nucleotide polymorphisms (SNPs); (iii) SNPs from nucleus 7 (SL1 strain) being more similar to SNPs of the alternate mating type, consistent with a recombination event spanning the MAT locus.

Here, we ask how strong the signal of within-strain recombination is in the data from Chen et al. (2018a) once heterozygous sites in haploid nuclei, duplicated regions of the genome, and low-coverage base calls are excluded. By removing data that cannot confidently be distinguished from sequencing errors and repeated regions, we find that the evidence for recombination is very sparse. We also report specific examples of these possible errors to justify our more stringent filtering of the data.

Results

The effect of filtering positions based on reads

Our first analysis was of the dataset reported in Supplementary file 6 of Chen et al. (2018a), used for both Figures 2 and 3 of that manuscript. We filtered out: (1) positions where any single nucleus was heterozygous (defined as sites with read depth >10, and alternate allele >10%), (2) any individual site with fewer than five reads of coverage, and (3) positions with more than one high-confidence BLAST hit using the settings specified in Chen et al. (2018a). We applied the filters individually or in combination, to see how many of the variable sites inferred as signals of recombination would be removed. Applying each filter individually removed 19–77% of recombined positions. Filtering of low-coverage sites had the strongest effect, a ~75% reduction. Applying all three filters together removed 91% of recombined sites (Figure 1). Notably, these filters had much less effect on the total number of analyzed sites, with the combined application of all three filters reducing the total number of sites by only ~22%.
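As an illustration only, the three filters described above could be sketched as follows. The function names, the data layout (a mapping from nucleus to read counts), and the reading of "alternate allele >10%" as a fraction of reads are assumptions made for this sketch, not a description of the scripts actually used.

```python
# Hypothetical sketch of the three filters; thresholds are taken from the text.
MIN_DEPTH = 5                 # filter 2: drop calls with fewer than five reads
HET_MIN_DEPTH = 10            # filter 1: heterozygosity is assessed only above this depth
HET_MIN_ALT_FRACTION = 0.10   # filter 1: minor-allele read fraction that flags a call


def is_heterozygous(major_reads, minor_reads):
    """Return True if a call in a (haploid) nucleus looks heterozygous."""
    depth = major_reads + minor_reads
    return depth > HET_MIN_DEPTH and minor_reads / depth > HET_MIN_ALT_FRACTION


def filter_position(calls, blast_hits):
    """Apply all three filters to one SNP position.

    calls: dict mapping nucleus name -> (major_reads, minor_reads).
    blast_hits: number of high-confidence BLAST hits for the position
                (the self hit counts as one).
    Returns the calls that survive; an empty dict means the position is dropped.
    """
    if blast_hits > 1:                                       # filter 3: duplicated region
        return {}
    if any(is_heterozygous(*c) for c in calls.values()):     # filter 1: any nucleus heterozygous
        return {}
    # filter 2: remove individual low-coverage calls, keep the rest
    return {nucleus: c for nucleus, c in calls.items() if sum(c) >= MIN_DEPTH}
```

With the read counts quoted in the legend of Figure 2, `is_heterozygous(85, 66)` is true (nucleus 21) while `is_heterozygous(51, 1)` is false (nucleus 2), so a position containing both calls would be removed.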

Figure 1
Filtering SNP data shows a decrease of 91% in number of recombined sites, but only ~20% decrease in total sites.

Left panel shows the effect of filtering on all sites included in Supplementary file 6 of Chen et al. (2018a), while right shows the effect on the number of recombined positions. Recombined positions identified based on second criterion in Figure 1—figure supplement 1. Different symbols show the effect on the three different strains (A4, A5, and SL1) used in our re-analysis.

https://doi.org/10.7554/eLife.47301.002

Recombination, when involving crossing over, exchanges physical blocks between homologous chromosomes, resulting in consecutive allelic differences. We calculated the number of recombined sites, as well as the number of recombined blocks (runs of consecutive SNPs of the alternate haplotype), shown in Table 1 (details of identified blocks can be found in Table 1—source data 1). Based on the analysis presented in Chen et al. (2018a), all isolates show recombined blocks, in some cases spanning over a thousand base pairs. However, applying our three filters reduced these blocks. After filtering, no consecutive SNPs remained for strain A4. Filtering applied to strains A5 and SL1 reduced the number of recombined blocks to 2 and 4, respectively, and also reduced the length of the remaining blocks.
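A minimal sketch of how recombined blocks could be counted is given below. It assumes that each SNP on a scaffold has already been flagged as recombined or not, and that block length is measured from the first to the last SNP of the run, inclusive; both the function names and this length convention are assumptions for illustration.

```python
def recombined_blocks(flags):
    """Find runs of more than one consecutive recombined SNP.

    flags: list of booleans in scaffold order, True where the SNP carries
           the alternate haplotype.
    Returns a list of (first_index, last_index) tuples, one per block.
    """
    blocks = []
    run_start = None
    for i, flag in enumerate(flags):
        if flag and run_start is None:
            run_start = i                      # a new run begins
        elif not flag and run_start is not None:
            if i - run_start > 1:              # keep runs of at least two SNPs
                blocks.append((run_start, i - 1))
            run_start = None
    # close a run that extends to the last SNP
    if run_start is not None and len(flags) - run_start > 1:
        blocks.append((run_start, len(flags) - 1))
    return blocks


def block_length_bp(positions, block):
    """Length in bp from the first to the last SNP of a block, inclusive."""
    first, last = block
    return positions[last] - positions[first] + 1
```

Isolated recombined SNPs are deliberately not reported as blocks, matching the ">1 consecutive SNP" definition used in Table 1.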

Table 1
Lengths of recombination events before and after additional filtering.
https://doi.org/10.7554/eLife.47301.004
Isolate (mating types) | Original data: recombined positions* | Original data: recombined blocks (>1 consecutive SNP recombined) | Original data: SNPs in longest recombined block (length in bp) | Filtered data: recombined positions* | Filtered data: recombined blocks (>1 consecutive SNP recombined) | Filtered data: SNPs in longest recombined block (length in bp)
A4 (Mat-1/Mat-2) | 54/314 | 33 | 5 (1131) | 0/14 | 0 | 1 (1)
A5 (Mat-3/Mat-6) | 41/183 | 18 | 22 (2145) | 2/18 | 2 | 6 (670)
SL1 (Mat-5/Mat-1) | 111/112 | 20 | 16 (1872) | 22/9 | 4 | 6 (429)
*Numbers before and after the slash refer to the two mating types, in the order listed in the leftmost column. Numbers were calculated based on the criteria shown in Figure 1—figure supplement 1A. Table 1—source data 1 contains a list of all the recombined blocks identified.

A specific example of repeated regions associated with biallelic sites in haploid nuclei

Chen et al. (2018a) interpreted consecutive SNPs differing between nuclei of the same mating type as a sign of recombination, as highlighted in Figure 3 of their article. The filtering used in Chen et al. (2018a) removed multi-copy sites ‘[i]f BLAST results returned more than two good hits’, but retained regions with two BLAST hits. This could lead to the inclusion of SNPs that are heterozygous due to duplicated regions of the genome.

To show how repeated regions may lead to a false signal of recombination, we focused on an example highlighted in Figure 3 of Chen et al. (2018a), and discussed in the main text of that article. We found that several positions on scaffold 70 from isolate A4 were heterozygous in several nuclei (Figure 2), although they are treated as homozygous in Chen et al. (2018a). High sequencing depth (>30) eliminates rare sequencing errors as the cause. To test if duplicated regions could be the cause of the heterozygosity, we performed a BLAST search against the A4 reference genome with sequences from scaffold 70:100354–100657. This search resulted in two BLAST matches: the self match on scaffold 70, as well as an additional match on scaffold 3570 (Figure 2B). When the short reads of the dikaryon (bulk sequencing of all nuclei) were aligned to the reference genome, both these SNPs on scaffold 70 and their match on scaffold 3570 were heterozygous, and the BLAST hit result for both showed 100% identity match. Thus, this repeated sequence seems to have been assembled as a chimera of the two variants in both scaffolds, and the short reads from either copy are mapped equally to both.

We feel this example illustrates the need to exclude repetitive regions from analyses of recombination.

Figure 2
Heterozygous positions in single nuclei from a duplicated region were treated as homozygous in Chen et al. (2018a) and reported as evidence for a long stretch of recombination.

Panel (A) shows the base calls for positions on scaffold 70 for six nuclei (four nuclei of mating type M-1 and two of M-2, as indicated in the top row). Each base call was assigned to a mating type class (green or yellow) in Chen et al. (2018a) based on an unspecified criterion. Variation between nuclei of the same mating type (e.g. variation among nuclei 2, 21, 22, and 24) is interpreted as recombination. We used their Illumina reads to show the ratio of reads supporting alternative nucleotides for each position. For example, in strain A4, mating type 1, nucleus 2, position 100454, Chen et al. (2018a) called the base as a G with a mating type ‘green’, and 51 of 52 reads matched G. However, for nucleus 21, only 85 of 151 reads supported an A at that position, while the other 66 supported a G. Panel (B) shows the alignment of the region shown in (A) with its best BLAST hit region on scaffold 3570. Heterozygous sites in the mapped reads of the dikaryon are indicated in bold, with the two alternate and reference bases shown slightly above/below. Gray boxes surround those sites included in (A). Note that both regions are heterozygous at the same aligned sites, and with the same alternate base for each heterozygous site. Graphic of (A) modified from Figure 3 of Chen et al. (2018a), with the addition of nuclei 21 and 24.

https://doi.org/10.7554/eLife.47301.006

Confidence in low coverage sites to infer recombination

The data presented in Chen et al. (2018a) were filtered with a minimum of two reads. This is a very low threshold, and insufficient even for a consensus in the event of disagreeing reads. To look at the effect of low coverage on the signal of recombination, we compared the distribution of read depths between random and recombined sites. We first needed to identify recombined sites, as the method was lacking from the original manuscript, so we applied a parsimony criterion as detailed in Figure 1—figure supplement 1. This method is imperfect and certainly underestimates recombination, as it cannot identify recombined sites when equal numbers of nuclei within a mating type have alternate genotypes. Our method identified 733 positions, sufficient for analysis. Looking at the distribution of read depths, SL1 nuclei had ~95% fewer high-coverage sites than A4 and A5 nuclei (an average of 97 sites with read depth >10 per nucleus for SL1, versus 2290 for A4 and 2441 for A5; Figure 3). We note here, as described in Table 1 of Chen et al. (2018a), that SL1 nuclei cover much less of the genome (14%) compared to A4 and A5 (53% and 42%, respectively). Another fact visible from Figure 3 is that, for A4 and A5, low-depth sites are overrepresented among recombined sites compared to sub-sampled non-recombined sites (Wilcoxon rank-sum test: A4, p=3.2×10^−9; A5, p=2.9×10^−6). We note that for nuclei from isolate SL1, fewer overall recombined sites can be identified since the decreased breadth of coverage reduces overlap between nuclei, making it difficult to say whether this pattern of excess low-coverage sites is also present (p=0.11).

Figure 3
Recombined sites are overrepresented for low coverage sites.

Top row: Distribution of sub-sampled read depths for non-recombined sites of individual nuclei for the three isolates, showing decreased coverage overall for SL1. Bottom row: Distribution of read depths for recombined sites does not mirror the distribution of random sites. Sites identified as recombined based on the parsimony criterion diagrammed in Figure 1—figure supplement 1. Note that read depth is plotted on a log scale.

https://doi.org/10.7554/eLife.47301.007

Genome-wide pairwise SNP differences are reduced after filtering

We then assessed the evidence for genome-wide recombination based on pairwise SNP differences. This is the analysis presented in Figure 3 of Chen et al. (2018a), showing overall more recombination in SL1 than A4 and A5, indicated by a ‘mosaic pattern’. We again note that, because sequence from SL1 nuclei covers very little of the assembly (average of 14% from Table 1 of Chen et al., 2018a), very few positions will be covered in both members of any pair of SL1 nuclei (14% in one nucleus × 14% in the other nucleus ≈ 2% in both). After applying our filters, we find that in A4 and A5 almost all differences between nuclei of one mating type disappear (Figure 4). For nuclei from SL1, the filtering reduced the differences within a mating type, but since these nuclei cover so little of the genome, the overall dataset is reduced such that on average nuclei only share 9 SNPs that can be compared. Many nuclei have no overlapping SNPs and no comparison can be made (black squares in Figure 4). A few nuclei of opposite mating types, such as nuclei 17 and 25, show high similarity, but for these pairs the similarity is based on only one or two shared SNP positions.
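To make the dependence on shared positions concrete, a pairwise comparison could be sketched as below. The representation of each nucleus as a mapping from SNP position to called base, and the function name, are assumptions for illustration; the distance matrices in our analysis were produced in R.

```python
def pairwise_similarity(calls_a, calls_b):
    """Compare two nuclei over the SNP positions called in both.

    calls_a, calls_b: dicts mapping SNP position -> called base, containing
                      only positions that passed filtering.
    Returns (fraction_identical, n_shared). The fraction is None when the
    two nuclei share no positions, i.e. no comparison can be made.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None, 0
    identical = sum(1 for pos in shared if calls_a[pos] == calls_b[pos])
    return identical / len(shared), len(shared)
```

Returning the number of shared positions alongside the similarity makes it visible when an apparently high similarity rests on only one or two SNPs.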

Figure 4
Genetic similarity of SL1 is strongly affected by SNP filtering methods.

(A) Left panels show heatmaps generated from the original data presented in Supplementary file 6 of Chen et al. (2018a). (B) Right panels show data after filtering. Black squares represent pairs that do not share any SNPs. (C) Average number of SNPs in the dataset from Chen et al. (2018a) and after filtering. Note that SL1 has the lowest number of SNPs due to low sequencing breadth. (D) Average number of pairwise overlapping SNPs per nucleus. Note that after filtering, nuclei from SL1 have fewer than 10 overlapping SNPs on average, again due to low sequencing breadth.

https://doi.org/10.7554/eLife.47301.008

Confirming the placement of nucleus 07 (SL1) could be strong evidence for recombination

In Figure 2 of Chen et al. (2018a), nucleus 07 of SL1 shows a strong similarity with nuclei of the alternate mating type, seen by the clustering of nucleus 07 with MAT-5. However, this nucleus was PCR-genotyped to be of mating type MAT-1. The incongruence between PCR-genotyped mating type and the SNP clustering would be evidence of recombination spanning the MAT locus. To confirm this, we looked in the mapped reads of each nucleus to find reads mapping to either alternate mating type locus (Figure 4—figure supplement 1). We found that there were no Illumina sequencing reads of nucleus 07 mapped to either mating type locus, indicating the whole genome amplification step may not have amplified the mating locus. As such, the sequencing data provide no evidence for the mating type of this nucleus. The PCR result, unconfirmed by Illumina data, is therefore the only remaining evidence for recombination after read filtering. Given that many billions of bases were sequenced in an Illumina run without a single read mapping to the mating type locus of this nucleus, we find cross-contamination of the PCR to be a more likely scenario.

Conclusion

Finding a balance between filtering poor data and losing informative data is a critical component of any analysis. For this dataset, we provide evidence for the necessity of stringent filtering to avoid inferences based on erroneous or misleading data. We do not consider our filters to be particularly strict, as removing low coverage sites, repeated regions, and heterozygous sites from haploid data is commonplace in genomic analyses. In SL1, three blocks of consecutive SNPs remained after our filtering, and two regions in A5. Some of these regions likely remain because our heterozygosity filter requires a minimum of ten reads, thus low coverage heterozygous sites are not excluded. This analysis used the first 100 contigs, covering approximately 10 Mb. As such, the handful of putative recombined SNPs that remain cannot be confidently separated from amplification/sequencing noise. While small blocks of genetic exchange may be compatible with gene conversion, the limited number of markers involved greatly increases the difficulty in identifying high-confidence gene conversion events (Wijnker et al., 2013; Qi et al., 2014).

Mapping recombination inside repetitive regions would require longer reads than available from standard short-read technologies. This is a formidable task due to the input requirements of PacBio technologies, but it has been accomplished using linked short reads from individual pollen cells (Sun et al., 2019). Notably, the use of a more contiguous reference genome will actually include additional repetitive regions, and the exclusion of repetitive regions will become even more important. As the apparent recombined blocks are much smaller than the contigs, a more contiguous genome assembly would not change our analysis. Single nucleus genome amplification with multiple displacement amplification produces extremely variable genome coverage. Normalization of the data was shown to improve the quality of AMF genomic data when coverage is variable (Montoliu-Nerin et al., 2019). In addition to normalization, low coverage sites could still be used with a model-based approach, which incorporates the associated uncertainty (Hinch et al., 2019; Bloom et al., 2013). Finally, removing heterozygous sites from haploid single-nuclei seems like a self-evident requirement.

The conserved meiosis genes found in the genomes of Glomeromycotan species strongly suggest a meiosis-like process allowing recombination and re-shuffling of genetic material among genomes. Uncovering the details of this process would be a major scientific breakthrough. Given this importance, claims made regarding recombination in Glomeromycota require rigorous examination. While we acknowledge that the models presented in Chen et al. (2018a) are a valuable framework to test hypotheses for meiosis-like mechanisms found in these fungi, the data presented are not robust enough to support or reject them. As such, we believe that one of the greatest remaining mysteries in mycology remains unsolved.

Materials and methods

Raw data

We obtained the SPAdes assemblies for the three dikaryotic R. irregularis isolates from NCBI, as well as the paired-end read libraries from the dikaryons and the paired-end reads of the single nuclei. Details of the accession numbers used are found in Chen et al. (2018a).

Read processing

Short reads were cleaned with FASTP (Chen et al., 2018b), then aligned using BWA mem as in Chen et al. (2018a). Reads per nucleus were analyzed using the Python module pysam (Heger and Jacobs, 2019), and BLAST searches were scripted using Biopython (Cock et al., 2009). Scripts used are available in a GitHub repository (https://github.com/BenAuxier/Chen.2018.Response; Auxier, 2019; copy archived at https://github.com/elifesciences-publications/Chen.2018.Response). The parsimony criterion for identification of recombined sites, shown in Figure 1—figure supplement 1A, was applied manually. Filtering of Excel files and calculations of recombined sites were performed with the R statistical language, which was also used to prepare plots with ggplot2 and distance matrices using ape (Wickham, 2016; Paradis and Schliep, 2019).

We compared the coverage representation between non-recombined and recombined sites. We subsampled from non-recombined sites to match the number of reads in recombined sites and performed a Wilcoxon rank-sum test in R between the subsampled set of non-recombined reads and the recombined reads.
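The comparison was run in R; purely as an illustration of the procedure, a self-contained sketch in Python is given below. It uses a two-sided rank-sum test with a normal approximation and tie correction but no continuity correction, so p-values may differ slightly from those returned by R's `wilcox.test`. The function names and the fixed random seed are assumptions for this sketch.

```python
import math
import random


def subsample(depths, n, seed=0):
    """Draw n read depths without replacement (seed fixed for repeatability)."""
    return random.Random(seed).sample(depths, n)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1            # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test; returns (U statistic for x, p-value)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:                          # every value tied: no evidence of a shift
        return u1, 1.0
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))
```

A call such as `rank_sum_test(subsample(non_recombined_depths, len(recombined_depths)), recombined_depths)` mirrors the comparison described above, with both variable names standing in for lists of per-site read depths.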

Mating-type loci identification and mapping

As the location of the mating type loci was not specified in the results of Chen et al. (2018a), we identified them based on data from Ropars et al. (2016). Specifically, we used the primer sequences kary001, kary002, and kary003 as query sequences for BLAST searches against the A4, A5, and SL1 genomes. Each primer sequence had strong matches against two different scaffolds, consistent with divergent idiomorphs as found in Ropars et al. (2016). As these primers only target a small region, we extended the locus using the annotations found on NCBI to identify the boundaries of the pair of genes. These locations are reported in Figure 4—figure supplement 1—source data 1. As no annotations are available for SL1, we used the entire sequence of the idiomorph on scaffold511 of A5 as a BLAST query to identify the homologous regions.

With the mating locus identified, we then counted the number of reads that mapped to each sequence using samtools (Li et al., 2009).
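The counts themselves were obtained with samtools. As a rough, dependency-free illustration of what is being counted, the sketch below tallies mapped reads from SAM-format text lines whose leftmost aligned position falls inside a region. The function name and the start-position convention are assumptions made here; samtools counts reads that overlap the region, so numbers can differ at the region edges.

```python
def count_reads_in_region(sam_lines, contig, start, end):
    """Count mapped reads whose leftmost position lies in [start, end] (1-based).

    sam_lines: iterable of SAM text lines (header lines are skipped).
    """
    count = 0
    for line in sam_lines:
        if line.startswith("@"):               # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:                   # not a complete alignment record
            continue
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4:                         # read is unmapped
            continue
        if rname == contig and start <= pos <= end:
            count += 1
    return count
```

A nucleus for which this count is zero at both mating-type scaffolds provides no sequencing evidence for either mating type, which is the situation described for nucleus 07.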

Calculation of overlapping SNPs

To calculate the expected number of overlapping SNPs found in Figure 4D, we used the following formula:

$$\frac{\text{number of bases called for isolate in Supplementary file 6}}{\text{number of positions}} \times \frac{1}{\text{number of nuclei}} = \text{Proportion covered per nucleus}$$
$$(\text{Proportion covered per nucleus})^{2} = \text{Expected pairwise overlap proportion between nuclei}$$
$$\text{Expected pairwise overlap proportion} \times \text{number of positions} = \text{Expected number of shared positions}$$
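The three steps of this calculation can be written out as a short function. The function and argument names are chosen here for illustration, and the numbers in the usage note are hypothetical rather than taken from the dataset.

```python
def expected_shared_positions(bases_called, n_positions, n_nuclei):
    """Expected number of SNP positions called in both nuclei of a pair.

    bases_called: total base calls for the isolate, summed over all nuclei.
    n_positions:  number of SNP positions listed for the isolate.
    n_nuclei:     number of nuclei sequenced for the isolate.
    Returns (proportion per nucleus, pairwise overlap proportion, shared positions).
    """
    proportion_per_nucleus = bases_called / n_positions / n_nuclei
    # assumes coverage of the two nuclei is independent
    overlap_proportion = proportion_per_nucleus ** 2
    return (proportion_per_nucleus,
            overlap_proportion,
            overlap_proportion * n_positions)
```

For a hypothetical isolate with 1000 positions, 10 nuclei and 1400 base calls in total, each nucleus covers about 14% of positions, a pair of nuclei is expected to share about 2% of positions, and about 20 positions are therefore expected to be comparable between any two nuclei.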

Decision letter

  1. Raphael Mercier
    Reviewing Editor; INRA, France
  2. Patricia J Wittkopp
    Senior Editor; University of Michigan, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Comment on 'Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi'" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Raphael Mercier as the Reviewing Editor, Patricia Wittkopp as the Senior Editor, and Peter Rodgers, the eLife Features Editor. The reviewers have opted to remain anonymous.

We invite you to submit a revised version of your manuscript that addresses the comments of reviewer #2 and reviewer #3 (please see below; reviewer #1 did not raise any points for you to address). Please also submit a point-by-point response to these comments.

Reviewer #2:

The authors present a rather convincing case for a re-appraisal of the data and conclusions of Chen et al., 2018. However, there are two major issues that should be addressed before publication.

1) More information is needed in order for readers to be able to correctly interpret the data presented in Table 1 and to relate it to the parsimony methodology presented in Figure 1—figure supplement 1. Is the number of recombined positions as defined in Figure 1—figure supplement 1B? If so, what do the numbers before and after the slash in the table entry represent? The number of recombined positions out of the total number of positions? If so, then is the total number of positions counted as only those positions where all nuclei were able to be genotyped at that position? If that is the case, then why are these numbers different from the number of shared SNPs presented in Figure 4C? And why would the number before the slash be greater than the number after the slash (as it is for SL1 after filtering)? For "recombined blocks", does this refer to what is shown and described as "sites" in Figure 1—figure supplement 1A? More information is needed to describe the criteria used for this analysis. I can imagine three possible haplotype situations that could be considered as recombinant sites according to what is presented in Figure 1—figure supplement 1A: (a) all contiguous SNPs match the haplotype assigned to the opposite mating type, (b) only one of the contiguous SNPs matches the other haplotype, and (c) multiple SNPs match the other haplotype.

These scenarios all have different likelihoods of representing true recombinant events. For example, scenario (a) only represents a true recombination event if the mating type is correctly assigned (as later discussed by the authors). For scenario (b), it is rather difficult to assign this to a recombination event, especially if it is only one SNP in the middle of a series of contiguous SNPs that would otherwise match the haplotype associated with its mating type or just a single SNP on that scaffold. Here, the most probable explanation might be a mutation. Scenario (c) has the strongest likelihood of representing a true recombination event, but only if there is a switch where there are contiguous SNPs from one haplotype to contiguous SNPs from another haplotype. Multiple haplotype switching would require multiple recombination events, which seems unlikely given that double crossovers are unlikely to occur within the short sequence lengths of the reported blocks in Supplementary file 2. It is not clear to me which of these scenarios is included in the parsimony criterion for sites. For positions, the attribution of SNP differences between mating types to recombination over other processes is unclear. In Figure 1—figure supplement 1B, for positions 5 and 6, only one nucleus (D) is different from the others. This could arise through recombination or through mutation (especially if it is only one SNP among many on a scaffold – which is not diagrammed in the example shown in Figure 1—figure supplement 1B, but could have been scored in the data). If it is several contiguous positions, then recombination is more likely than several independent mutations that would give rise to the other haplotype. The authors should clarify this in the text and/or by revising Figure 1—figure supplement 1.

2) My other major concern is the analysis presented in Figure 3 and associated commentary. The authors state "Another fact visible from Figure 3 is that for A4 and A5, recombined sites are overrepresented by sites with a read depth of 2 compared to random sites". I agree that it looks that way in the figure, but I would like to see some statistics to support this claim. The number of random sites is much, much greater than the number of recombined sites, further complicating a simple visual comparison. I suggest that the authors subsample the number of random sites to match the number of recombined sites and display this in the plot. Also, would "non-recombined" sites be a more appropriate term than random sites? The legend of this figure says that recombined sites are identified as recombined based on the parsimony criterion diagrammed in Figure 1—figure supplement 1. Is the read depth plotted for each individual SNP in a recombined site? Or is the average depth across all SNPs in a recombined site? The discrete values suggest that it is the former or that is actually referring to recombinant positions rather than sites.

Reviewer #3:

Auxier and Bazzicalupo express their concerns regarding an earlier study on inter-nucleus recombination in arbuscular mycorrhizal fungi, and tested their concerns regarding the data analysis by filtering the presented evidence for recombination using information on repetitive regions/heterozygous sites as well as short read alignment coverage information. In general, the search for rare events in the genome cannot be performed with relaxed filters, as a false-positive signal (even if it is low) will affect rare events much more than frequent events.

Figure 1 is absolutely convincing and clearly shows that the new filtering more strongly affects the sites that are annotated as recombined sites, as compared to the sites that are not recombined. This technical difference between recombined sites and non-recombined sites can only be true if these two sets of sites are technically different. There is no reason why recombination should lead to such a difference. This is true for the repeat analysis/heterozygous site analysis as well as for the low coverage analysis. In consequence, I do share the concerns regarding the finding of the original study.

In consequence, I fully agree with the last statement of the manuscript: even though the new filtering does not remove all signals that could be recombination-induced, the (very little) remaining signal for recombination "are not robust enough to support or reject" meiosis-like mechanisms.

Moreover, I would not even agree that the remaining evidence for recombination, the PCR-based assessment of the mating type of nucleus 07 (SL1), is strong evidence for recombination. It is a non-replicated experiment of a single event (as far as I understand) and thus does not meet the general criteria to support a conclusion of such large impact.

My only concern regarding the comment:

Even if stated in the original manuscript, I would not agree that longer conversion tracts support recombination events more than single marker conversions. Recombination does not necessarily exchange long tracts; for example, gene conversion-like events could be very short (as shown in Arabidopsis, where the majority of gene conversions are supported by only single markers). If gene conversion-like mechanisms were acting here, short tracts might be the expected pattern. Therefore, I do not see why the absence of long tracts should be prominently illustrated.

https://doi.org/10.7554/eLife.47301.014

Author response

Reviewer #2:

The authors present a rather convincing case for a re-appraisal of the data and conclusions of Chen et al., 2018. However, there are two major issues that should be addressed before publication.

1) More information is needed in order for readers to be able to correctly interpret the data presented in Table 1 and to relate it to the parsimony methodology presented in Figure 1—figure supplement 1. Is the number of recombined positions as defined in Figure 1—figure supplement 1B? If so, what do the numbers before and after the slash in the table entry represent? The number of recombined positions out of the total number of positions? If so, then is the total number of positions counted as only those positions where all nuclei were able to be genotyped at that position? If that is the case, then why are these numbers different from the number of shared SNPs presented in Figure 4C? And why would the number before the slash be greater than the number after the slash (as it is for SL1 after filtering)? For "recombined blocks", does this refer to what is shown and described as "sites" in Figure 1—figure supplement 1A?

Apologies for the confusion, we forgot to add the associated text to the table legend. The number before the slash is for nuclei from one mating type, and after the slash is for nuclei of the opposite mating type, with the mating types corresponding to the mating types listed in the leftmost column. These numbers differ from Figure 4C as Table 1 only includes sites identified by the criteria outlined in Figure 1—figure supplement 1, while Figure 4C is for all SNPs, between all nuclei.

More information is needed to describe the criteria used for this analysis. I can imagine three possible haplotype situations that could be considered as recombinant sites according to what is presented in Figure 1—figure supplement 1A: (a) all contiguous SNPs match the haplotype assigned to the opposite mating type, (b) only one of the contiguous SNPs matches the other haplotype, and (c) multiple SNPs match the other haplotype.

It is unclear what contiguous SNPs refer to in this context. The SNPs identified by Chen et al. are rarely contiguous, as there are many invariant bases between SNPs. To clarify, no cases were identified by Chen et al. where all the SNPs on a single contig were of the opposite mating type.

To clarify further, our criteria were applied per SNP to identify either recombined sites in an individual nucleus, or positions where recombination is supposed to have occurred, without determining which nuclei were recombinant.

We understand that these criteria are not intuitive, but the authors of Chen et al. were either unwilling or unable to provide the criteria that they used for Figure 3 of their manuscript, and the original manuscript does not explain how colors in Figure 3 were assigned. Any analysis of the underlying data requires an objective set of criteria, which Chen and co-authors failed to provide.

These scenarios all have different likelihoods of representing true recombinant events. For example, scenario (a) only represents a true recombination event if the mating type is correctly assigned (as later discussed by the authors). For scenario (b), it is rather difficult to assign this to a recombination event, especially if it is only one SNP in the middle of a series of contiguous SNPs that would otherwise match the haplotype associated with its mating type or just a single SNP on that scaffold. Here, the most probable explanation might be a mutation. Scenario (c) has the strongest likelihood of representing a true recombination event, but only if there is a switch where there are contiguous SNPs from one haplotype to contiguous SNPs from another haplotype. Multiple haplotype switching would require multiple recombination events, which seems unlikely given that double crossovers are unlikely to occur within the short sequence lengths of the reported blocks in Supplementary file 2.

We agree with this logic, and we note that all of the claimed recombination events would involve double crossovers, as the haplotypes revert back to the original mating type in every case. Short stretches of this could indeed represent mutation, or alternatively gene conversion as emphasized by reviewer #3.
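The reversion argument above can be illustrated with a minimal sketch. It assumes each SNP along a scaffold has already been assigned a haplotype label for a given nucleus; the function names and the "A"/"B" labels are hypothetical and are not taken from our analysis scripts or from Chen et al. A block of the opposite haplotype that is flanked on both sides by the nucleus's own haplotype implies two switches, which is the double-crossover pattern (or, for short blocks, mutation or gene conversion).

```python
from itertools import groupby


def haplotype_blocks(labels):
    """Collapse per-SNP haplotype labels into (label, run_length) blocks."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(labels)]


def count_switches(labels):
    """Number of haplotype switches along the scaffold (blocks minus one)."""
    return max(len(haplotype_blocks(labels)) - 1, 0)


def interior_foreign_blocks(labels, own):
    """Return (block_index, n_snps) for blocks of a non-own haplotype that are
    flanked on both sides by the nucleus's own haplotype. Each such block
    requires two switches to explain."""
    blocks = haplotype_blocks(labels)
    found = []
    for i, (label, length) in enumerate(blocks):
        if label == own or i == 0 or i == len(blocks) - 1:
            continue
        if blocks[i - 1][0] == own and blocks[i + 1][0] == own:
            found.append((i, length))
    return found


if __name__ == "__main__":
    # Hypothetical nucleus of haplotype "A" with two embedded "B" blocks.
    snps = list("AABBBAABA")
    print(haplotype_blocks(snps))
    print(count_switches(snps))
    print(interior_foreign_blocks(snps, own="A"))
```

A block of length one in the output corresponds to a singleton, the case for which mutation is the most economical explanation.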

It is not clear to me which of these scenarios is included in the parsimony criterion for sites. For positions, the attribution of SNP differences between mating types to recombination over other processes is unclear. In Figure 1—figure supplement 1B, for positions 5 and 6, only one nucleus (D) is different from the others. This could arise through recombination or through mutation (especially if it is only one SNP among many on a scaffold, which is not diagrammed in the example shown in Figure 1—figure supplement 1B, but could have been scored in the data). If it is several contiguous positions, then recombination is more likely than several independent mutations that would give rise to the other haplotype. The authors should clarify this in the text and/or by revising Figure 1—figure supplement 1.

We agree with the reviewer and have tried to clarify our criteria linguistically. Additionally, we have modified Figure 1—figure supplement 1A to represent how we interpret singletons. We point to Figure 3 of Chen et al.’s original manuscript, where they use examples of singletons as well as contiguous regions.

2) My other major concern is the analysis presented in Figure 3 and associated commentary. The authors state "Another fact visible from Figure 3 is that for A4 and A5, recombined sites are overrepresented by sites with a read depth of 2 compared to random sites". I agree that it looks that way in the figure, but I would like to see some statistics to support this claim. The number of random sites is much, much greater than the number of recombined sites, further complicating a simple visual comparison. I suggest that the authors subsample the number of random sites to match the number of recombined sites and display this in the plot.

We tested the conclusion of Figure 2 using the Wilcoxon ranked sum test. The statistical results are consistent with our interpretation.

Results:

“Another fact visible from Figure 3 is that, for A4 and A5, recombined sites are overrepresented by sites with low depth compared to non-recombined sites (Wilcoxon ranked sum test: A4, p=3.2×10⁻⁹; A5, p=2.9×10⁻⁶). We note that for nuclei from isolate SL1, fewer overall recombined sites can be identified since the decreased breadth of coverage reduces overlap between nuclei, making it difficult to say whether this pattern of excess low-coverage sites is also present (p=0.11).”

Materials and methods:

“We compared the coverage representation between non-recombined and recombined sites. We subsampled from non-recombined sites to match the number of reads in recombined sites and performed Wilcoxon ranked sum test in R between the subsampled set of non-recombined reads and the recombined reads.”
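The subsample-and-test procedure quoted above can be sketched as follows. This is an illustration only: the analysis itself was run in R, whereas this sketch is a standard-library Python stand-in that uses the normal approximation with a tie correction and no continuity correction, so its p-values will not match R's output exactly. The function names, the seed, and the toy depth values are assumptions made for the example.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks (ties share their average rank) and tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_test(x, y):
    """Two-sided rank-sum p-value via the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, ties = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in ties) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u <= 0:  # every value identical: no evidence of a shift
        return 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def subsample_and_test(recombined_depths, non_recombined_depths, seed=0):
    """Subsample non-recombined sites to the number of recombined sites, then test."""
    rng = random.Random(seed)
    k = min(len(recombined_depths), len(non_recombined_depths))
    subsample = rng.sample(list(non_recombined_depths), k)
    return rank_sum_test(recombined_depths, subsample)


if __name__ == "__main__":
    # Toy read depths, not real data: low depth at "recombined" sites.
    recombined = [1, 2, 2, 1, 2, 1, 2, 2, 1, 1] * 2
    non_recombined = [8 + (i % 23) for i in range(200)]
    print(subsample_and_test(recombined, non_recombined))
```

Subsampling equalises the two group sizes so that the comparison is not dominated by the much larger pool of non-recombined sites; repeating the call with several seeds would show how sensitive the result is to the particular subsample drawn.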

Also, would "non-recombined" sites be a more appropriate term than random sites?

The original non-subsampled analysis did not exclude identified recombined sites, but we have excluded recombined sites from the subsampled analysis. We have changed the wording of Figure 3 accordingly.

The legend of this figure says that recombined sites are identified as recombined based on the parsimony criterion diagrammed in Figure 1—figure supplement 1. Is the read depth plotted for each individual SNP in a recombined site? Or is the average depth across all SNPs in a recombined site? The discrete values suggest that it is the former or that is actually referring to recombinant positions rather than sites.

Yes, it is the read depth per nucleus. We have clarified the legend of the figure.

Reviewer #3:

Auxier and Bazzicalupo express their concerns regarding an earlier study on inter-nucleus recombination in arbuscular mycorrhizal fungi, and tested their concerns regarding the data analysis by filtering the presented evidence for recombination using information on repetitive regions/heterozygous sites as well as short read alignment coverage information. In general, the search for rare events in the genome cannot be performed with relaxed filters, as a false-positive signal (even if it is low) will affect rare events much more than frequent events.

Figure 1 is absolutely convincing and clearly shows that the new filtering more strongly affects the sites that are annotated as recombined sites, as compared to the sites that are not recombined. This technical difference between recombined sites and non-recombined sites can only be true if these two sets of sites are technically different. There is no reason why recombination should lead to such a difference. This is true for the repeat analysis/heterozygous site analysis as well as for the low coverage analysis. In consequence, I do share the concerns regarding the finding of the original study.

In consequence, I fully agree with the last statement of the manuscript: even though the new filtering does not remove all signals that could be recombination-induced, the (very little) remaining signal for recombination "are not robust enough to support or reject" meiosis-like mechanisms.

Moreover, I would not even agree that the remaining evidence for recombination, the PCR-based assessment of the mating type of nucleus 07 (SL1), is strong evidence for recombination. It is a non-replicated experiment of a single event (as far as I understand) and thus does not meet the general criteria to support a conclusion of such large impact.

We agree, but in the absence of any direct evidence against this result, we feel the need to give the authors the benefit of the doubt.

My only concern regarding the comment:

Even if stated in the original manuscript, I would not agree that longer conversion tracts support recombination events more than single marker conversions. Recombination does not necessarily exchange long tracts, for example gene conversion-like events could be very short (as for example shown in Arabidopsis where the majority of gene conversions is only supported by single markers). If gene conversion like mechanisms would act here, short tracts might be the expected pattern. Therefore, I do not see why the absence of long tracts should be prominently illustrated.

We discuss the length of tracts because they are highlighted by Chen et al. in their Results section:

“In many cases, recombining genotypes encompass hundreds to thousands of base pairs, (Figure 3, Supplementary file 6). […] In this example, a single recombination event between genotypes harbored by the nuclei 22 (MAT-1) and 19 (MAT-2) resulted in a genetic exchange involving at least one thousand base pairs, and similar events are found elsewhere in the genome of A4.”

While we agree that gene conversion could result in the transfer of single markers, Chen et al. refer to “meiotic-like processes” in the Abstract of their publication. If a meiotic-like process is indeed occurring, then gene conversion should be paired with at least one crossover event per chromosome for proper segregation. It is possible that crossover events only occur at the distal ends of chromosomes, as in Agaricus bisporus, and that these distal ends are not included in the largest 100 scaffolds. But there is no evidence presented for this scenario, and we prefer not to speculate on potential mechanisms to explain Chen et al.’s low-quality data.

In consideration of the reviewer’s and editor’s comments, we have added a sentence to the conclusion acknowledging that gene conversion events would be consistent in size with the blocks found by Chen et al. However, gene conversion events are extremely difficult to identify confidently, as shown by Qi et al., whom we have added as a reference, as well as by Wijnker et al.

“While small blocks of genetic exchange may be compatible with gene conversion, the limited number of markers involved greatly increases the difficulty in identifying high-confidence gene conversion events (Wijnker et al., 2013; Qi et al., 2014).”

https://doi.org/10.7554/eLife.47301.015

Article and author information

Author details

  1. Benjamin Auxier

    Laboratory of Genetics, Wageningen University, Wageningen, Netherlands
    Contribution
    Conceptualization, Data curation, Formal analysis, Visualization, Writing—original draft, Writing—review and editing
    For correspondence
    ben.auxier@wur.nl
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-7743-0610
  2. Anna Bazzicalupo

    Department of Microbiology and Immunology, Montana State University, Bozeman, United States
    Contribution
    Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-5845-9517

Funding

The authors declare that there was no funding for this work.

Acknowledgements

We thank Dr. Duur Aanen, Dr. Anna Rosling, Dr. Marisol Sanchez-Garcia, Dr. Erik Wijnker, and Mathijs Nieuwenhuis for critical feedback. We also thank the three anonymous reviewers.

Senior Editor

  1. Patricia J Wittkopp, University of Michigan, United States

Reviewing Editor

  1. Raphael Mercier, INRA, France

Publication history

  1. Received: April 9, 2019
  2. Accepted: October 9, 2019
  3. Version of Record published: October 25, 2019 (version 1)

Copyright

© 2019, Auxier and Bazzicalupo

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
