Sheet 1: Endogenous mavirus-like element (EMALE) statistics. This dataset contains information on each of the 138 EMALEs in four Cafeteria burkhardae strains, including their exact location in the host assembly, length, presence of terminal inverted repeats, type score, and Ngaro insertions.
Sheet 2: EMALE integration sites. This dataset lists, for each of the 33 fully assembled EMALEs, the orthologous integration loci in all four host strains, the target site duplications, and the host genomic context of the integration loci.
Sheet 3: Ngaro statistics. This dataset contains information for 80 Ngaro retrotransposons identified in the four C. burkhardae assemblies, including their exact location in the host assembly, length, type, and insertion locus (EMALE or eukaryotic chromatin).
Sheet 4: Primer sequences. This dataset lists the oligonucleotides used as PCR and sequencing primers for the validation of EMALE01 RCC970_016B. Numbers in the last column refer to the sequenced PCR products in the assembly diagram shown in Figure 3—figure supplement 6, numbered from 1 in the top left corner to 66 in the bottom right corner.