Regulatory genome annotation of 33 insect species

  1. Hasiba Asma
  2. Ellen Tieke
  3. Kevin D Deem
  4. Jabale Rahmat
  5. Tiffany Dong
  6. Xinbo Huang
  7. Yoshinori Tomoyasu
  8. Marc S Halfon (corresponding author)
  1. Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, United States
  2. Department of Biology, Miami University, United States
  3. Department of Biochemistry, University at Buffalo-State University of New York, United States
  4. Department of Biomedical Informatics, University at Buffalo-State University of New York, United States
  5. Department of Biological Sciences, University at Buffalo-State University of New York, United States
8 figures, 5 tables and 1 additional file

Figures

Figure 1 with 1 supplement
The SCRMshaw method and analysis pipeline.

(A) Supervised motif-blind cis-regulatory module (CRM) discovery (SCRMshaw). (a) SCRMshaw uses a training set of known D. melanogaster enhancers (‘training sequences’), drawn from REDfly, that are defined by common functional characterization, and a 10-fold larger background set of similarly sized non-enhancer sequences (‘background sequences’). (b) The short DNA subsequence (kmer) count distributions of these sequences are then used to train a statistical model. Note that although the pictured example shows 5-mers, kmers of different sizes are used for some of the underlying statistical models (see Methods). (c) The trained model is used to score overlapping windows in the ‘target genome’. (d) High-scoring regions are predicted to be functional regulatory sequences (asterisks). Figure adapted from Asma and Halfon, 2021. (B) The workflow used for the regulatory genome annotation described in this paper. The left side shows pre-processing steps; the right side shows post-processing. Input to SCRMshaw consists of the genome sequence and gene annotation. A protein sequence annotation is supplied later for the orthology mapping step. Final results are made available as part of the REDfly regulatory annotation knowledgebase.
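
The train-then-scan idea in panel A can be illustrated with a toy sketch. This is not the SCRMshaw implementation: the log-ratio score, the add-one smoothing, the function names, and the window and step sizes are all assumptions chosen for illustration, and SCRMshaw's actual statistical models are described in Methods.

```python
from collections import Counter
from math import log


def kmer_counts(seqs, k):
    """Count overlapping k-mers across a list of DNA strings."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def log_ratio_model(training, background, k, pseudo=1.0):
    """Return a function giving a smoothed log ratio (training vs background) per k-mer.

    Hypothetical scoring scheme, used here only to make the idea concrete.
    """
    t = kmer_counts(training, k)
    b = kmer_counts(background, k)
    vocab = 4 ** k
    t_total = sum(t.values()) + pseudo * vocab
    b_total = sum(b.values()) + pseudo * vocab

    def weight(kmer):
        return log((t[kmer] + pseudo) / t_total) - log((b[kmer] + pseudo) / b_total)

    return weight


def score_windows(genome, weight, k, window, step):
    """Score overlapping windows of the target sequence; returns (start, end, score)."""
    scored = []
    for start in range(0, len(genome) - window + 1, step):
        w = genome[start:start + window].upper()
        score = sum(weight(w[i:i + k]) for i in range(len(w) - k + 1))
        scored.append((start, start + window, score))
    return scored


# Toy data (invented): one "enhancer-like" training sequence, two background sequences.
training = ["ACGTACGTACGT"]
background = ["TTTTTTTTTTTT", "GGGGGGGGGGGG"]
weight = log_ratio_model(training, background, k=3)
windows = score_windows("TTTTTTACGTACGTTTTTTT", weight, k=3, window=8, step=4)
best = max(windows, key=lambda w: w[2])
print(best)
```

In this toy case the highest-scoring window is the one covering the ACGT repeat, which mirrors how high-scoring regions are flagged as candidate regulatory sequences in panel (d).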

Figure 1—figure supplement 1
Revised post-processing method used for SCRMshaw.

See Methods for details.

Figure 2 with 1 supplement
Annotation of 33 insect genomes.

(A) Genomes from five insect orders were annotated in this study (more are ongoing). (B) Percentage of genes with Drosophila orthologs as mapped via our orthology pipeline (see Methods), for the 15 mapped species. ‘No Mapped Fly Orthologs’ indicates that our orthology mapping pipeline did not identify clear D. melanogaster orthologs. For any given gene, this could reflect either a true lack of a respective ortholog, or failure of our procedure to accurately identify an existing ortholog. For complete species names, see Table 1. (C) Total SCRMshaw predictions for each species. For each species, the left-hand column shows cumulative results for each SCRMshaw sub-method summed over each of the 48 training sets. The right-hand column shows the number of unique predictions after merging overlapping predictions from both sub-methods and training sets. Species are displayed alphabetically by taxonomic order (see also Figure 2—figure supplement 1). (D) Size distribution of SCRMshaw predictions, prior to merging overlapping predictions but after removing outlier predictions >2 kb in length. Species are ordered identically to panel C.
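
The length filter and merging of overlapping predictions described for panels C and D could be sketched as follows. This is a minimal sketch under assumptions: predictions are represented as (chromosome, start, end) tuples, the length filter is applied before merging, and intervals that merely touch are merged. None of these details is confirmed by the legend; see Methods for the actual post-processing.

```python
def merge_predictions(preds, max_len=2000):
    """Drop predictions longer than max_len bp, then merge overlaps per chromosome.

    preds: iterable of (chrom, start, end) tuples (a hypothetical representation).
    """
    kept = sorted(p for p in preds if p[2] - p[1] <= max_len)
    merged = []
    for chrom, start, end in kept:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


# Invented example: two overlapping predictions, one >2 kb outlier, one on another arm.
example = [("2L", 100, 600), ("2L", 500, 1000), ("2L", 2000, 5000), ("3L", 100, 600)]
unique = merge_predictions(example)
print(unique)
```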

Figure 2—source data 1

SCRMshaw training sets used in this study.

https://cdn.elifesciences.org/articles/96738/elife-96738-fig2-data1-v1.xlsx
Figure 2—source data 2

Number of predicted enhancers for each species, by method.

https://cdn.elifesciences.org/articles/96738/elife-96738-fig2-data2-v1.xlsx
Figure 2—figure supplement 1
Lengths of predicted enhancers, including long outliers.

Size distribution of SCRMshaw predictions, prior to merging overlapping predictions but without removing outlier predictions. Species are ordered as in Figure 2.

Figure 3
SCRMshaw makes multiple predictions per locus.

The number of SCRMshaw predictions per locus (y-axis) is shown as boxplots for loci falling within the given size ranges (x-axis). Black boxes cover the 25–75th percentiles, bars indicate median values, and dots indicate values exceeding 1.5 times the interquartile range (boxes are not visible for all bins due to very low degrees of variation). Values in pink represent expected values drawn from randomization, while values in blue represent observed values from SCRMshaw. All values are from results with the training set ‘mapping1.visceral_mesoderm’; results from other training sets were similar (see Figure 3—source data 1). Shown are results from the genomes of (A) D. melanogaster, (B) C. pipiens, and (C) A. aegypti, representing small, medium, and large genomes, respectively.
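
The outlier convention used for the dots can be sketched with the standard Tukey rule. This is a sketch only: the quartile method is an assumption (Python's default 'exclusive' method is used here, and plotting packages may compute quartiles slightly differently), and the counts are invented.

```python
from statistics import quantiles


def tukey_outliers(values, whisker=1.5):
    """Return values lying more than whisker * IQR beyond the quartiles."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Invented predictions-per-locus counts for one size bin.
counts = [1, 2, 3, 4, 5, 6, 7, 8, 100]
outliers = tukey_outliers(counts)
print(outliers)
```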

Figure 4
SCRMshaw predicts cis-regulatory modules (CRMs) in orthologous loci across species.

(A) The number of loci in common that contain at least one SCRMshaw prediction, for 10 or more species. (B) z-scores demonstrating that the number of loci in common with one or more SCRMshaw predictions is significantly higher than expectation, based on 360 randomizations. The small number of common predictions for 14–16 species makes these statistics unreliable. Dotted lines indicate z-score values representing significance at the (unadjusted) p<0.005 and p<0.05 levels. (C) Fold enrichment values illustrating the excess of loci in common with one or more SCRMshaw predictions compared to expectation. Dotted line shows 1.5× enrichment.
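
The z-score and fold enrichment in panels B and C can be computed from an observed count and a set of randomized counts, roughly as sketched below. The function name, the use of the sample standard deviation, and the numbers are assumptions for illustration; the real and simulated counts are in Figure 4—source data 1.

```python
from statistics import mean, stdev


def enrichment(observed, simulated):
    """Return (z-score, fold enrichment) of an observed count versus randomized counts."""
    mu = mean(simulated)
    sd = stdev(simulated)  # sample standard deviation (an assumption)
    if sd == 0 or mu == 0:
        raise ValueError("randomized counts have zero spread or zero mean")
    return (observed - mu) / sd, observed / mu


# Invented counts: 30 shared loci observed versus four randomizations.
z, fe = enrichment(30, [10, 12, 8, 10])
print(round(z, 2), fe)
```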

Figure 4—source data 1

Real and simulated counts of predictions in orthologous loci.

https://cdn.elifesciences.org/articles/96738/elife-96738-fig4-data1-v1.xlsx
Figure 5
Previously described gene expression and enhancer activity for select D. melanogaster sequences predicted by SCRMshaw.

The left-hand column shows native D. melanogaster gene expression in imaginal discs (green), while the right-hand column shows described enhancer activity (magenta). Gray shading indicates that expression has not been described. Moving clockwise from the left side of each panel are the wing, haltere, leg, and eye-antennal discs. The enhancers whose activities are described in the table are: (B) ex_BCDE (Wang and Baker, 2018), (F) hth_GMR46D04 (Jory et al., 2012), (H) Ubx_GMR39A02 (Jory et al., 2012), (J) psq_GMR41E12 (Jory et al., 2012).

Figure 6 with 2 supplements
Reporter gene expression for tested ex, klu, and ush predicted enhancer sequences.

Each row shows expression for the indicated construct in (i) wing discs, (ii) haltere discs, (iii) T1 (prothoracic) leg discs, (iv) T2 (mesothoracic) leg discs, (v) T3 (metathoracic) leg discs, and (vi) eye-antennal discs (with eye portion to the left). Positive results were obtained by the enhancers associated with the ex locus of D. melanogaster (wing, legs, antenna, A) and A. mellifera (wing, haltere, legs, C); the klu locus of D. melanogaster (wing, haltere, legs, D) and A. mellifera (legs, arrows in F); the ush locus of T. castaneum (wing, arrow, H (i); eye, H (vi)) and A. mellifera (legs, arrows I (iii, iv, v); eye, I (vi)). Enhancer activities were visualized by UAS-tdTomato that was included in the reporter construct. Scale bar is 50 µm for each column.

Figure 6—figure supplement 1
Open chromatin data for predictions in the ex, klu, and ush loci.

SCRMshaw predictions (red bars) are shown for D. melanogaster (A, D, G), T. castaneum (B, E, H), and A. mellifera (C, F, I) at the ex (A–C), klu (D–F), and ush (G–I) loci. For D. melanogaster and T. castaneum, open chromatin profiles are also indicated, for a variety of tissues and timepoints (see text). Vertical gray bars highlight regions chosen for in vivo validation.

Figure 6—figure supplement 2
Additional expression observed in selected transgenic reporter lines.

(A–E) Control lines using a regulatory-inactive mock enhancer sequence demonstrate that there is no default expression in the imaginal discs. (A) Wing disc, (B) haltere disc, (C, D) leg discs, (E) eye-antennal disc. (F–I) Pupal expression. Arrows indicate expression in the legs (F–H) or notum (I). For T. castaneum enhancer Tc_ex_9p0, expression was observed in pupal legs but not larval leg imaginal discs (G, compare with Figure 6B). (J) Expression can be observed in migrating adepithelial cells of the wing disc in Tc_Ubx_19p9. (K) Expression can be seen in the peripodial membrane surrounding the eye-antennal disc in several lines, including Tc_ush_6p8 (see also Figures 6 and 7).

Figure 7 with 2 supplements
Reporter gene expression for tested hth, Ubx, and psq predicted enhancer sequences.

Each row shows expression for the indicated construct in (i) wing discs, (ii) haltere discs, (iii) T1 (prothoracic) leg discs, (iv) T2 (mesothoracic) leg discs, (v) T3 (metathoracic) leg discs, and (vi) eye-antennal discs (with eye portion to the left). Positive results were obtained by the enhancers associated with the hth locus of T. castaneum (the notum portion of the wing disc, legs, B); the Ubx locus of A. aegypti (legs, C), T. castaneum (legs, D) (myoblast cells in the wing disc, arrows, E (i); legs, eye, E (iii–vi)), and A. mellifera (wing, haltere, legs, F) (wing, haltere, G); the psq locus of T. castaneum (legs, eye, I). Enhancer activities were detected by the G-TRACE system; magenta represents direct enhancer activity detected by dsRed expression, while green indicates lineage-based GFP expression. Scale bar is 50 µm for each column.

Figure 7—figure supplement 1
Open chromatin data for predictions in the hth, Ubx, and psq loci.

SCRMshaw predictions (red bars) are shown for D. melanogaster (A–C), A. aegypti (D–F), T. castaneum (G–I), and A. mellifera (J, K) at the hth (A, D, G), Ubx (B, E, H, J), and psq (C, F, I, K) loci. For D. melanogaster and T. castaneum, open chromatin profiles are also indicated, for a variety of tissues and timepoints (see text). Blue bars indicate the positions of known enhancers. Vertical gray bars highlight regions chosen for in vivo validation.

Figure 7—figure supplement 2
Reporter gene expression observed in embryos.

Panels (A–S) are putative enhancers cloned into piggyPhiGUGd and crossed to G-TRACE. Panels (T–Y) have the putative enhancer inserted into piggyPhiGUGd-TomatoI. All embryos are stained using anti-dsRed antibodies and the ABC-HRP kit and shown with anterior to the left. (A–D) A non-enhancer control sequence was inserted into piggyPhiGUGd to serve as a control for vector-related expression. Limited segmentally repeated expression is observed starting around stage 11 (A) and can be observed in the epidermis and/or peripheral nervous system at stage 14 (B). (C) Expression is observed in the caudal (longitudinal) visceral mesoderm (arrow, ‘cvm’). (D) In older embryos, strong expression is observed in the proventriculus (‘pv’) as well as in migrating hemocytes (individual cells observed throughout the embryo). This ‘default’ expression pattern was also observed in enhancer lines Aa_hth_35p9 (E), Aa_psq_21p5 (F), Tc_Ubx_17p4 (G), and Tc_Ubx_19p9 (H). Note that not all stages are shown for each of these genotypes, as expression was highly similar for each line at all stages. While this expression was seen in the remaining reporter lines too, additional activity, which may therefore represent specific enhancer activity, was also observed. (I) Tc_psq_19p7 had prominent hindgut (‘hg’) expression. (J) Am_psq_29p2 had strong activity in the ventral nerve cord (‘vnc’). (K) Tc_hth_15p5 displayed epidermal expression following germ band retraction; note that the caudal visceral mesoderm expression can also be observed (L). (M–O) Aa_Ubx_26p0 had reporter gene expression throughout the mesoderm (‘me’) starting around stage 10 (M) and persisting throughout embryogenesis. ‘cb’, cardioblasts. (P) Am_Ubx_0p39, in addition to the default visceral muscle expression, was active in the ventral nerve cord (‘vnc’) and salivary glands (‘sg’). (Q–S) Am_Ubx_37p2 showed expression in the epidermis from stage 11 on, and also in the salivary glands.
(T–Y) Although we do not have a no-enhancer control line for piggyPhiGUGd-dTomatoI, all lines generated using this vector had reporter gene activity in the caudal visceral mesoderm (‘cvm’), the salivary glands (‘sg’), and in segmentally repeated epidermal cells (arrowheads). We therefore view this expression as possible vector-specific expression (not all lines shown). However, two lines had additional, unique activity: Am_ex_20p3 (X) was active in the ventral nerve cord, and Am_klu_20p2 (Y) was active in narrow stripes in the epidermis (arrowheads). These activities may therefore be enhancer-specific.

Figure 8
Summary of in vivo validation results.

Results are shown for D. melanogaster, T. castaneum, A. mellifera, and A. aegypti. Black, positive for expression; gray, negative for expression; blue, expression observed in pupae but not in larvae.

Image credits: The insect silhouettes were obtained from The Noun Project (https://thenounproject.com/), artist Georgiana Ionescu, under a Creative Commons CC-BY 3.0 license.

Tables

Table 1
Species used in this study.
| Scientific name | Common name | Order | Assembly version/annotation version | Link/URL for assembly information |
| Acyrthosiphon pisum | Pea aphid | Hemiptera | racon and v3_wdel | Courtesy Jennifer Brisson (University of Rochester) |
| Aedes aegypti | Yellow fever mosquito | Diptera | L5.2 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7159/ |
| Agrilus planipennis | Emerald ash borer | Coleoptera | Apla_2.0 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/224129/ |
| Anopheles gambiae | African malaria mosquito | Diptera | P4.9 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7165/ |
| Aphis gossypii | Cotton aphid/melon aphid | Hemiptera | ASM401081v2 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/80765/ |
| Apis mellifera | Honey bee | Hymenoptera | HAv3.1 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7460/ |
| Atta cephalotes | Leafcutter ant | Hymenoptera | A.ceph_1.0 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/12957/ |
| Atta colombica | Leafcutter ant | Hymenoptera | Acol1.0 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/520822/ |
| Bombyx mori | Silkworm | Lepidoptera | ASM15162v1 | http://ensembl.lepbase.org/Bombyx_mori_asm15162v1/Info/Index |
| Cimex lectularius | Bed bug | Hemiptera | Clec_2.1 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/79782/ |
| Culex pipiens pallens | Northern house mosquito | Diptera | TS_Cpip_V1 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/42434/ |
| Culex quinquefasciatus | Southern house mosquito | Diptera | VIPSU_Cqui_1.0_pri_paternal | https://doi.org/10.1093/gbe/evab005 |
| Danaus plexippus | Monarch butterfly | Lepidoptera | v3 | http://ensembl.lepbase.org/Danaus_plexippus_v3/Info/Index |
| Drosophila ananassae | Fruit fly | Diptera | caf1 | http://ftp.flybase.net/genomes/Drosophila_ananassae/ |
| Drosophila melanogaster | Fruit fly | Diptera | r6 1.8 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7227/ |
| Drosophila mojavensis | Fruit fly | Diptera | caf1 | https://www.ncbi.nlm.nih.gov/assembly/GCF_000005175.2/ |
| Drosophila pseudoobscura | Fruit fly | Diptera | r3 | http://ftp.flybase.net/genomes/Drosophila_pseudoobscura/dpse_r3.03_FB2015_01/ |
| Formica exsecta | Wood ant | Hymenoptera | ASM365146v1 | https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_003651465.1/ |
| Heliconius erato | Red postman butterfly | Lepidoptera | v1 | http://ensembl.lepbase.org/Heliconius_erato_lativitta_v1/Info/Index |
| Heliconius melpomene | Postman butterfly | Lepidoptera | Hmel2 | http://ensembl.lepbase.org/Heliconius_melpomene_melpomene_hmel2/Info/Index |
| Heliconius himera | False postman butterfly | Lepidoptera | Hed.V1 | Courtesy Robert Reed (Cornell University) |
| Junonia coenia | Common buckeye butterfly | Lepidoptera | JC v1.0 | Courtesy Robert Reed (Cornell University) |
| Leptinotarsa decemlineata | Colorado potato beetle | Coleoptera | Ldec_2.0 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7539/ |
| Manduca sexta | Tobacco hornworm | Lepidoptera | v1.0 | http://ensembl.lepbase.org/Manduca_sexta_msex1/Info/Index |
| Nezara viridula | Southern green stink bug | Hemiptera | | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/85310/ |
| Nasonia vitripennis | Jewel wasp | Hymenoptera | Psr_1.1 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7425/ |
| Onthophagus taurus | Dung beetle | Coleoptera | Otau_2.0 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/166361/ |
| Pseudoatta argentina | Leafcutter ant | Hymenoptera | ASM1760752v1 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/621737/ |
| Stomoxys calcitrans | Stable fly | Diptera | Stomoxys_calcitrans-1.0.1 | https://doi.org/10.1186/s12915-021-00975-9 |
| Tribolium castaneum | Red flour beetle | Coleoptera | r5.2 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7070/ |
| Trichoplusia ni | Cabbage looper | Lepidoptera | tn1 | https://www.ncbi.nlm.nih.gov/datasets/taxonomy/7111/ |
| Vanessa cardui | Painted lady butterfly | Lepidoptera | Vcar_v1 | Courtesy Robert Reed (Cornell University) |
| Nebria riversi | Ground beetle | Coleoptera | v1 | https://doi.org/10.1111/1755-0998.13409 |
Table 2
Overlap of SCRMshaw predictions with FAIRE-seq and ATAC-seq peaks.
| Species | Training set | Profiled tissue | Overlap (real) | Overlap (random) | s.d.* | z-score | FE |
| D. melanogaster | haltere_disc | Wing, leg, and haltere third instar discs and pharate appendages; eye-antennal disc; third instar CNS | 41.82% | 14.02% | 10.65 | 21.53 | 2.98 |
| D. melanogaster | blastoderm.mapping1 | Blastoderm | 40.99% | 13.86% | 12.35 | 22.56 | 2.96 |
| D. melanogaster | mapping2.wing | Wing, leg, and haltere third instar discs and pharate appendages; eye-antennal disc; third instar CNS | 35.14% | 10.69% | 10.19 | 23.01 | 3.29 |
| T. castaneum | mapping2.wing | Embryo, larval thoracic epidermis, larval brain | 85.58% | 26.18% | 10.98 | 28.88 | 3.27 |
| T. castaneum | mapping2.ectoderm | Embryo, larval thoracic epidermis, larval brain | 81.68% | 25.48% | 12.48 | 26.29 | 3.21 |
| T. castaneum | mapping1.ventral_ectoderm | Embryo, larval thoracic epidermis, larval brain | 81.17% | 24.76% | 12.56 | 33.87 | 3.28 |
| T. castaneum | mapping1.dorsal_ectoderm | Embryo, larval thoracic epidermis, larval brain | 71.15% | 24.69% | 12.64 | 24.07 | 2.88 |
| T. castaneum | mapping1.ectoderm | Embryo, larval thoracic epidermis, larval brain | 69.53% | 24.66% | 11.92 | 22.35 | 2.82 |
| A. gambiae | embryonic_midgut | Adult midgut, salivary_gland | 40.20% | 11.66% | 7.90 | 21.56 | 3.45 |
| A. gambiae | mapping1.salivary_gland | Adult midgut, salivary_gland | 36.76% | 11.43% | 8.52 | 20.55 | 3.22 |
| D. plexippus | mapping2.wing | Larval forewing, hindwing, head | 60.43% | 30.35% | 6.30 | 8.93 | 1.99 |
| H. himera | mapping2.wing | Larval forewing, hindwing, head | 68.34% | 41.10% | 8.97 | 10.27 | 1.66 |
| J. coenia | mapping2.wing | Larval forewing, hindwing, head | 3.06% | 34.91% | 8.52 | –12.22 | 0.09 |
| V. cardui | mapping2.wing | Larval forewing, hindwing, head | 15.46% | 36.01% | 8.01 | –7.13 | 0.43 |

* s.d., standard deviation.
FE, fold enrichment.
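
The overlap percentages in Table 2 measure how many predictions coincide with open chromatin peaks. A minimal sketch of such a calculation follows, assuming half-open (chromosome, start, end) intervals and a criterion of at least 1 bp of overlap. Both are assumptions for illustration, and the intervals are invented. The tabulated FE values are consistent with the ratio of real to random overlap (for example, 41.82 / 14.02 rounds to 2.98).

```python
def percent_overlapping(predictions, peaks):
    """Percent of predictions that share at least 1 bp with any peak.

    Both arguments are lists of (chrom, start, end) half-open intervals.
    """
    peaks_by_chrom = {}
    for chrom, start, end in peaks:
        peaks_by_chrom.setdefault(chrom, []).append((start, end))
    hits = 0
    for chrom, start, end in predictions:
        if any(start < p_end and p_start < end
               for p_start, p_end in peaks_by_chrom.get(chrom, [])):
            hits += 1
    return 100.0 * hits / len(predictions) if predictions else 0.0


# Invented intervals: two of four predictions overlap a peak.
predictions = [("c1", 0, 100), ("c1", 200, 300), ("c2", 0, 50), ("c1", 400, 500)]
peaks = [("c1", 50, 60), ("c1", 290, 310)]
real = percent_overlapping(predictions, peaks)

# Fold enrichment for the first Table 2 row, taken as real / random overlap.
fe_first_row = round(41.82 / 14.02, 2)
print(real, fe_first_row)
```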

Table 3
Gene loci chosen for in vivo validation.
| Name | Symbol | D. melanogaster (FlyBase ID) | T. castaneum (RefSeq) | T. castaneum (iBeetleBase) | A. mellifera (RefSeq) | A. aegypti (VectorBase) |
| expanded | ex | FBgn0004583 | gene-LOC657053 | TC012545 | gene-LOC551519 | AAEL001437 |
| u-shaped | ush | FBgn0003963 | gene-LOC659918 | TC013689 | gene-LOC100577801 | AAEL020615 |
| klumpfuss | klu | FBgn0013469 | gene-LOC103312803 | TC002783 | gene-LOC100577692 | AAEL013544 |
| homothorax | hth | FBgn0001235 | gene-Hth | TC008629 | gene-LOC552079 | AAEL011643 |
| pipsqueak | psq | FBgn0263102 | gene-LOC660343 | TC003349 | gene-psq | AAEL021255 |
| Ultrabithorax | Ubx | FBgn0003944 | gene-Ubx | TC000903 | gene-ubx | AAEL014032 |
Table 4
SCRMshaw predictions chosen for in vivo validation.
| Name | Coordinates | Max. score | Training set(s) | Method |
Set 1
T. castaneum
| Tc_ex_9p0 | NC_007424.3:11221530..11222170 | 9.04 | mapping2.wing | imm |
| Tc_ush_6p8 | NC_007420.3:5968840..5969370 | 6.84 | haltere_disc | imm |
| Tc_klu_8p6 | NC_007418.3:10416700..10417300 | 8.59 | mapping2.wing | imm |
D. melanogaster
| Dm_ex_20p0 | 2L:442110..442810 | 20.04 | mapping2.wing | imm |
| Dm_klu_16p1 | 3L:10991040..10991700 | 16.06 | mapping2.wing | imm |
| Dm_ush_16p4 | 2L:531250..532060 | 16.40 | mapping2.wing | imm |
A. mellifera
| Am_ex_20p3 | NC_007075.3:7545750..7546440 | 20.36 | mapping2.wing | imm |
| Am_klu_20p2 | NC_007070.3:720220..720870 | 20.25 | mapping2.wing | imm |
| Am_ush_20p8 | NC_007080.3:10609550..10610200 | 20.80 | haltere_disc | imm |
Set 2
T. castaneum
| Tc_hth_15p5 | NC_007422.5:13408990..13409850 | 15.52 | mapping2.wing | hexmcd,imm,pac |
| Tc_psq_19p7 | NC_007418.3:1805000..1806250 | 19.71 | haltere_disc, mapping2.wing | hexmcd,imm,pac |
| Tc_Ubx_19p9 | NC_007417.3:8137250..8138000 | 19.32 | haltere_disc | hexmcd |
| Tc_Ubx_17p4 | NC_007417.3:8153250..8154030 | 19.95 | mapping2.wing | hexmcd,imm |
A. aegypti
| Aa_hth_35p9 | 1:149733960..149734710 | 35.96 | mapping2.wing | imm |
| Aa_psq_21p5 | 2:228190750..228191640 | 21.50 | disc.mapping2 | hexmcd,imm |
| Aa_Ubx_26p0 | 1:309747490..309748390 | 26.08 | mapping2.wing | imm |
A. mellifera
| Am_psq_29p2 | NC_007078.3:10712000..10712750 | 29.26 | mapping2.wing | hexmcd |
| Am_Ubx_37p2 | NC_007085.3:2921250..2922250 | 37.18 | haltere_disc, mapping2.wing | hexmcd,imm |
| Am_Ubx_0p39 | NC_007085.3:2967750..2968250 | 0.39 | mapping2.wing | pac |
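
The coordinate strings in Table 4 follow a `sequence:start..end` pattern, which can be parsed as sketched below. The function names are hypothetical, and the table does not state whether coordinates are 0- or 1-based or whether the end is inclusive, so the span is reported simply as end minus start.

```python
import re

# Pattern for strings such as "NC_007424.3:11221530..11222170" or "2L:442110..442810".
COORD = re.compile(r"^(?P<seqid>[^:]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_coords(text):
    """Split a coordinate string into (seqid, start, end)."""
    match = COORD.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate string: {text!r}")
    return match["seqid"], int(match["start"]), int(match["end"])


def span(text):
    """Return end minus start for a coordinate string."""
    _seqid, start, end = parse_coords(text)
    return end - start


tc_ex = parse_coords("NC_007424.3:11221530..11222170")
dm_ex_span = span("2L:442110..442810")
print(tc_ex, dm_ex_span)
```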
Appendix 1—key resources table
| Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information |
| Gene (Drosophila melanogaster) | expanded (ex) | FlyBase | FBgn0004583 | |
| Gene (D. melanogaster) | u-shaped (ush) | FlyBase | FBgn0003963 | |
| Gene (D. melanogaster) | klumpfuss (klu) | FlyBase | FBgn0013469 | |
| Gene (D. melanogaster) | homothorax (hth) | FlyBase | FBgn0001235 | |
| Gene (D. melanogaster) | pipsqueak (psq) | FlyBase | FBgn0263102 | |
| Gene (D. melanogaster) | Ultrabithorax (Ubx) | FlyBase | FBgn0003944 | |
| Gene (Tribolium castaneum) | gene-LOC657053 | iBeetleBase | TC012545 | |
| Gene (T. castaneum) | gene-LOC659918 | iBeetleBase | TC013689 | |
| Gene (T. castaneum) | gene-LOC103312803 | iBeetleBase | TC002783 | |
| Gene (T. castaneum) | gene-Hth | iBeetleBase | TC008629 | |
| Gene (T. castaneum) | gene-LOC660343 | iBeetleBase | TC003349 | |
| Gene (T. castaneum) | gene-Ubx | iBeetleBase | TC000903 | |
| Gene (Apis mellifera) | gene-LOC551519 | RefSeq | | |
| Gene (A. mellifera) | gene-LOC100577801 | RefSeq | | |
| Gene (A. mellifera) | gene-LOC100577692 | RefSeq | | |
| Gene (A. mellifera) | gene-LOC552079 | RefSeq | | |
| Gene (A. mellifera) | gene-psq | RefSeq | | |
| Gene (A. mellifera) | gene-ubx | RefSeq | | |
| Gene (Aedes aegypti) | AAEL001437 | VectorBase | | |
| Gene (A. aegypti) | AAEL020615 | VectorBase | | |
| Gene (A. aegypti) | AAEL013544 | VectorBase | | |
| Gene (A. aegypti) | AAEL011643 | VectorBase | | |
| Gene (A. aegypti) | AAEL021255 | VectorBase | | |
| Gene (A. aegypti) | AAEL014032 | VectorBase | | |
| Antibody | anti-dsRed (Rabbit polyclonal) | Clontech | Cat#632496 | (1:500) |
| Recombinant DNA reagent | piggyPhiGUGd (plasmid) | Deem et al., 2024, PMID:38698030 | | |
| Recombinant DNA reagent | piggyPhiGUGd-TomatoI (plasmid) | Deem et al., 2024, PMID:38698030 | | |
Sequence-based reagentDm_ex_20p0This paperTTCCCAGAACAAACTTGTGGGGGGTGATTAGGTTTGGCAACAAAATATTTTGCTAGTATTCCCTAATCATTTTTTTGAGTGAACCAAACTCGAAGAGCTCTACTCCCCTGGCCATCCACTTGTTGCCACTTCCATTCCAGCTTTGCGTCGACGACGTCGTCATTGATAGGCACTTATTCGGCCGCTGATGATTATTATGATATTGTAGCTGCTGCTGCTGTGTTGTGGATTCGATGCTGAGGTGCCTCTATTCCATGGCCTCCTTCAACCTGCCTGCCTGCTTTTTTCATAATTATTATTTTTCATCTTGCTGCTCTTCATTTTGTATGCAGGAATTCCAATTTTTCGTTCGATGAAGTGTGTGTTGATTTCGCTGTTGTTTTTTCTCTGCCTTCTCGAGCACCGCCGACATGCCCTTGGGCCCTTCTGCTTGGCTCGGGTCGGAGCTATGTAGCGCGGTCCGGTACCGGTCTCGTCTTCGAGCATCAGGCAATGGGCCTCTGACAACCTGACGTGTCGTCATCATCATCGTCTTCATCTGCTGGAGTCTCTGACTCTTGTTGATGTCAATGGGTTGCTTGTTTATTGCCTGACAAACTGACAGAAGTCTGGTCGGGGTCTCGATCCGATTTGAGCCCGATTTGGGACGCAAGAGGAGCGCTCCCTCTTGCATAGCCGAAAGTTCATTTAAAATTTTGAT
Sequence-based reagentDm_klu_16p1This paperGACCAGGCTGTTGCAGTTTCGTGTTGAAACCAGTTTGAATATATTTATTTTTATTTCCTGCGTCCCCTTCCCAATTTCTGTGGCCCTTTTAGGCGCCTCAGTTAGTCGGCAACGATAAGGCGGCAATGGTTTAATTTAGCTGCACCAGCGGCAGCAGCAGATGACGACAGGATCGTTGGGCCGGTCTACGTGCAACAGAAGTTGCTGCGCCGGCAGAAGCAACGGCAGCGGCAGCAACAGCAACAGCAAAACAAATGTGTCTGTATATCGCAGCTAAATTGACTTTGATCACGCGATCCCGAATCCCCCCCCCCATTTGGTCCGAGTTATATGGCCGATTCCAGGTTGCAGGCTCCAGGTTCTCGGGCGCGGCCTTTTGTGGCACAAACGGAAGTATGCTAAGCAACTTGTTGCTGCCGCAAAGGCAAAGCAGCAAAAGCAGCTGAAGGTGTATATTGCAAAAATAATTACATTTGATTGTAAAAGGCCAGCGTCTCTAGGCTGGGGACTCGAGATCGGACCTCGGCCTGCCAGAGAAAAATGTGCAACATGATTGCAGTTTACAGCCCCAGCAGCAGCAGCAGCAGCTAAAGCAGCAACAACAACAGCAGCAGCAGCAGCAAAACCAACAGATAAAATGCATTTACAATTGAATTTGAA
Sequence-based reagentDm_ush_16p4This paperGAATGTTGCTGCGGTGGCATGTTGTTGCTCGTCGAAGTTCAGCCGATGTTGCTGCTGCTGCTGCAGCTGCTGTTGCTGCCATTCCCACTATCAACCGATGGTAATCGAAGGAAGCGACATTTATGCAAATGCCAGGTTGTTTAATAAACGCAAATTATGAGCCCGGCAGCAACATGTTGCAGCAACAGTCGATGGCAGATTAGCGACATTCATACTTGCACTTGGGTCAATTTAAATTTGTGCAACAGTGGCAGCACGGCACGGCAGCAACTCCTTGCCGCAGCAGCAGCCGCTCCAGCAGCACATGAGATTGTGGGAGCAACAGGCAGCATTATTGTGTTCGGCCAAGATCGCAATTGATCAGTGTGTTGGTGCTGGTGTTGGTGTTGCAGTTGCAGTTGCAGTTGCAGTTGCAGTCGCTGTTGCTGTTGCTGTTGCTACCCGACGACAACAGTTGCTGCTGTGCTGGTGCTAGTGCTTGTGCTTGTGCTTGTGCTTGTGTTGCTGCTGATCAAGCGATCAAGCACCGCAGCCAAAAACAATCGGCGCTGAGCGTGCTCACACGAAATTTTCAAGTACTGCGACAATTTCCATGCCCCCAGCCGCTGCCTTGTTATCAGCGCGCCATGCAACAGCAACAGCGACTGCAGCACAGGCAACAGCAGCAACACATCTCAACAGTTGCATCAATTGCTCAACATTGAACTCTGGGCATGGGCCCAGACGATCACCCCTCCTGGGGACCCCCTTCGGTCGCCCCTGCCCCAGTCCCTTGTATCATTTGCACGTTTTTTAATTAAGACATCAAAA
Sequence-based reagentTc_ex_9p0This paperATAGTTCTAAAGTTCTAAACTATTTGCAAATGTAAACAACGACCGACATTTTCAACATGTTCGGGTGTACGTCGCTTTGAATATGGAAAATGTGATTTTGTAGAGAAATTTGGTGGCGTAGCCGTTGCCAGCTCCAGTTTCTTAAGGCAGGCATCTGGTAGCGCGCATCACAGCAGGCCGGGCCAGGTCAATAAAAAATCGAGTCAGCCGTGGGCTAAAAAACGCGACTAAAAAAACAATGCAGAGCCGCGGTTAAAGACAGGTAGCCGAGCTAAGCGGTAGGGGAGAGGGGGATCAGGATCATTTGTATACTCGGGGAATGTGCCCGACCCGGTACACTCGATGCCAAAACGAACACGCCGACTTATACAGATGCCTATCTGACAATACGGACACTTTTAAAAAACACTTTTTCGGTTTTTAAAAGTTAAAGCCAACAAGCGCTGTGTTTTACACAATTCTTCCAATTTCCATTCCCAAGTTGCAAAAGTGAAACGTCCCAAAATATTTTGTCGCGCAAATCAAAAGTATATTTTATTCATGAGGAGCTCGTGTAATTTTTATGTAAATTTTAATTATTGATAACAAGGGACCATTGTTTGACGACACTTTCTTCGGCATGCGGAGCTCCTTGTTTTGT
Sequence-based reagentTc_klu_8p6This paperTTATGAGTTTGGTTATCGTCGGAATCGGGCATTCTTCCTTTGTATCCGATTTTTAGGTAACAAGCTAGAAATTCCAGAGCTACACCTACGATCCATCAAGTCGGAGCCGTTCTAATTGGCCGCTCCTATCTATCGTCTGAAGGAGGCAGCAAGCAGCACACGAACACCTGGCTGCCAATGCACCTGATGCGGTCGTCCGTCGCTGCTATCAACTCACAATGTTTACTTGTGCTTACGCCAAAATTATGACGAATATTAATGCGGCCTCGTACGCAGGCAGGCATGCGGCGTAATTACTACCGGAGGGACCTTATCTCGAATTATTATGCAGCAAGCAGCAGTGAAAAATAGCGCAACGCCTGCTGCACTCATCCAGATCTGACAAAGATGAAGACGTCGCTGACATCTTTATGATTGTGCTTTTATTGCACCTTTTCGCGGAATTCCGACCATTCGAAGCGACCTTTGCGCTACGGAGAAGAGGAAATTTACACCGGGAGTTGACTTATGATGGGAGAACCACTCTCAACGAACGCAACTACTTTCCAGGAATATGAGAAGAGTGCTTAACTGACAAGTCCAAATTCGAACTTGAGGTTA
Sequence-based reagentTc_ush_6p8This paperTAAATAATCCAGACATGCATTGCATGTAAGTATCAGAGATACACGGTAAGAGTGGAGCTTTTCGAGAATCCGGAAACCGATCAGATAAGTTCTGAAAATGACTCGTCCACGAATAGATTTAGGATCGGAGCTGTTTTCCATTCGCCGAGATAAGTCGATAAGTTTCAATAAGTCCGAGTTCTGGCAACAGCCAGCACGGTACGGGCTGCCGCCGTCTTGGTTTCCAGTTTTCTCCAATGTCGTGGTATTAATCAGGGCGTTATCTCTAGCACATAAACACACGTATGTGTATGTGGGGCGATGTCGGTGGCGATACCGTTCCATGTGGGGGTGTAGCTGTTGGGGGTATACGGGCCTGTTCGCCGTCCGATAGCGCGAAAGATACGACCTGGAAGTAGGAAACGAGACAGCGAGATAAGAAAGTAATATGGCGGCTGCTGCAAAGAGATAACGACTGATACGCGCCTGCACCTTTCCCGACCTGCAACTCTACGTGCCCATTATTTTTGGAAAATTCAATGAGAAATCCG
Sequence-based reagentAm_ex_20p3This paperTCTTGAGATCTTTCTGCATATAGCCGTGGTCTTCTTGCCCTCCCTCGCCCTCTGGCCCCGGACACCATCCACGGAGCTCCTTCCCTTCCCTTCACCGAATATACTCGGCTGTGCAGCGCCTGCTACCCTCTCGCTCTACTCTCTTGCCTTTCCACCAGTCATGACAAGCCGGTCCGACTGGTACCCCCACCAACGCGGCCGGACGGACCCTTCTTGGGCCTCGCAAGGGCCCTGCGGGACCCCTCCCTCCTACATTCCAGCGGGGCCCCATCACGGCGAGGCTGAGCTGGCGGGTTTTGAGGCGCGCGAGCCATGCCACGACAGGAAAAAAATGCATCTGAAAAACGAAAACAAGTAGAAAAAGGTGGCTCACACCCCTGCATGCGTGCGTCGGTTTGCGTGAACGTTGCCCGGACCCCGTACCGAGGCCTCCTCCTCCTCCTCCTTTCTCCTCCTTTCTTCCTTCCTTTCTTTTCGCTCCTCTACCTCCTCTGCGCGCCTCTTTACGCTCCTCTTCTTGCTCTACGCTCTCGCGTACGTGCCCGCAAACTGCTGCCTGCCTGTTCAACGCTTCTTCTTCGTTTCCTTCTTCTTCTTCTTCTTCTTCCTCCTCTCTGCTTCGTCCTTGCGTTTCTCTCGATTCGCGTCACATCTCCGCTCCCCCAAATACGTTTCCTTTCTAAGATCGTTT
Sequence-based reagentAm_klu_20p2This paperTTATTTATCGCCCTCGAAAGCGCTCGTCCTCTGCAGATTTCGATCGAGTCGTTCGACTTCGATATAAGAATTTCAGTGTAAACGCGATACACGTTAAATAACGAATATTTACGGACAAAGTCGGGCGAACGACGCGATCCTGGCCGCTCGTGGCCGATGCGCAGGAAGGTAGGAGAGCGGAGAGGTTTTACGCTGTTCGCGGAGAGGAGGATAGGTTCGTATAGCTCCTTAAATCAACCCTAGTTGGCCTGTCAACCGAGTTGGCGCGCGCGCGCATTCCCTTTCGCGGCGCACAAATTACCACGCGTTTAATTACCGTCCGATATACGAAGCAGGCTCATTAATCACCACGCCGATAACCCGTAATTTTCCAGCAACGATAAAATCTATCGCGCGACACCGGCTCTCGCGACTTTCCTCTCTCTCTCTCTCCCTCTCTCTCTCTCGTTCGAGGAGAAAGGAGAAAAGGAAGCAGGAGGACGGAGGAGGTGTGCAGAGCGATCCTGTCGCCGCTTCCATATAGATTTTTTTTCTCCCTTCCGCGCTCTTTCTCGCGCCAGTTCTCTTCGTGCGGCGGAAAATAGAGCGGCGCAACTCCCCTTCTCGCGACTCACGGAGGGCGAACAGCTGAAGCCGGCCGATCGATACGAA
Sequence-based reagentAm_ush_20p8This paperGGGGCTCCCTCCTCCTTCCCTCCTCCTCCGTCTGTCCCAGTTGGTCAGCCACGGTATCGTTTCGACGTCGATTCATCCCTCTTTTTCTCCCCCCTCTTCTCTCTCTCCTTCTGCCCCTCCCCCTCTCCCCTCCTCGCCATCGGCTTCGAGAGCCACGAGGCGATCGAGAGAGAGAGAGAGAGAGAGAGAAAGGGTACCCCATCGATGGATCGATCTATCGATCCACACGGGGATCCACCACGCTCTTCTGCCCTCTCCTCCTCCTCGCCACAATTTCTCTCCTCTTTACGTACGCTTCTCTCGTCCTTCGTGCCGCTCTTCGTCGCCATCGAGATTACGGCGAGCGAGGGGCCGACAGCCGAGGGGCTTCTTCCAATACTTTGTAAGTTTATTTGTATGATCCGCCAATACTTTGTATCTTTATTTATATGAAATCGGATGGCGGATCGAGATTGCTCTCTCTCTCTCTCTCGTGTCGCTCGTGTCTCGTCTCGCTTCTCCCCCCGTTTCCTCTTAAAATTAATTATACGTCCAAGGTGGGCGTAAGAGAGAGAGAGAGAGAGAAAGAAGTCGCAATGAAACCGGAAGGATAAAGAGAATCCGATGGTGCGCACACGCACGTGTATGTACACGTCCACTTTATAACACTCG
Sequence-based reagentAa_hth_35p9This paperTTCCGAACACCTTGATTCAAATCCGAACAGTAGGTACGAATAAATCATACCGTTTTGCTTCGAAATCTGGACACCTAAGACGAAGTGTATTTTCAAACTTGAATATATATATAGTAGAGATGGTCGGGTTTCACATTTTTCAAACCCGAACCCGACCCGTACCCGACTTATTTTATTTCTTCGAACCCGGACCCGACCCGAACCCGAGACCATAATTGAAAAGCAAACCCGGACCCGACCCGAACCCGAAAATTTTTCACAGTGCAAACCCGAACCCGACCCGAAACCCGAAAAATGTTTGTAAAAAAACCCGAATACAACCCGAGTTTGAAAAGATGGTAAATTCATCGTTTCTGATGCATAAAGAAGCTTTTAGATTGTTACTCTGTTCACAATTTTCACCAAACCCGACCTGAACCCGATTCAAACCCGACTTTTTGTAAGCCCGAACCCGACCCGTACCCGATAATTTCGTAGCCTACAAACCCGACCCGAACCCGAACCCGAAAAATTTCAAATATTCAAACCCGAACCCGACCCGAACCCGTCGGGTTCGGGTTCGGGTCGGGTTTCGCGTTTGAAAACCCGAGACCCGACCATCTCTAATATATAGGCAAATTCATACATACTAAAAATCCAATGGTGATTCTTGCTTCGAAATCCGGACAGCATGTGAGAGCCGATTCAAATATTGGACACATTTGCTTCGAATTCCGGACACTTCTATTTATCTAGTTTGTTCAAAGATTC
Sequence-based reagentAa_ubx_26p0This paperAAGTCGGGTTTAGTCGGGTTTGTGTCGGGTTTGGTGAGAATTGTGAACAGAGTGACAAACTAAAACCTTCCCTATGTATCAGAAACGATGAATTTATCATCTTTTAGAACTCGGGCTTTATTCGGGTTTTCCTTAGAAAACTTTTTCGGGTTTCGGGTCGGGTTTGGGTTTCAACAGCGAAAATTTTTCGGGTTCGGGTCGGGTCCGGGTTTGATTTTCAATTATTGTCTCGGGTTCGGGTCGGGTCCGGGTTTGAAGAAATAAAATAAGTCGGGTACGGGTCGGGTTCGGGTTTGAAAAATGTGAAACCCGACCATCTCTACTCTTCAGGTAGTCGAGAGTTGTTTTTTTTTATCTTTTATTTTTATTTTAAAGGCACTCTGTGCTCGTGCCCACTACTATGCCGAAATCAGTTCATCTGTATCTTCTTCACCGATTAAGATCTATTTTTAACTAATCTATATTTAAATCTACTTTCACTCTCTTCTACTCGTTTGCTCTCATACCGAGCAGGTAGGAGAGTGCTCTGCTGATAGTCCAATCGATTTCCATAAGCCATAGTTCCATTGCTCTTGCGGTGGTTCGTTTTGCCATGTTCCTGAGTCGTTTGAGGCTAGCTGCCTGCGAAGTGGGTCAGTTTGTCTCAG
Sequence-based reagentAa_psq_21p5This paperCCGAATTTGTGAGAGAGATAGGAGCCAATGTTTGAGTGATTCCCGCGAAGAATTGAAACCTATAAACGATTCCCACTAATTTTTGCAACATCTGTGATTTTTGATTTGATTTGAAACTGCAACTGACAGAAGATAATCAAAATACACTTTTTTCGCATTCGTACATCAATTGACAACCATCACTTGACACACCTGGCGATATGGACCAATAGGTCTGTGCCACAATAAGGGAGAGAAAAAAAAAAAAAAAGTGTGAAGCAAAAACACGCACATGTAACTTAAAGCACCACAAGAACCCTTTCAGCACCGGCCGCTTATGCTGATTTTATTAAAAAGCTTTATGCATACATGTACATAAGAGTGAGCATGCCGAAGCTCGAAAGTGTGTGTATGTGCGAATGCGCCAAAACACGATTATGTTTTCGTTTGTATTTCTTTCTTTTGCCGGCAAAAATTCTGTGTTTCGTTTTTTGATAGTAGGTAACTATGCCCACACAGTTACGGATCACACATAGTCATGGATCACTTTGGCGTTCAACATCGGATAACTCGCTCAAAACATATTTGCATGTGATGTAAACATATTTTTGCCAAGTCATAATATTTGTCTTCTGTCATTTGTAATATCAAATAATAAGCATAACATTAAACCGCAAAATAATGGTGTTTTTGAAAAATGTTTAGTTTGTATTGCCAAGCTATCATTAAATAGTCATTTATTGTAAGAAGTGCCATCAGCATTCTCCTATGCTTTTAGGTGATAAAATTCAAATATTATACATAATAGTTCCTCGTTCTCTTGAATTCAGTATGATTTCTTTGTTAGAAAACATTTTTCTTGTTTGTTGATACTGAATATTAGCAATTCCAACTAGTGATATTAGCAATTC
Sequence-based reagentTc_hth_15p5This paperTAATCTTTTAATTTAAAGCGTAGCTGAGCAGCTGGCTCTAATTCCACTTTCCTTATTTGGTTTCGTTGGTGTGGATTTTTGAAACGGATTATTTCGAGAAATAATAGTTTATTAGTGGTGGAAATAATGAATGGGTCTGGAGCGAGTTCCAGAGTGCGATTGGTTGGTTAGCGGGTAAATTTTTAAAAAGTGGGTGTCTTCTCCGACGGCAATTTAACGATCGTAACGACGTCGTCGCTAATTAGGCTCGTTGAGGCCGTCGCTAGATCGATAACACAGGCTGCGACATCGTCACAATGCACCGGTCGGGTTACACATCGGAGTCCGTCTCCCGGGGGCCCGTCTCAGATTCTCCGTATTAAAACACCGACATGTAAAAATATGGAAATTGCGCGCGGCAGAATGCGGTCCGATCAACCGGATGGCCATCGCGCATCGCTTTGCATTCGCAGCCGCATTTAAATTGCTAAAAGGGGACACTATCGAGCGGTCCATCTCTCTCGCAGCGTTGCGATATTATAATCTTGTTGCAAGGTAAATGCACATAACCGGTTACCCCAGACAGACGACGTCTTTGACACGAAAAAACCTGCCATCTATGTACAGCGGATCCTAATTTACGGCCTTATTCCATGTCATTAAGAGCATACGGGACGGACACGTTTTTAGGAACTTCGGACCCGACTTATCTCCGCGGACCGATAAGGAAATGTGCCTCTGGACACCTAACTTTGCCGACCAACAAAATCATAACGCTCGCTCTATGCCCATTGGGCAACACGAAAAAACCTGCC
Sequence-based reagentTc_Ubx_17p4This paperTGCATGTATGTCGAGTGGGTCCGGATGATGCGAACTCCCGCCGATTTCTTCGCAATCTGCAAATTCGCTCAAGTAGCTTAATAACAATGACAAAAGTGAGGCGGTATATTTCCGGCCGTCCGTTGAAAATTGTAATGATGTTATTAAAATTATGACGTGGCCGTGATGGTCGCCGAATTCTGGCGAAACGGCCGCGTAAAAACGGCACATAATTGGCTGACATTAAGATGTATCTGGAGATGTTTTTCGAATGCCTTCGTCCGGCGCGAATGCCTGAATAAGCGGCAAAGCTCGGAAAGCTCTTATAAATAAAAATGTACGGAGCCAATCAGATCGGCGAGTAAAAAGTACGTCTTTTCTTACACCAGAGGATCGCAGCTGCCGCAGAATCCGGTCGCGGATAAGAAATAAGAAGTGCTGCATAAATGCATTGATCATTCGCCGGGTCTCCGTCTGCTGTTCCTCAGCGAGAAAACGGGTTTAAGTCTGGATACTTTTGGCTCTCTGGAAAGTGCTTTTTGCATTAAGCTGCCGAGAGAGAATAAAGACGTTTGCGGTGTCGGACGGTGACCAATGCTGCTGCTGCTGCTCTGCCTTCCAAGTGCGTGCTTTAAATCTTCCACTTTGCAAGTAAATCGAGACGAACGCTGAATATTTTACACGAACACTGTTTATAGCCCAAATAACAGCCTTCCAAGGGCGGCCACCATCAAAAAATGGAGCGCTCAAACCCGAAATATGGGCGGGCGAAAATTATTCAAACCACAAAGCGAGGAAATCAGAAATTCAAAAATTGACGGCTTTCAACTCAGGACTGAATTTTTATAAATTTTTGTTCGCTACTGCAATTTGGGACAGAAAATTACATCT
Sequence-based reagentTc_Ubx_19p9This paperCGTATTTAAATATCGTTAGGTTCGATGGTAAAATTGGAGAAAATTGTCGCGCGCGTTTAAGACAAAGAAAATTCCCGTCGGGTTATCAATCTTGGGTTATCTGTACCCTCGGGCCGAAAAACTCTGTAAAGAAGAGACAAAAGGACGTGACAGTCCAATTTCCATTTCAGATCGAAATTGTTCGCCCCCCGGAAGTTTATCGGGGCCCGTTGGCGGAATTAATAAATTGGTGCGCGACTTAATTGCGGCGATAAAGAAGAGAAGAACACGAATGAGGGACGGCGACAAAAATATTATTTGCTCGTGAACGAGGAGGCAAAGGGCATTGATATCTCGTGCAACGCCGGATATTGGCTGCTTCTGGTCGCGGTTTGCGGGGCTTCTAAGACTGTGCAGGGTTTGGGGAGCGGCCCCGAGCTCGAGAGAAATTATGTACGAGGCATTGGGAGCAATATATCTCCGGCGGGACGTGCCAGACAGAGTAGACGGGGTATTATATAGGAAGGAAGGAACCTGAGGCCGGGGCCGGAGCCTCCTCGTCCCCAGGCGCTCGTCCCCCAGAATGAGACACTTGCCGCCAAGTCCACCGCCTTAAATTGTCATCTGAAGAAAGAAACTTCATTACGAACTACGCCCTCATTTCTTTGCGAGGCGATCCATCGCGCAAAAGCAACGCACGCATTTTGCAACAACTATTCAACCACTAAAATTAAACGAATTTCAAACCTATTCCGGATTAATGATTTCCTCCTCGATTCAAGCTAATTGGGTGTTTCCTAG
Sequence-based reagentTc_psq_19p7This paperGGATTTTTTAGATAGATCATCAAGTTAAAAGTGCTTCGAATATATGTCATCAAAAATAAGATCAACTGATGGCTTTCTTTGCTTTATTCCCAATCTACTGTTAGAAAATCAACAACAACTAAGTTTTCTGTAAAATATAGTTCTTTCGGTGGCAAGAATAATATTATAATCGGGTTTCTTCTGCCTTATATTCTGTTTTCTTTGCTCCTATGTTAGTGCAAGTGTGTAACTTGGCGAACTCTTTCGAATTATCAAGGAAGTGTGAGTTTTATGAGAAAACAGCTAAAGTCGCCCCTAATTTGTTGACTTATTTGCTTTCGTTGGTTCTCCCGTTCTTTGGAGTATGTCGTCCGGTTTTTCTTATGAGCCATAATTACAAATTTCCATTTTCGGTTTTCGGCTCGCGTTCGTTTTGGAAAAGAGCGAATGTGCGGCGCGTTCATTTTCAATTTTGCGCGACCGTCCGACATTTTCCAATTTTCCGTGCAAGGACGAGGAGCGAGTGCAAAAAATGGCAGTCCTTGTCTGCAAAAAGCCCCAATTAAAACCGAAGTTGTAGTAGTGCGTGCGCCGAGCATTTCTCTCGATCTATCACGGGGTAGCAGCATCCCTCCGTAGGCTCACTCTCTGGCCAGTCTTAGTTTGCGCTTTCCCCGGAATTCACTGAAGGTCGTCGAGGTCGCAAGTAAGTACACAGTGCATGTGCACTTGCATGCATGCTTGCACTTTCTGTGCCCCCGCGCGCCGCCGCCGCCGCCGCATTAGCGTCTCTGTTTTGGTCCTTATATCCATCCGCTGTTCCCTTCTTCTGTCTATCCTTCAACTTCCTTCGCCGCTCGCCAGCTCCGGGACGCCACTCCATCATAAAACTGCGACCGCAAAAGCGACACTCATTATCGATTGCTCCAAGACGAATTAAAAGCCGCAGCGCTCCCCAAAACCGGGTTATTTTTTTCGGAATTTTGCTGGCTTGGAGCGGACTCCCAGACGATCCCCGGACTAATCCGGAGGGTTGCCTGGCGAGCGGCATTCGGCTTTAGGCTCCGGGGCACGCATTGGGGGAAAGTGATGCGGTGTCTGGACCAATCAATACCGGGTTCAAGGACGGCTTCTCTTATATGTGTATGTGAGCTTCCTTTTCCCGCTCGTCAAAACGGGACAAGACGGGAATTAATTGCACGACAATTGGGACGCCGACTCCACAGATGGGGCGACAAAATGGACGCAACGAACTAAATCTATTCACTTTT
Sequence-based reagentAm_Ubx_0p39This paperTCTCTCTCTCCTCGAGTGTAGCATATATCCATTCCACCATCGATCGAGGATTTCGATCCCCCTTGGACTCATGCTGCGATATTCGATCGTCCCTCCCCCCACTCCTCCGCGCTCTCATTCGATCCTTCTTTTCTTCCCTCCCCCCCAACCACTTTGATCCTTCTCTCTCTCTCTCTTCCTTCTTTTTACTTCTTCTTCTTCTTGCTGCTGCAACTACCCGCTGCCTCTAACCGCTAGCCGGACAAAACATTTCTTAATTGGGTTTCGTTCGGAAAGAACCGTCCGATTTCGTTTCGCAAGGGATCCAGCCTGCTGCTTCTGCCGGTTTTACCGCGTCTCTACGTGGCTTCTGTCGTTCCCTCCTCGTTCTGCTTGCCTTTCCTTCGAACGATTATTTATTTCGTCGTTCGAATTCCTTATTTTTCCATCCTGTTATCCCTTATTGTAATAAAGTAAAAATAATTGAATTTTCCTTCGAACGAGCGAAGTTTGTTCTAATC
Sequence-based reagentAm_Ubx_37p2This paperAGAGAGAGAGAGAGAGAGAAAGTCAGGCAGACGGAGACAGAGGAATGGGTTGGGTAAGGGGGATAGAGTAGGGGCGGGAGGGCGTTCCACGGCACCCTGCATGGGGTAGCTTGCAACCTCACGCGACACTAGAGCCATCTATATCCCCGGAGATTTATGAGTTCCTGGTGCAGCGGCTGCTCGCAGCAACTACACACCACGCAGTATCGGGTCCGGTGTTGGTGCTGCCCCCTGTCGCGACGGGCGTGCTGTTGCCCCGCGGGGGTTACGCGCAATTCGCGCTCCGTGCAACGTCGCCCTGATAAAAAACTCTTGCGACTCGATCTCAATCCCGATGCTTCTGCGAACCTTCCCTCCGTTCTCGCGCCGTCTCGTCGTGCGCCCGTCGCGGTCGTCTCTCTCTCTCTCTCGCTCTCCGGTGTTGGCGGGCTATCGGATCTTCTCTCTCTCTCTCGCTCACTTGGTTCGCTTCTTGCTTTCGATGCGACGCGACGACCAGCGATCTCACTTTCTCTCTCGCTCTCGCTTTCGAGCTCGCACTTGAAATATCGATCATCATTGTGTTTCCTACGCATTTGTAGACCGCAAACGCGAAATTATTATGGGCCTGTGCACGTTTGAATTTCTTATATTCTTTTTTTTCCATTATACGCTAGGTTAGCGTAGATATAATTCTGCTAAATATAGTGAAGATAATTCGAATTTAAATTAAATTGAAAATTTTCGTATTACATAATACTGTTTCGTTATTTATAACTGATTAGAATATTTATTGATACCAAATGAAATTTTTGGTAAACTCTCGAACATTGTTTCATTCTTCTATATCGTATTGGTGAAAAATTACATCTCGATTTTTTCTCACGAACTTATATCGCGGTAAAAGAACTGTGGACAACTGTGCAGCATCTCCTCGCTCGATGAAGTCATTTGAACGAGCATTCCTCGGCCGATCTCAGATACAATCTCCTTCAAACAAAGAGCTCCATTGCCGCGTGCA
Sequence-based reagentAm_psq_29p2This paperTCGCACGAGTACATAACGCTACCTTTGTCGCGTCGAAGGTAGAGGCACGATTCTGTCCTTTCCCGTTCTCTCGCGAACCTTGCATCCGTCTTCGTCTCGCTGTGGCCAAACGCGTGCTAGGTCTTCGTCTTCCACATTCCGTCTCGTTCGTTTCCGCACAGACTATATTTCTGTTCTCGTTTAGCCGCGGAAAGTCTTGCTCGCTCCCACGGGAACCACTCGTCGATGCTCGTCGCTTAACCGTCAGAGGCGAGCGCGCATTTCTCTCAAACACCGCAGACTTGCCTCTCCGCCGATCCCCGTTCCCACCCCCGGTGCTCGATGCTCTCTGTCACCCCTCCACCAAACGGACTCCTACCGGCCCGCTCCCCTCGCTTTGCGCCGCTTTCCACCAACCGTCCTGCCACCCGCCGGTTTTCAACCCCTTTCCCCGCTCTCTCGGCGACTGGTCAGGTGCGCTCGCTCGCTCGCTCGCTCCACGCGTACGCTCAATCGCTCTCTGTCCACCGCCGAGCACGCATCCCCCGCGAGTCTCTTCCTCGTTGTACGCGCTCGAGCGCGGATTCAATCCGTCCTTGTTCGTCGCGTCGGCGAATTTCGCGGCGTCCTCCGCCGCCGCCGCCGCCGCCGCCACCTCTTCCTCCTCCTCCTCCGCCTCCTCCTCCTGCTGATACTCCTCTTCCTCCTCGGT
Software, algorithm | SCRMshaw pipeline | Asma et al., 2024, doi: 10.17504/protocols.io.e6nvw1129lmk/v2
Other | REDfly database | Keränen et al., 2022, PMID:35886794 | RRID:SCR_006790 | Database with results information, see “Results: An insect regulatory annotation resource”

Additional files

Cite this article

Hasiba Asma, Ellen Tieke, Kevin D Deem, Jabale Rahmat, Tiffany Dong, Xinbo Huang, Yoshinori Tomoyasu, Marc S Halfon (2024) Regulatory genome annotation of 33 insect species. eLife 13:RP96738. https://doi.org/10.7554/eLife.96738.3