Overview of TaG-EM system

A) Schematic illustrating the design of the TaG-EM constructs, where a barcode sequence is placed in the 3’ UTR of a UAS-GFP construct, which is then integrated at a specific genomic locus using PhiC31 integrase. B) Use of TaG-EM barcodes for sequencing-based population behavioral assays. C) Use of TaG-EM barcodes expressed with tissue-specific Gal4 drivers to label cell populations in vivo upstream of cell isolation and single-cell sequencing.

Structured pool tests

A) Overview of the construction of the structured pools for assessing the quantitative accuracy of TaG-EM barcode measurements. Male and female even pools were constructed and extracted in triplicate. The table shows the number of flies that were pooled for each experimental condition. B) Barcode abundance data for three independent replicates of the female even pool. C) Barcode abundance data for three independent replicates of the male even pool. D) Barcode abundance data for the female staggered pool. Inset plot shows the average observed barcode abundance among lines pooled at each level compared to the expected abundance. E) Barcode abundance data for the male staggered pool. Inset plot shows the average observed barcode abundance among lines pooled at each level compared to the expected abundance. For all plots, bars indicate the mean barcode abundance for three technical replicates of each pool, error bars are +/− S.E.M.
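The expected abundance that the observed values are compared against can be sketched as follows. This is a minimal illustration that assumes each fly contributes equally to the extracted DNA, so a barcode's expected fraction is its share of the flies in the pool. The pool compositions and the function name are hypothetical and are not the values from the table in panel A.

```python
def expected_fractions(flies_per_line):
    """Expected fractional abundance per barcode, assuming every fly
    contributes the same amount of template (an assumption)."""
    total = sum(flies_per_line.values())
    if total == 0:
        raise ValueError("pool contains no flies")
    return {bc: n / total for bc, n in flies_per_line.items()}


# Hypothetical pools: equal numbers per line (even) and unequal numbers (staggered).
even_pool = {"BC_a": 5, "BC_b": 5, "BC_c": 5, "BC_d": 5}
staggered_pool = {"BC_a": 1, "BC_b": 2, "BC_c": 4, "BC_d": 8}

even_expected = expected_fractions(even_pool)
staggered_expected = expected_fractions(staggered_pool)
```

In an even pool every line has the same expected fraction, whereas in a staggered pool the expected fraction scales with the number of flies pooled for that line.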

TaG-EM barcode-based behavioral measurements

A) TaG-EM barcode lines in either a wild-type or norpA background were pooled and tested in a phototaxis assay. After 30 seconds of light exposure, flies in tubes facing the light or dark side of the chamber were collected, DNA was extracted, and TaG-EM barcodes were amplified and sequenced. Barcode abundance values were scaled to the number of flies in each tube and used to calculate a preference index (P.I.). Average P.I. values for four different TaG-EM barcode lines in both the wild-type and norpA backgrounds are shown (n=3 biological replicates, error bars are +/− S.E.M.). B) The same eight lines used for the sequencing-based TaG-EM barcode measurements were independently tested in the phototaxis assay, and manually scored videos were used to calculate a P.I. for each genotype. Average P.I. values for each line are shown (n=3 biological replicates, error bars are +/− S.E.M.). C) Flies carrying different TaG-EM barcodes were collected and aged for one to four weeks, and then eggs were collected and egg number and viability were manually scored for each line. In parallel, the barcoded flies from each timepoint were pooled, eggs were collected and aged, and DNA was extracted, followed by TaG-EM barcode amplification and sequencing. Average number of viable eggs per female (manual counts) and average barcode abundance are shown both as a bar plot and scatter plot (n=3 biological replicates for 3 barcodes per condition, error bars are +/− S.E.M.).
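The scaling step and the preference index in panel A can be illustrated with a short sketch. The legend does not state the P.I. formula, so the commonly used form (light − dark) / (light + dark) is assumed here. The barcode labels, read fractions, and fly counts are hypothetical.

```python
def scale_to_flies(read_fractions, n_flies_in_tube):
    """Convert per-barcode read fractions in one tube into estimated fly
    numbers by scaling to the number of flies collected from that tube."""
    return {bc: frac * n_flies_in_tube for bc, frac in read_fractions.items()}


def preference_index(light, dark):
    """Assumed form of the P.I.: +1 means all flies on the light side,
    -1 means all flies on the dark side."""
    total = light + dark
    if total == 0:
        raise ValueError("no flies detected for this barcode")
    return (light - dark) / total


# Hypothetical sequencing results for one trial.
light_tube = scale_to_flies({"wt_BC": 0.75, "norpA_BC": 0.25}, n_flies_in_tube=40)
dark_tube = scale_to_flies({"wt_BC": 0.25, "norpA_BC": 0.75}, n_flies_in_tube=20)

pi_by_barcode = {
    bc: preference_index(light_tube[bc], dark_tube[bc]) for bc in light_tube
}
```

Scaling by the number of flies in each tube matters because read fractions are relative within a tube: a barcode can make up the same fraction of reads in two tubes that contain very different numbers of flies.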

Gal4 driven expression of GFP from TaG-EM lines

A) Detailed view of the 3’ UTR of the TaG-EM constructs showing the position of the 14 bp barcode sequence (green highlight) relative to the polyadenylation signal sequences (underlined) and polyA cleavage sites (red highlights). pJFRC12 backbone schematic is from (Pfeiffer et al., 2010). B) Comparison of endogenous GFP expression and GFP antibody staining in the wing imaginal disc for the original pJFRC12 construct inserted in the attP2 landing site or for a TaG-EM barcode line driven by dpp-Gal4. Wing discs are counterstained with DAPI. C) Expression of GFP from either a TaG-EM barcode construct (left column), a hexameric GFP construct (middle column), or a line carrying both a TaG-EM barcode construct and a hexameric GFP construct (right column) driven by the indicated gut driver line (PMG-Gal4: Pan-midgut driver; EC-Gal4: Enterocyte driver; EE-Gal4: Enteroendocrine driver; EB-Gal4: Enteroblast driver).

Expression of TaG-EM genetic barcodes in larval intestinal cell types

A) Number of doublets removed due to co-expression of each of the pairwise combinations of TaG-EM barcodes. B) UMAP plot of Drosophila larval gut cell clusters after TaG-EM barcode-based doublet removal. UMAP plots showing gene expression levels of enterocyte marker genes C) Jon99Ciii, D) betaTry, E) LManVI, and F) the TaG-EM barcode (BC4) driven by the EC-Gal4 line. UMAP plots showing gene expression levels of enteroblast/ISC marker genes G) esg, H) klu, I) E(spl)mbeta-HLH, and J) the TaG-EM barcode (BC6) driven by the EB-Gal4 line. UMAP plots showing gene expression levels of enteroendocrine cell marker genes K) Dh31, L) IA-2, M) Orcokinin, and N) the TaG-EM barcode (BC9) driven by the EE-Gal4 line.

Sanger sequencing identification of TaG-EM barcode lines

A) Summary of barcode pool injections. Barcode sequence and transgenic vial identifier in which the barcode was identified are shown. B) Sanger sequencing-based confirmation of the barcode sequence and PCR handle in TaG-EM transgenic lines.

Optimization of TaG-EM barcode amplification

A) Gels showing bands produced when amplifying TaG-EM flies or a wild-type control with the indicated polymerase, annealing temperature, and primer pair (short = B2_3’F1_Nextera/ SV40_pre_R_Nextera; long = B2_3’F1_Nextera/ SV40_post_R_Nextera). B-E) Mean error (R.M.S.D., root mean squared deviation from the expected value) for the even pool amplified with the indicated primer set, input amount, and cycle number using KAPA HiFi polymerase (n=3, error bars are +/− S.E.M.). F-G) Mean error (R.M.S.D., root mean squared deviation from the expected value) for the staggered pool amplified with the indicated primer set, input amount, and cycle number using KAPA HiFi polymerase (n=3, error bars are +/− S.E.M.).
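For reference, the R.M.S.D. error metric can be computed as in the sketch below. It assumes the deviation is taken per barcode between observed and expected fractional abundance and averaged over all barcodes in the pool. Whether the plotted values use fractions or percentages is not stated in the legend, and the example numbers are hypothetical.

```python
import math


def rmsd(observed, expected):
    """Root mean squared deviation of observed from expected abundance,
    taken over every barcode listed in `expected`."""
    if not expected:
        raise ValueError("no barcodes to compare")
    squared = [(observed[bc] - expected[bc]) ** 2 for bc in expected]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical even pool of four barcodes (expected 25% each).
expected = {"BC_a": 0.25, "BC_b": 0.25, "BC_c": 0.25, "BC_d": 0.25}
observed = {"BC_a": 0.35, "BC_b": 0.15, "BC_c": 0.25, "BC_d": 0.25}

pool_error = rmsd(observed, expected)
```

A lower value indicates that the amplification condition recovers barcode proportions closer to those expected from the pool composition.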

Oviposition tests with TaG-EM barcode lines

Plots showing mean TaG-EM barcode abundance for adult females used in oviposition experiments (top) and eggs collected from these females (bottom). Data from two independent trials is shown (n=3 for each trial, error bars are +/− S.E.M.).

Fecundity data for individual TaG-EM lines

Manually collected data for mean number of viable eggs per female, barcode abundance data, and barcode abundance data normalized to adult fly barcode data for each of the TaG-EM barcode lines used in the age-dependent fecundity experiment. Scatterplots show correlations between manually collected data and barcode sequencing results. Data from two independent trials is shown (n=3 for each trial, error bars are +/− S.E.M.).

Average age-dependent fecundity data for Trial 1

Average number of viable eggs per female (manual counts) and average barcode abundance are shown both as a bar plot and scatter plot (n=3 biological replicates for 3 barcodes per condition, error bars are +/− S.E.M.). Data from Trial 2 is shown in Figure 3C.

Expression driven by dpp-Gal4 for 20 TaG-EM lines

GFP antibody staining in the wing imaginal disc for the indicated TaG-EM barcode line driven by dpp-Gal4. Wing discs are counterstained with DAPI.

TaG-EM line GFP expression driven by different Gal4 drivers

A) Comparison of GFP expression in larvae for the original pJFRC12 construct inserted in the attP2 landing site (left) or for a TaG-EM barcode line (right) expressed under the control of the indicated driver line. B) GFP expression of the PC-Gal4 (Precursor-Gal4) driver line together with either UAS-2xGFP or a combination of UAS-2xGFP and a TaG-EM barcode line.

Dissociated intestinal cell viability

A) GFP expression visualized in dissociated cells from gut driver lines crossed to hexameric GFP and TaG-EM lines. B) Proportion of live (left) and dead (right) cells post-isolation and flow sorting as assessed by GFP expression and propidium iodide staining.

Expression of TaG-EM genetic barcodes in larval intestinal precursor cells

UMAP plots showing gene expression levels of A) enteroblast/ISC marker genes esg, klu, and E(spl)mbeta-HLH; and B) the TaG-EM barcodes 7, 8, and 9 driven by the PC-Gal4 line.

Identification of doublets based on co-expression of TaG-EM barcodes

UMAP plots of all 28 pairwise barcode combinations showing cells co-expressing both of the indicated barcodes in blue (True).
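The logic of barcode-based doublet calling can be illustrated with a minimal sketch. The 28 pairwise combinations correspond to the number of ways of choosing 2 barcodes from 8. The detection threshold (`min_count`), the barcode labels, and the per-cell counts below are hypothetical; the actual criteria used to call co-expression are not given in this legend.

```python
from itertools import combinations


def flag_barcode_doublets(cell_counts, min_count=1):
    """Return {cell_id: [barcode pairs]} for cells in which two or more
    barcodes are detected at or above `min_count` (assumed threshold)."""
    flagged = {}
    for cell_id, counts in cell_counts.items():
        present = sorted(bc for bc, n in counts.items() if n >= min_count)
        if len(present) >= 2:
            flagged[cell_id] = list(combinations(present, 2))
    return flagged


# Eight placeholder barcode labels give C(8, 2) = 28 pairwise combinations.
barcodes = [f"bc{i}" for i in range(8)]
n_pairs = len(list(combinations(barcodes, 2)))

# Hypothetical per-cell barcode counts.
cells = {
    "cell_1": {"bc0": 5, "bc3": 0},
    "cell_2": {"bc0": 2, "bc3": 4},  # two barcodes detected: candidate doublet
    "cell_3": {},
}
doublets = flag_barcode_doublets(cells)
```

Because each barcode marks a distinct cell population, a droplet containing transcripts from two different barcodes is more likely to hold two cells than one.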

Differentially expressed genes

Dotplot showing the top three differentially expressed genes by log-fold change for each cluster.

Top 25 marker genes for each cell cluster

Plots showing t-test-based rankings of the top 25 differentially expressed genes for each cluster.

Expression of the PMG-Gal4 driven TaG-EM barcodes

UMAP plots showing expression of the four PMG-Gal4 driven TaG-EM barcodes (BC1, BC2, BC3, and BC7).