Understanding how the brain encodes behaviour is the ultimate goal of neuroscience, and the ability to describe and quantify behaviour objectively and reproducibly is a necessary milestone on this path. Recent progress in machine learning and computational power has boosted the development and adoption of systems that leverage high-resolution video recording to track an animal's pose and describe behaviour in all four dimensions. However, the high temporal and spatial resolution that these systems offer comes at the cost of throughput and accessibility. Here, we describe coccinella, an open-source reductionist framework combining high-throughput analysis of behaviour, using real-time tracking on a distributed mesh of microcomputers (ethoscopes), with resource-lean statistical learning (HCTSA/Catch22). Coccinella is a reductionist system, yet it outperforms state-of-the-art alternatives when exploring the pharmacobehavioural space of Drosophila melanogaster.
This study presents an important open-source resource for high-throughput behavioral screening. The protocols employ inexpensive, off-the-shelf hardware and allow real-time analysis of hundreds of behaving flies. Although these protocols were developed using Drosophila melanogaster, they could easily be applied to other models. The evidence in support of the conclusions is solid, and the revisions carried out by the authors go a long way towards providing the user with an integrated system that is also more user-friendly. https://doi.org/10.7554/eLife.86695.3.sa0
The nervous system integrates stimuli, internal states, expectations, and previous experience to regulate behavioural output. Describing, quantifying, and modulating behaviour are critical aspects of modern neuroscience and, ever since its inception, the field has invested considerable effort in building and sharing paradigms and tools aimed at quantifying behaviours objectively and reproducibly across the most disparate animal models, to the point that this exercise is now recognised as an exciting subfield of neuroscience in its own right: ethomics (Brown and de Bivort, 2018; Datta et al., 2019). As the portmanteau itself suggests, ethomics is not just about describing behaviour (etho-) but also about doing so in a high-throughput fashion (-omics), collecting data simultaneously from a large number of individuals, which can remain undisturbed throughout recording. Irrespective of the behaviour or the animal model to be analysed, the first compromise a researcher faces when choosing a tool for behavioural quantification is between throughput and resolution: a high-throughput analysis allows for powerful experimental manipulations – such as genetic or pharmacological screens – offering unbiased approaches to identifying the neuronal circuits, genes, and molecules underpinning behaviour; high-resolution analysis, on the other hand, promises to identify and discriminate even minuscule differences that may not be immediately visible to the human eye, and to sort behaviours into identifiable classes (e.g. ‘grooming’, ‘courting’, and ‘shaking’) that may be more relevant to researchers interested in modelling disease or in anthropomorphic descriptions.
In recent years, the field has generally converged on the adoption of high-resolution video recording of activity, in some cases adopting cameras with millisecond temporal resolution or developing setups that provide depth information for three-dimensional reconstruction of motion or posture (Wiltschko et al., 2015; Hsu and Yttri, 2021; Nath et al., 2019; Pereira et al., 2019; Gosztolai et al., 2021). Given the recent evolution of machine learning and progress in computational power, even these high-resolution analyses can be at least partly compatible with high-throughput approaches (Wiltschko et al., 2020), especially when applied to small invertebrate animal models (McDermott-Rouse et al., 2021; Ayroles et al., 2015; Branson et al., 2009; Kabra et al., 2013) or when aided by robotic handling (Alisch et al., 2018). These systems, however, can still be prohibitively expensive for most laboratories, and not easily compatible with throughput at the ‘omics scale. Moreover, besides the technical imperative of removing entry barriers and making ethomics an accessible tool, an equally important underlying question concerns the minimal amount of information that needs to be extracted to identify and classify behaviour. Do we always necessarily gain information by extracting micro-postural features or by analysing activity in three dimensions? To what extent might this actually add counterproductive biological noise to some assays?
Here, we introduce coccinella: a new experimental framework that combines high-throughput, inexpensive, real-time ethomics (Geissmann et al., 2017) with state-of-the-art statistical analysis (Fulcher and Jones, 2017; Lubba et al., 2019) to characterise and discriminate complex behaviours using a reductionist approach based solely on one simple feature (https://lab.gilest.ro/coccinella). Coccinella builds on ethoscopes (Geissmann et al., 2017), an accessible open-source platform, to extract, in real-time, activity information from flies. Despite its minimalist nature, coccinella outperforms state-of-the-art alternatives in recognising the pharmacobehavioural space, providing better discernibility at a fraction of the cost, thus opening a new path to high-throughput ethomics.
Drosophila ethomics studies generally rely on image acquisition through so-called industrial cameras, able to collect videos with high temporal and spatial resolution and featuring mounts for a large selection of lenses. Some of the cameras commonly used for these purposes (e.g. FLIR, Point Grey, Basler) (Mathis et al., 2018) are expensive and normally employed in close-up imaging that microscopically highlights the smallest anatomical features of the animal but, at the same time, greatly limits the number of experimental subjects that can be recorded by a single device. Normally, one or a few cameras are connected to a dedicated, powerful computer for acquisition and storage of videos. The cost and physical footprint of these setups make them incompatible, at least for most laboratories, with high-throughput simultaneous acquisition. To lower this barrier, we created a framework that employs the distributed computing power of ethoscopes (Geissmann et al., 2017), allowing for inexpensive analysis of activity in hundreds or thousands of flies at once. Ethoscopes are open source and can be manufactured by a skilled end-user at a cost of about £75 per machine, mostly building on two off-the-shelf components: a Raspberry Pi microcomputer and a Raspberry Pi NoIR camera overlooking a bespoke 3D-printed arena hosting freely moving flies. The temporal and spatial resolution of the collected images depends on the working modality the user chooses. When operating in offline mode, ethoscopes are capable of acquiring 720p videos at 60 fps, a convenient option for fast-moving animals. In this study, we instead opted for the default ethoscope settings, providing online tracking and real-time parametric extraction, meaning that images are analysed by each Raspberry Pi at the very moment they are acquired (Figure 1b).
This latter modality limits the temporal resolution of the information being processed (one frame every 444 ± 127 ms, equivalent to 2.2 fps on a Raspberry Pi 3 at a resolution of 1280 × 960 pixels, with each animal fitted by an ellipse measuring 25.8 ± 1.4 × 9.85 ± 1.4 pixels – Figure 1a) but provides the most affordable and high-throughput solution, sparing the researcher the need to organise video storage or asynchronous video processing to track the animals. In the work described here, flies moved freely in a circular two-dimensional space with a diameter of 11.5 mm, designed to maintain the animal in the walking position (Simon and Dickinson, 2010) while venturing on solidified agar providing either nutrients alone or nutrients and drugs. In previous analyses of activity and sleep (Geissmann et al., 2017; Geissmann, 2018), we found that the maximal velocity of the fly over a period of 10 s best described the basic motion features of the animal, allowing us to accurately differentiate between activity patterns such as walking, grooming, and feeding (Geissmann et al., 2017). We therefore adopted this measure for coccinella too, ultimately producing monodimensional time series of a behavioural correlate, which were then digested using highly comparative time-series analysis (HCTSA) (Fulcher and Jones, 2017), a computational framework that subjects time series to more than 7700 literature-relevant statistical operations, looking for meaningful discriminative features. Features successfully extracted through HCTSA were then used to classify behaviour with a linear support vector machine (SVMlinear) (Chatterjee et al., 2022; Figure 1b), presented and compared here using confusion matrices (Figure 1—figure supplement 1a).
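For illustration, the reduction from a raw x/y track to the 10-s maximal-velocity time series can be sketched as follows. This is a minimal sketch, not the rethomics/ethoscopy implementation, and the column names (`t`, `x`, `y`) are hypothetical:

```python
import numpy as np
import pandas as pd

def max_velocity(track: pd.DataFrame, window: float = 10.0) -> pd.Series:
    """Collapse an x/y track into the maximal velocity observed in each
    consecutive `window`-second bin (pixels per second)."""
    # instantaneous speed between consecutive frames
    speed = np.hypot(track["x"].diff(), track["y"].diff()) / track["t"].diff()
    # assign each frame to a window and keep the per-window maximum
    return speed.groupby((track["t"] // window).astype(int)).max()
```

Each fly thus contributes a single monodimensional series, which is what the downstream HCTSA feature extraction consumes.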
To test the ability of this system to discriminate behaviour, we started by exploring the pharmacobehavioural space of flies fed with a panel of known or putative neurotropic chemicals, comprising molecules previously described in the literature along with uncharacterised ones being considered as potential insecticides (Figure 1c, d and Supplementary file 1). Using an initial panel of 17 treatments (16 drugs and 1 solvent control), we were able to discern compounds with an accuracy of 71.4% (vs. 5.8% for a random classifier – Figure 1c). Some compounds induced behaviours with a particularly high predictive fingerprint: dieldrin, for instance, was predicted with an accuracy of 94% and flonicamid with an accuracy of 87%. For others, our system fared more poorly (e.g. tetramethrin showed 41% accuracy). In all cases, however, the relative confusion was negligible, with all compounds being correctly identified as the first choice and with the first predicted compounds having, on average, a score 15 times greater than the second-best choices (Figure 1c – min.: 3.3×; max.: 65×). To validate our framework and exclude artefacts caused by overfitting biologically irrelevant information, we pursued two lines of control. Firstly, we fed flies with lower concentrations of the same compounds (Figure 1c). Feeding flies with different concentrations of drugs unsurprisingly showed a different effect on short-term lethality (Figure 1—figure supplement 1b), with several compounds hitting a 25% lethality rate before the end of the experiment when fed at the highest concentration (1000 ppm – Figure 1—figure supplement 1b). As the compound concentration was lowered, the predictive accuracy of the system decreased from 71.4% (1000 ppm) to 61.8% (100 ppm), falling to 36.1% at the lowest concentration (1 ppm), indicating that the system does operate on pharmacologically induced, biologically meaningful behavioural correlates (Figure 1c).
A similar drop in accuracy was observed using a smaller panel of 12 treatments (Figure 1—figure supplement 1c). As a second line of work to test specificity, we obtained genetic mutants known to be resistant to specific pharmacological treatments: the paraL1029F allele encodes a version of the α-subunit of the voltage-gated sodium channel conferring resistance to dichlorodiphenyltrichloroethane (DDT) and pyrethroids (Kaduskar et al., 2022); the RdlA301S allele encodes a version of the ligand-gated chloride channel conferring resistance to dieldrin and fiproles (Remnant et al., 2014). Challenging these mutants with their respective compounds confused the classifier, which found it harder to discern drug treatment from solvent control, especially in the case of DDT and paraL1029F (Figure 1—figure supplement 2a). The observed drop in accuracy again suggests that coccinella is working on biologically relevant behavioural signatures, and the fact that some discrimination can still be observed with targets harbouring point mutations – which should severely affect compound efficacy – is indicative of high sensitivity.
Having established the accuracy and sensitivity of the system, we next wanted to test its usefulness in a genuine high-throughput scenario. We subjected a total of 2192 flies to a panel of 40 treatments (Figure 1d), mostly featuring known compounds but also two unexplored molecules (Supplementary file 1). Given that the 100 ppm intermediate concentration showed the best compromise between accuracy and lethality in the previous pilot experiment (Figure 1—figure supplement 1b), we performed this larger screen using compounds diluted at 100 ppm only. Even with such a large panel, the system was able to identify, as a first guess, 39 of the 40 tested treatments (the only exception being the Syngenta Compound #5), with an overall accuracy of 44.5% vs. 2.5% for the random classifier.
A reductionist approach has served us well thus far, identifying with remarkable accuracy even subtle changes as we explored the pharmacobehavioural space of a large number of neuroactive compounds in wild-type and mutant flies. But how does it compare to other more established paradigms? Coccinella is arguably to be preferred in terms of accessibility and throughput, but how much useful information are we sacrificing by adopting a reductionist approach? To quantify any possible loss in information content, we ran a series of parallel experiments in which flies were fed with a selected panel of 12 treatments (11 drugs and a solvent control, Figure 2) and their behaviour analysed either using coccinella or using other widely adopted state-of-the-art methods, which started with high-resolution imaging and employed supervised machine learning for pose estimation (DeepLabCut; Mathis et al., 2018), followed by unsupervised identification of behavioural grammar (B-SOiD; Hsu and Yttri, 2021). To widen the range of comparisons, data were then either immediately clustered using different common clustering algorithms (K-nearest neighbours, random forest) or first processed through a smaller, selected subsample of the HCTSA array (Catch22; Lubba et al., 2019) before being clustered using SVMlinear (Figure 2a). In this challenge, coccinella unambiguously identified 10 out of 12 compounds, with poor performance only for two of them (flubendiamide and tetramethrin) and an overall accuracy of 42.4% vs. 8.3% for the random classifier (Figure 2b). Surprisingly, none of the state-of-the-art high-resolution paths did better than this. The combination of pose estimation → grammar extraction → random forest classification scored second best, with an accuracy of 25.4% but with only three compounds being unambiguously identified (Figure 2c).
The same experimental dataset clustered with even poorer performance when using K-nearest neighbours (Figure 2e), and even the application of HCTSA feature extraction to the B-SOiD output still could not match the accuracy observed with coccinella’s reductionist approach (Figure 2d). This analysis is not meant to be conclusive. We expect that some alternative combination of state-of-the-art approaches may match or even improve on coccinella’s performance, yet the fact that we could obtain such an impressive result with a system that is arguably unmatched in terms of throughput and economic cost is, alone, an argument that gives new weight to this (and future) reductionist approaches.
Finally, to push the system to its limit, we asked coccinella to find qualitative differences not in pharmacologically induced changes in activity, but in a type of spontaneous behaviour mostly characterised by lack of movement: sleep. In particular, we wondered whether coccinella could provide biological insight by comparing conditions of sleep rebound observed after different regimes of sleep deprivation. Drosophila melanogaster is known to show a strong, conserved homeostatic regulation of sleep that forces flies to recover at least in part lost sleep, for instance after a night of enforced sleep deprivation (Shaw et al., 2000; Hendricks et al., 2000). We previously showed that the extent of sleep rebound observed after sleep deprivation only loosely correlates with the amount of lost sleep (Geissmann et al., 2019a), and it remains an open question whether similar amounts of sleep rebound may in fact differ from each other in some inscrutable feature that would underpin a different ‘sleep depth’ (Wiggin et al., 2020; French et al., 2021), similar to what is believed to happen in mammals. Here, we analysed a dataset of 727 flies that experienced different regimes of mechanically enforced sleep deprivation during the 12 hr of the night (Figure 3). Flies were housed in tubes that would rotate after a set time of inactivity, ranging from 20 to 1000 s, leading to different degrees of sleep restriction (Figure 3a, dataset from Geissmann et al., 2019a). In this experimental paradigm, all treatments led to a statistically significant rebound compared to the undisturbed control animals (Figure 3b). We then ran coccinella on two subsets of the panel: the baseline data, acquired the morning before the sleep deprivation (Figure 3c), and the rebound data, acquired the morning after (Figure 3d).
Unsurprisingly, we could not detect any internal biological difference in the pre-sleep-deprivation control set, featuring flies of identical genotype and age housed in different tubes before the sleep deprivation treatment. In these conditions, coccinella could not discriminate, performing exactly as a random classifier would (9% vs. 9% – Figure 3c). However, analysis of those same animals during rebound after sleep deprivation showed clear clustering, segregating the samples into two subsets with separation around the 300 s inactivity trigger (Figure 3d). This result is important for two reasons: on one hand, it provides, for the third time, strong evidence that the system is not simply overfitting data of no biological significance, given that it could not perform any better than a random classifier on the baseline control. On the other hand, coccinella could find biologically relevant differences in rebound data after different regimes of sleep deprivation. Interestingly, the 300 s threshold that coccinella independently identified has a deep intrinsic significance for the field, for it is considered to be the threshold beyond which flies lose arousal response to external stimuli, defining a ‘sleep quantum’ (i.e. the minimum amount of time required for transforming inactivity bouts into sleep bouts; Shaw et al., 2000; Hendricks et al., 2000; Joyce et al., 2023). Coccinella’s analysis ran agnostic of the arbitrary 5 min threshold and yet identified the same value as the one able to segregate the two clusters, thus providing an independent confirmation of the 5-min rule in D. melanogaster.
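The ‘sleep quantum’ rule mentioned above is straightforward to state as an algorithm: any run of inactivity lasting at least 300 s is relabelled as sleep. A minimal sketch in Python (the bin size and function names are illustrative, not the ethoscopy implementation):

```python
from itertools import groupby

def score_sleep(moving, dt=10, quantum=300):
    """Relabel inactivity bouts as sleep using the 5-min rule.

    moving  : sequence of booleans, one per time bin (True = motion detected)
    dt      : bin size in seconds
    quantum : minimum inactivity duration (s) that counts as sleep
    Returns a list of booleans, True where the fly is scored as asleep.
    """
    asleep = []
    for is_moving, run in groupby(moving):
        run = list(run)
        # an inactivity run qualifies as sleep only if it is long enough
        sleeping = (not is_moving) and len(run) * dt >= quantum
        asleep.extend([sleeping] * len(run))
    return asleep
```

With 10-s bins, 30 consecutive inactive bins (300 s) are scored as one sleep bout, while shorter pauses remain plain inactivity.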
Systems neuroscience is experiencing a renaissance, and Drosophila is driving this revolution on the strength of the first full-brain connectome, a plethora of new genetic reagents allowing thermo- and optogenetic manipulations, an abundance of genetic transformants for circuit tracing and manipulation, and multiple tools for large-scale quantification of behaviour. Progress in machine learning and computing power has had a massive impact on the field of ethomics, especially in achieving levels of anatomical tracking that allow mapping of the tiniest movements of an experimental animal with the highest temporal resolution and with little human supervision (Pereira et al., 2019; Mathis et al., 2018). Most of these systems, however, rely on relatively expensive setups and do not scale easily to high-throughput experimental paradigms. They are ideal – and irreplaceable – for identifying behavioural patterns and studying fine motor control but may be excessive for other uses. Here, we introduce a new framework, coccinella, that merges an open-source, economically accessible hardware platform (ethoscopes; Geissmann et al., 2017; Geissmann et al., 2019b) with a powerful toolbox for statistical analysis and clustering (HCTSA, Fulcher and Jones, 2017/Catch22, Lubba et al., 2019). Coccinella is a reductionist tool, not meant to replace the behavioural categorisation that other tools can offer but to complement it. It relies on Raspberry Pis as the main acquisition devices, with the associated advantages and limitations. Ethoscopes are inexpensive and versatile but are limited in terms of computing power and acquisition rates. Their online acquisition speed is fast enough to successfully capture the motor activity of different Drosophila species (Joyce et al., 2023), but may not be sufficient for animals that move more swiftly, such as zebrafish larvae. Moreover, coccinella cannot – and is not meant to – apply labels to behaviour (‘courting’, ‘lounging’, ‘sipping’, ‘jumping’, etc.)
but it can successfully identify large behavioural phenotypes and generate unbiased hypotheses on how behaviour, and a nervous system at large, can be influenced by chemicals, genetics, and artificial manipulations in general. Here, we provided evidence that coccinella can be used to successfully explore and compartmentalise the pharmacobehavioural space, and also showed that a reductionist approach can be employed to discern otherwise invisible shades of a very subtle, naturally occurring behaviour: sleep. The success of Drosophila as an experimental model was built on the many genetic screens of the 20th century. We propose coccinella as an accessible, pivotal tool to reinvigorate this important line of work in any laboratory, without funding or access to technology being a discriminating factor.
Fly lines were maintained on a 12-hr light:12-hr dark (LD) cycle and raised on polenta and yeast-based fly media (agar 96 g, polenta 240 g, fructose 960 g, and Brewer’s yeast 1200 g in 12 l of water). Canton-Special (CS) D. melanogaster were used as the wild-type line for all experiments.
RdlA301S is derived from RdlMDRR (RRID:BDSC_1675), an Rdl allele isolated from a natural population in Maryland (Ffrench-Constant et al., 1990), and underwent isogenisation and selection on dieldrin to eliminate metabolic resistance while maintaining the dieldrin target-site resistance (Blythe et al., 2022). The RdlMDRR line was obtained from the Bloomington Drosophila Stock Center. For para: the L1029F mutation, located in the voltage-gated sodium channel Paralytic, has been extensively reported to confer resistance to DDT and pyrethroids in many other insect species (called kdr, knockdown resistance; reviewed in Arena et al., 1992). In the Drosophila gene, kdr maps to L1029F and is equivalent to the often-cited L1014F in other insects (e.g. Musca domestica; Dong, 2007). The kdr L1029F mutation in Drosophila Para was introduced via CRISPR/Cas9-mediated genome editing (see below). This genome-edited mutation conferred a resistance to DDT similar to that previously reported (Samantsidis et al., 2020).
CRISPR/Cas9-mediated genome editing was used to introduce the point mutation L1029F into the para-PBG isoform (CTT to TTT, L to F) by homology-dependent repair using one guide RNA and a dsDNA plasmid donor. The strategy design, molecular biology, and screening were completed by Well Genetics Inc, Taiwan (R.O.C.). The cassette PBacDsRed contains PiggyBac 3′ terminal repeats, the selection marker 3xP3-DsRed, and PiggyBac 5′ terminal repeats. The selection marker 3xP3-DsRed contains PiggyBac 3′ terminal repeats, the 3xP3 and hsp70 promoters, DsRed2, the SV40 3′UTR, and PiggyBac 5′ terminal repeats. The DsRed marker facilitates genetic screening and was excised by PiggyBac transposase. Only one TTAA motif was left after transposition, embedded in the mutated intron sequence, creating a G-to-A mutation at X:16,486,649 (X:16,486,649–X:16,486,646, CTAA to TTAA in the intron). CRISPR target site [PAM]: CACAAGATTGCCGATGACAA[CGG]. Guide RNA primers: sense oligo 5′-CTTCGCACAAGATTGCCGATGACAA and antisense oligo 5′-AAACTTGTCATCGGCAATCTTGTGC. Upstream homology arm: 1027 bp, +34,097 nt to +35,123 nt from the ATG of para; forward oligo 5′-GTTCACCAAACTCGGAATCG; reverse oligo 5′-GTGGCCAAGAAGAAGGGAAT. Downstream homology arm: 1022 bp, +35,128 nt to +36,149 nt from the ATG of para; forward oligo 5′-CCATGGCTTTAAGCATCGCA; reverse oligo 5′-TTATGACGGATACGGTTACGG. Synthesis fragment: 5′-GGTTGTCATCGGCAATTTTGTGgtgagtactcttatcgaactgctgacttgtaaacgatgtttactggctataatgctgacttatcgcct.
The Drosophila injection strain was white1118. 206 embryos were injected and 36 G0 crosses were established. Of 78 candidate crosses, 25 positive lines were identified in the F1 screen. Seven lines were validated by PCR and one line was sequenced, confirming no unexpected changes in para. Lines were isogenised and balanced. DsRed was excised using PiggyBac (PBac) transposase (Bloomington stock RRID:BDSC_8285). Excision was validated by genomic PCR and sequencing. The resulting lines were hemizygous viable. The line used in this study had the internal identifier 20256ex1.
The initial preliminary analysis was conducted using a ‘proof of principle’ panel of 12 compounds and a solvent control. These compounds were initially used to compare the video and ethoscope methods. After testing these initial compounds, the ethoscope methodology was found to be more successful, and the compound list was then expanded to 17 (including the control) using the ethoscope method only. As a final test, we included additional compounds at a single concentration, bringing the total to 40 (including the control), also for the ethoscope method. All insecticide compounds were supplied by Syngenta Ltd from their in-house stock (see Supplementary file 1 for a full list of compounds used). Compounds were received in solid form and diluted in solvent containing 5% ethanol (VWR, 20821), 5% acetone (Sigma, 179124), and 10% dimethylsulfoxide (D2650, Sigma) in distilled water to 1000 ppm initially, then further diluted in the solvent mixture to 100 and 1 ppm (where 1 ml/l = 1000 ppm). For insecticide assays, 0.5 ml of 5% sucrose (Sigma, S0389), 1% agarose (Sigma, A6236) solution was pipetted into each well and allowed to set. Following this, 2 µl of compound solution were placed on the surface and allowed to dry for 30 min or more. Male flies were then placed on the surface with a small glass cover slip placed on top (13 mm circular cover slip, VWR631-0150). Flies were briefly anaesthetised (<1 min) before being placed onto the surface of the plate. Once each well had been filled with a single male fly, arenas were placed into the ethoscope and recorded for a minimum of 2 days. All experiments were started between ZT0 and ZT1 and within 30 min of the flies being placed in the wells. For each compound in Figure 1c and Figure 1—figure supplement 1c, three repeats were done at different time points. For Figure 1d, two to three biological repeats were done per compound.
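The dilution step above follows the usual C1V1 = C2V2 relation; as an aside, it can be written as a trivial helper (purely illustrative, not part of the coccinella codebase):

```python
def dilution_volumes(stock_ppm: float, target_ppm: float, final_ml: float):
    """Volumes (ml) of stock and solvent needed to dilute a stock at
    `stock_ppm` down to `target_ppm`, using C1 * V1 = C2 * V2."""
    v_stock = final_ml * target_ppm / stock_ppm
    return v_stock, final_ml - v_stock
```

For example, reaching 100 ppm from a 1000 ppm stock in a 10 ml final volume requires 1 ml of stock and 9 ml of solvent.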
Ethoscope data were first processed in R using rethomics (Geissmann et al., 2019b). Each time series was exported (.csv) and converted using Python to individual time series in a file format (.dat) compatible with MatLab. A metadata (.txt) file served as a reference for each individual time series, with keywords outlining compound groups and concentrations for processing the data using HCTSA.
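The conversion step can be sketched as follows. This is a minimal sketch, not the script used in the study: the column names (`fly_id`, `max_velocity`, `treatment`) are hypothetical, and it assumes a long-format CSV export with one row per time bin:

```python
import csv
from collections import defaultdict
from pathlib import Path

def export_for_hctsa(csv_path, out_dir,
                     id_col="fly_id", value_col="max_velocity",
                     label_col="treatment"):
    """Split a long-format ethoscope export into one .dat file per fly
    (one value per line) plus a metadata .txt mapping each file to its
    treatment keyword, mirroring the layout HCTSA's importer expects."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    series, labels = defaultdict(list), {}
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            series[row[id_col]].append(row[value_col])
            labels[row[id_col]] = row[label_col]
    meta_lines = []
    for fly, values in series.items():
        fname = f"{fly}.dat"
        (out / fname).write_text("\n".join(values) + "\n")
        meta_lines.append(f"{fname}\t{labels[fly]}")
    (out / "metadata.txt").write_text("\n".join(meta_lines) + "\n")
    return sorted(series)
```

One .dat file per fly keeps each individual's series independent, so treatments can be regrouped later simply by editing the metadata keywords.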
Following this process, HCTSA feature extraction was performed on the time-series data (for Figure 1c, d and Figure 3c, d, Figure 1—figure supplements 1 and 2a). After the features were extracted, outputs of error-producing operations were removed and the data normalised using a sigmoidal transformation. HCTSA inbuilt functions were then used to classify the data using a linear SVM classifier, and a confusion matrix comparing the time series was generated. For some of the time-series data (that in Figure 2b, d, Figure 1—figure supplement 2b), Catch22, a smaller feature set of HCTSA, was used for feature extraction. Because of the smaller number of features used with this method, normalisation was not required before applying a linear SVM classifier to generate a confusion matrix comparing the results. Time series of 12 hr were used for the HCTSA analysis in Figure 1 and Figure 1—figure supplements 1 and 2. A length of 3 hr was used for the HCTSA analysis in Figure 3. All video data in Figure 2 are from time series of 15 min. By always running the full set of features in aggregate to train a classifier (e.g. TS_Classify in HCTSA), no post hoc correction is necessary because the trained classifier only ever makes a single prediction (only one test is performed).
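The classification step can be sketched as below. For portability this example substitutes a handful of simple summary statistics for the Catch22/HCTSA feature set (the actual pipeline uses the HCTSA/Catch22 operations), then trains a linear SVM and reports a cross-validated confusion matrix:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

def simple_features(ts):
    # stand-in for Catch22: a few cheap per-series summary statistics
    ts = np.asarray(ts, dtype=float)
    return [ts.mean(), ts.std(), np.abs(np.diff(ts)).mean(),
            (ts > ts.mean()).mean()]

def classify(series_list, labels, folds=5):
    """Cross-validated linear-SVM classification of labelled time series,
    returning the confusion matrix and overall accuracy."""
    X = np.array([simple_features(s) for s in series_list])
    y = np.array(labels)
    pred = cross_val_predict(LinearSVC(), X, y, cv=folds)
    return confusion_matrix(y, pred), float((pred == y).mean())
```

Because the classifier is trained once on the aggregated feature matrix and makes a single prediction per series, the accuracy can be read directly off the confusion matrix diagonal, as in the figures.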
Custom 3D-printed squares were designed using the online CAD software Onshape and printed on Ultimaker 2+ 3D printers in PLA plastic. For insecticide assays, 0.5 ml of 5% sucrose (Sigma, S0389), 1% agarose (Sigma, A6236) solution was pipetted into each well and allowed to set. Following this, 2 µl of compound solution were placed on the surface and allowed to dry for 30 min or more before male flies were placed on the surface with a small glass cover slip placed on top (13 mm circular cover slip, VWR631-0150). Flies were briefly anaesthetised (<1 min) before being placed onto the surface of the square. Once each well had been filled with a single male fly, squares were placed in the arena and a video was recorded for a minimum of 12 hr using an ELP 8-megapixel camera (IMX179 sensor) with a 2.8–12 mm variable-focus manual lens. All video recordings were started between ZT0 and ZT1. Recordings of flies exposed to compounds were done in a randomised manner. Video data were then broken down into shorter segments of 15 min for processing. The first 15 min following 1 hr of fly exposure to compound or control were used for pose extraction.
The use of DeepLabCut (version 2.1) followed the detailed protocol outlined by Nath et al., 2019. Briefly, frames for labelling were extracted from three representative videos using a K-means algorithm, and frames were labelled with 22 unique body parts (head, left eye, right eye, thorax top, thorax bottom, abdomen top, abdomen middle, abdomen bottom, left wing tip, right wing tip, left foreleg tip, left foreleg middle, right foreleg tip, right foreleg middle, left middle leg tip, left middle leg middle, right middle leg tip, right middle leg middle, left back leg tip, left back leg middle, right back leg tip, and right back leg middle). These frames were labelled locally with the DeepLabCut graphical user interface before the project file was uploaded to Google Drive for training and video analysis on Google Colab. The data were split 9:1 into training and test sets, and training was run for more than 150,000 iterations before the average Euclidean error between labels and predictions was computed. The model at the best-performing checkpoint was used to predict pose in novel videos. Following this, B-SOiD was used to de-structure behaviour from the DeepLabCut output and generate fly-specific time series of behavioural grammar, as in Hsu and Yttri, 2021.
Maximal velocity time-series data generated from recording flies exposed to either control conditions (no SD) or incrementally increasing immobility-triggered SD conditions were taken from the dataset generated by Geissmann et al., 2019a and analysed as above. Only data for female flies were included in this study, limited to a sample of 60 individuals per experimental group; wherever a group exceeded 60 individuals, 60 flies were chosen at random.
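The group-capping step can be sketched as a small helper. This is illustrative only (the exact sampling procedure and seed used in the study are not specified here); a fixed seed is added for reproducibility:

```python
import random

def cap_group(fly_ids, n=60, seed=0):
    """Randomly keep at most n individuals from an experimental group;
    groups at or below n are returned unchanged."""
    fly_ids = list(fly_ids)
    if len(fly_ids) <= n:
        return fly_ids
    return sorted(random.Random(seed).sample(fly_ids, n))
```

Capping every group at the same size keeps the classifier's training set balanced across SD regimes.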
Flies were fed with the specified compounds at the desired concentrations and concomitantly analysed in ethoscopes for 24 hr. Time of death was calculated as the last moment of detected motion.
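The time-of-death criterion above reduces to scanning the motion record for the last timestamp with detected movement; a minimal sketch (variable names are illustrative):

```python
def time_of_death(timestamps, moved):
    """Time of death, estimated as the last timestamp at which motion
    was detected; returns None if the fly never moved."""
    last = None
    for t, m in zip(timestamps, moved):
        if m:
            last = t
    return last
```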
A notebook version of the source code used to generate all figures is available on the Zenodo public repository, along with all the metadata and the raw data collected in this study (DOIs: 10.5281/zenodo.7335575 and 10.5281/zenodo.7393689). Data were analysed using rethomics (Geissmann et al., 2019b) and ethoscopy (Blackhurst et al., 2023). Two notebooks showing how to use the system for multiple uses are provided in Supplementary file 2.
Expression of a glutamate-activated chloride current in Xenopus oocytes injected with Caenorhabditis elegans RNA: evidence for modulation by avermectin. Brain Research. Molecular Brain Research 15:339–348. https://doi.org/10.1016/0169-328x(92)90127-w
The mode of action of isocycloseram: A novel isoxazoline insecticide. Pesticide Biochemistry and Physiology 187:105217. https://doi.org/10.1016/j.pestbp.2022.105217
High-throughput ethomics in large groups of Drosophila. Nature Methods 6:451–457. https://doi.org/10.1038/nmeth.1328
Chapter 4 - Feature selection and classification. In: Chatterjee S, Dey D, Munshi S, editors. Recent Trends in Computer-Aided Diagnostic Systems for Skin Diseases. Academic Press. pp. 95–135. https://doi.org/10.1016/B978-0-323-91211-2.00001-9
Isolation of dieldrin resistance from field populations of Drosophila melanogaster (Diptera: Drosophilidae). Journal of Economic Entomology 83:1733–1737. https://doi.org/10.1093/jee/83.5.1733
High-Throughput Recording, Analysis and Manipulation of Sleep in Drosophila. Imperial College London Press. https://doi.org/10.25560/69514
DeepLabCut: markerless pose estimation of user-defined body parts with deep learning. Nature Neuroscience 21:1281–1289. https://doi.org/10.1038/s41593-018-0209-y
Behavioral fingerprints predict insecticide and anthelmintic mode of action. Molecular Systems Biology 17:e10267. https://doi.org/10.15252/msb.202110267
Fast animal pose estimation using deep neural networks. Nature Methods 16:117–125. https://doi.org/10.1038/s41592-018-0234-5
The role of Rdl in resistance to phenylpyrazoles in Drosophila melanogaster. Insect Biochemistry and Molecular Biology 54:11–21. https://doi.org/10.1016/j.ibmb.2014.08.008
‘What I cannot create, I do not understand’: functionally validated synergism of metabolic and target site insecticide resistance. Proceedings of the Royal Society B 287:20200838. https://doi.org/10.1098/rspb.2020.0838
Revealing the structure of pharmacobehavioral space through motion sequencing. Nature Neuroscience 23:1433–1443. https://doi.org/10.1038/s41593-020-00706-3
In the current paper, Jones et al. describe a new framework, named "coccinella", for real-time high-throughput behavioral analysis aimed at reducing the cost of analyzing behavior. In the setup used here, each fly is confined to a small circular arena and able to walk around on an agar bed spiked with nutrients or pharmacological agents. The new framework, built on the researchers' previously developed Ethoscope platform, relies on relatively low-cost Raspberry Pi cameras to acquire images roughly every half second and pull out, in real time, the maximal velocity (parameter extraction) during 10-second windows from each video. Thus, the program produces a text file rather than voluminous videos requiring large storage facilities, a prohibitive step in many behavioral analyses. The maximal velocity time series is then fed to an algorithm called Highly Comparative Time-Series Analysis (HCTSA) (itself based on a large number of feature-extraction algorithms) developed by other researchers. HCTSA identifies statistically salient features in the time series, which are then passed on to a type of linear classifier called a support vector machine (SVM). In cases where such analyses are sufficient for characterizing the behaviors of interest, this system performs as well as other state-of-the-art systems used in behavioral analysis (e.g., DeepLabCut).
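The parameter extraction described here, one maximal velocity per 10-second window, can be sketched as follows. This is a simplified illustration with synthetic data, not the ethoscope's actual tracking code (which works on ellipse fits of the fly's body rather than a bare point):

```python
import numpy as np

def max_velocity_per_window(t, xy, window=10.0):
    """Collapse a position time series into the maximal instantaneous
    velocity observed in each consecutive `window`-second bin."""
    v = np.linalg.norm(np.diff(xy, axis=0), axis=1) / np.diff(t)  # px/s per step
    bins = (t[1:] // window).astype(int)   # which window each step ends in
    return np.array([v[bins == b].max() for b in np.unique(bins)])

# ~2.2 fps sampling: a fly drifting at 1 px/s, with one sudden 50 px dash
t = np.arange(0, 20, 0.45)
xy = np.column_stack([t.copy(), np.zeros_like(t)])  # x drifts with time
xy[25:, 0] += 50.0                                  # dash during second window
print(max_velocity_per_window(t, xy))
```

The reduction above is why the output is a compact text file: per fly, 20 seconds of video collapse to just two numbers.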
In a pharmacobehavior paradigm testing different chemicals, the authors show that coccinella can identify specific compounds as effectively as other, more time- and resource-consuming systems.
The new paradigm should be of interest to researchers involved in drug screens and, more generally, in high-throughput analyses focused on gross locomotor defects in fruit flies, such as the identification of sleep phenotypes. By extracting/saving only the maximal velocity from video clips, the method is fast. However, the rapidity of the platform comes at a cost: the loss of information on subtle but important behavioral alterations. When seeking subtle modifications in animal behavior, solutions like DeepLabCut, which are admittedly slower but far superior in terms of the level of detail they yield, would be more appropriate.
The manuscript reads well, and it is scientifically solid. The comments listed below were directed to the original submission and were satisfactorily addressed in the revised version.
1- The fact that Coccinella runs on Ethoscopes, an open source hardware platform described by the same group, is very useful because the relevant publication describes Ethoscope in detail. However, the current version of the paper does not offer details or alternatives for users that would like to test the framework, but do not have an Ethoscope. Would it be possible to overcome this barrier and have coccinella run with any video data (and, thus, potentially be used to analyze data obtained from other animal models)?
2- Readers who want background on the analytical approaches that the platform relies on following maximal velocity extraction will have to consult the original publications. In particular, the current manuscript does not provide much explanation of Highly Comparative Time-Series Analysis (HCTSA) or SVMs; this may be reasonable because the methods were developed earlier by others. While some readers may find that the lack of details increases the manuscript's readability, others may be left wanting more discussion of these not-so-trivial approaches. In addition, it is worth noting that the same authors who published the HCTSA method also described a shorter version, named catch22, that runs faster with similar output. Thus, explaining in more detail how HCTSA operates, considering it is a relatively new method, will make the method more convincing. https://doi.org/10.7554/eLife.86695.3.sa1
The following is the authors’ response to the original reviews.
We thank the editor and the reviewers for their very useful and constructive comments. We went through the list and gladly incorporated all their suggestions. The reviewers mostly pointed to minor revisions in the text, and we acted on all of those. The one suggestion that required major work was the one raised in point 13, about the processing pipeline being unconvincingly scattered between different tools (R → Python → Matlab). I agree that this was a major annoyance, and I am happy to say we have solved it by integrating everything into a recent version of the ethoscopy software (available on bioRxiv at https://www.biorxiv.org/content/10.1101/2022.11.28.517675v2 and in press at Bioinformatics Advances). End users will now be able to perform coccinella analysis using ethoscopy only, thus relying on nothing else but Python as their data analysis tool. This revised version of the manuscript now includes two Jupyter Notebooks as supplementary material with a “pre-cooked” sample recipe of how to do that. This should really simplify adoption and provides more details on the pipeline used for phenotyping.
Please find below a point-by-point description of how we incorporated all the reviewers’ excellent suggestions.
Recommendations for the authors: please note that you control which, if any, revisions to undertake
1. Line 38: "collecting data simultaneously from a large number of individuals with no or limited human intervention" is a bit misleading, as the conditions the individuals are put in are highly modified by humans and most times "unnatural". I understand the point that once the animals are placed in these environments, then recording takes place without intervention, but it would be nice to rephrase this so that it reflects more accurately what is happening.
We have now rephrased this into the following (L39):
Collecting data simultaneously from a large number of individuals, which can remain undisturbed throughout recording.
2. Line 63: please add a reference to the Ethoscopes so that readers can easily find it.
(2b) And also add how much they cost and the time needed to build them, as this will allow readers to better compare the proposed system against other commercially available ones.
This information is available on the ethoscope manual website (http://lab.gilest.ro/ethoscope). The price of one ethoscope, provided all necessary tools are available, is around £75, and the building time very much depends on the skillset of the builder and whether they are building their first ethoscope or subsequent ones. In our experience, building and adopting ethoscopes for the first time is no more time-consuming than building a (e.g.) DeepLabCut setup for the first time. We have added this information to L81
Ethoscopes are open source and can be manufactured by a skilled end-user at a cost of about £75 per machine, mostly building on two off-the-shelf components: a Raspberry Pi microcomputer and a Raspberry Pi NoIR camera overlooking a bespoke 3D-printed arena hosting freely moving flies.
3. Line 88: The authors describe that in the current setting, their system is capable of an acquisition rate of 2.2 frames per second (FPS). Would reducing the resolution of the PiCamera allow for higher FPS? I raise this point because the authors state that max velocity over a ten second window is a good feature for classifying behaviors. However, if animals move much faster than the current acquisition rate, they could, for instance, be in position X, move about and be close to the initial position when the next data point is acquired, leading to a measured low max velocity, when in fact the opposite happened. I think it would be good to add a statement addressing this (either data from the literature showing that the low FPS does not compromise data acquisition, or a test where increasing greatly FPS leads to the same results).
We have previously performed a comparison of data analysed using videos captured at different FPS, which is published in Quentin Geissmann’s doctoral thesis (2018, DOI: https://doi.org/10.25560/69514, chapter 2, section 2.8.3, figure 2.9). We have now added this work as one of the references at L95 (reference 19).
4. Still on the low FPS, would a Raspberry Pi 4 help with the sampling rate? Given that they are more powerful than the RPi3 used in the paper?
It would, but it would be a minor increase, going from 2.2 to probably 3-5 FPS. A significantly higher frame rate would be best achieved by lowering the camera’s resolution, as the reviewers suggested, or by operating offline. I think the interesting point being implied by the reviewers is that, for Drosophila, the current limits of resolution are more than sufficient. For other animals, perhaps moving more abruptly, they may not be. The reviewer is right that we should add a line of caveat about this. We now do so in the discussion, lines 215-224.
Coccinella is a reductionist tool, not meant to replace the behavioural categorization that other tools can offer but to complement it. It relies on Raspberry Pis as main acquisition devices, with associated advantages and limitations. Ethoscopes are inexpensive and versatile but have limitations in terms of computing power and acquisition rates. Their online acquisition speed is fast enough to successfully capture the motor activity of different species of Drosophila (28), but may not be sufficient for other animals moving more swiftly, such as zebrafish larvae. Moreover, coccinella cannot apply labels to behaviour (“courting”, “lunging”, “sipping”, “jumping”, etc.) but it can successfully identify large behavioural phenotypes and generate unbiased hypotheses on how behaviour – and a nervous system at large – can be influenced by chemicals, genetics, and artificial manipulations in general.
5. Along the same line of thought, would using a simple webcam (with similar specs to the PiCamera - ELP has cameras that operate on infrared and are quite affordable too) connected to a more powerful computer lead to higher FPS? - The reason for the question about using a simple webcam is that this would make your system more flexible (especially useful in the current shortage of RPi boards on the market) lowering the barrier for others to use it, increasing the chances for adoption.
Completely bypassing ethoscopes would require users to set up their own tracking solution, with a final result that may or may not match what we describe here. If greater temporal resolution is necessary, the easiest way to achieve more FPS would be to either decrease camera resolution or use the Pis to record videos offline and then process those videos at a later stage. The combination of these two would allow acquisition at 60 fps at 720p, which is the maximum the camera can achieve. We now made this clear at lines 83-92.
The temporal and spatial resolution of the collected images depends on the working modality the user chooses. When operating in offline mode, ethoscopes are capable of acquiring 720p videos at 60 fps, which is a convenient option with fast-moving animals. In this study, we instead opted for the default ethoscope working settings, providing online tracking and real-time parametric extraction, meaning that images are analysed by each Raspberry Pi at the very moment they are acquired (Figure 1b). This latter modality limits the temporal resolution of the information being processed (one frame every 444 ms ± 127 ms, equivalent to 2.2 fps on a Raspberry Pi 3 at a resolution of 1280x960 pixels, with each animal being constrained to an ellipse measuring 25.8 ± 1.4 x 9.85 ± 1.4 pixels - Figure 1a) but provides the most affordable and high-throughput solution, sparing the researcher from organising video storage or asynchronous video processing for animal tracking.
6. One last point about decreasing use barrier and increasing adoption: Would it be possible to use DeepLabCut (DLC) to simply annotate each animal (instead of each body part) and feed the extracted data into your current analysis with coccinella? This way different labs that already have pipelines in place that use DLC would have a much easier time in testing and eventually switching to coccinella? I understand that extracting simple maximal velocity this way would be an overkill, but the trade-off would again be a lowering of the adoption barrier.
It would certainly be possible to calculate velocity from the whole-animal pose measurement and then use this with HCTSA or Catch22, thus mimicking the coccinella pipeline, but it would definitely be overkill, as the reviewer correctly points out. Given that we are trying to make an argument about high-throughput data acquisition, I would rather not suggest this option in the manuscript.
7. Line 96: The authors state that once data is collected, it is put through a computational framework that uses 7700 tests described in the literature so that meaningful discriminative features are found. I think it would be interesting to expand a bit on the explanation of how this framework deals with multiple comparison/multiple testing issues.
We always use the full set of features in aggregate to train a classifier (e.g., TS_Classify in HCTSA), and that means no correction is necessary because the trained classifier only ever makes a single prediction (only one test is performed); so, as long as it is done correctly (e.g., proper separation of training and test sets, etc.), multiple hypothesis correction is not appropriate. This has been confirmed with the HCTSA/Catch22 author (Dr Ben Fulcher, personal communication). We have added a clarifying sentence about this to the methods (L315-318).
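The logic described in this response (one classifier trained on all features in aggregate, then evaluated once on a held-out split, so no multiple-testing correction applies) can be illustrated with synthetic data; scikit-learn's linear SVM stands in here for the actual HCTSA/Catch22 pipeline, and the feature matrix is invented for the sketch:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
# Synthetic stand-in for a feature matrix: 200 flies x 22 catch22-like
# features, with two treatment groups separated along a few features
X = rng.normal(size=(200, 22))
y = np.repeat([0, 1], 100)
X[y == 1, :3] += 1.5

# Proper train/test separation: the test set is touched exactly once
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          stratify=y, random_state=0)
clf = make_pipeline(StandardScaler(), LinearSVC())
clf.fit(X_tr, y_tr)          # all features used jointly, not one test each
print(clf.score(X_te, y_te))  # a single held-out evaluation
```

Because the features feed one model and only one prediction per sample is scored, there is no family of hypothesis tests to correct for.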
8. It would be nice to have a couple of lines explaining the choice of compounds used for testing and also why in some tests 17 compounds were used, while in others 40, and then 12? I understand how much work it must be in terms of experiment preparation and data collection for these many flies and compounds, but changing the compound panel without a more detailed explanation is suboptimal.
This is another good point. We have now added this information to the methods, in a section renamed “choice, handling and preparation of drugs” L280-285, which now reads like this:
The initial preliminary analysis was conducted using a group of 12 “proof of principle” compounds and a solvent control. These compounds were initially used to compare both the video method and the ethoscope method. After testing these initial compounds, it was found that the ethoscope methodology was more successful, and the compound list was then expanded to 17 (including the control) using the ethoscope method only. As a final test, we included additional compounds at a single concentration, bringing the total up to 40 (including the control), also for the ethoscope method.
9. Line 119 states: "A similar drop in accuracy was observed using a smaller panel of 12 treatments (Supplementary Figure 2a)". It is actually Supplementary Figure 1c.
Thank you for noticing that! Now corrected. The Supplementary figures have also been renamed to obey eLife’s expected nomenclature (both Figure 1 – Figure supplements)
10. In some places the language seems a little outlandish and should either be removed or appropriately qualified. a- Lines 56-59 pose three questions that are either rhetorical or ill-posed. For example, "...minimal amount of information...behavior" implies there is a singular response but the response depends on many details such as to what degree do the authors want to "classify behavior".
Yes, those were meant as rhetorical questions indeed, but we prefer to keep them in, because we are hoping to prompt this type of thinking in readers. These are concepts that may not be so obvious to someone who is just looking to apply an existing tool, and they may spark some reflection about what kind of data they really want/need to acquire.
b) Some of the criticisms leveled at the state-of-the-art methods are probably unwarranted because the goals of the different approaches are different. The current method does not yield the type of rich information that DeepLabCut yields. So, depending on the application DeepLabCut may be the method of choice. The authors of the current manuscript should more clearly state that.
In the introduction and discussion we do try to stress that coccinella is not meant to replace tools like DLC. We have now added more emphasis to this concept, for instance to L212:
[tools like DeepLabCut] are ideal – and irreplaceable – for identifying behavioural patterns and studying fine motor control but may be excessive for many other uses.
Coccinella is a reductionist tool not meant to replace the behavioural categorization that other tools can offer but to complement it
11. The application to sleep data appears suddenly in the manuscript. The authors should attempt, with a text change, to make a smoother transition from the drug screen to the investigation into sleep.
I agree with this observation. We have now added a couple of sentences to contextualise this experiment and hopefully make the connection appear more natural. Ultimately, this is a proof-of-principle example anyway, so hopefully the reader will take it for what it is (L169).
Finally, to push the system to its limit, we asked coccinella to find qualitative differences not in pharmacologically induced changes in activity, but in a type of spontaneous behaviour mostly characterised by lack of movement: sleep. In particular, we wondered whether coccinella could provide biological insights by comparing conditions of sleep rebound observed after different regimes of sleep deprivation. Drosophila melanogaster is known to show a strong, conserved homeostatic regulation of sleep that forces flies to recover, at least in part, the sleep lost, for instance after a night of forceful sleep deprivation.
(11b) Additionally, the beginning section of sleep experiments talks about sleep depth yet the conclusion drawn from sleep rebound says more about the validity of the current 5 min definition of sleep than about sleep depth. If this conclusion was misunderstood, it should be clarified. If it was not, the beginning text of the sleep section should be tailored to better fit the conclusion.
I am afraid we did not do a good job of explaining a critical aspect here: the data fed to coccinella are the “raw” activity data, in which we make no assumption about the state of the animal. In other words, we do not use the 5-minute threshold at this or any other point to classify sleep and wakefulness. Nevertheless, coccinella picks the 300-second threshold as the critical one for discerning the two groups. This is interesting because it provides a fully agnostic confirmation of the five-minute rule in D. melanogaster. We recognise this was not necessarily obvious from the text and have now added a clarification at L189-201:
However, analysis of those same animals during rebound after sleep deprivation showed a clear clustering, segregating the samples into two subsets with separation around the 300-second inactivity trigger (Figure 3d). This result is important for two reasons: on the one hand, it provides, for the third time, strong evidence that the system is not simply overfitting data of no biological significance, given that it could not perform any better than a random classifier on the baseline control. On the other hand, coccinella could find biologically relevant differences in rebound data after different regimes of sleep deprivation. Interestingly enough, the 300-second threshold that coccinella independently identified has a deep intrinsic significance for the field, for it is considered to be the threshold beyond which flies lose arousal response to external stimuli, defining a “sleep quantum” (i.e., the minimum amount of time required for transforming inactivity bouts into sleep bouts (23,24,28)). Coccinella’s analysis ran agnostic of the arbitrary 5-minute threshold and yet identified the same value as the one able to segregate the two clusters, thus providing an independent confirmation of the five-minute rule in D. melanogaster.
12. Line 227: (standard food) - please add a link to a protocol or a detailed description on what is "standard food". This way others can precisely replicate what you are using. This is not my field, but I have the impression that food content/composition for these animals makes big changes in behaviour?
Yes, good point. We have now added the actual recipe to the methods L240:
Fly lines were maintained on a 12-hour light: 12-hour dark (LD) cycle and raised on polenta and yeast-based fly media (agar 96 g, polenta 240 g, fructose 960 g and Brewer’s yeast 1,200 g in 12 litres of water).
13. Data acquisition and processing: please add links to the code used.
Both the code and the raw data used to generate all the figures have been uploaded to Zenodo and are available through its repository. Zenodo has a limit of 50 GB per uploaded dataset, so we had to split everything into two files with two DOIs, given in the methods (L356, section “code and availability” - DOIs: 10.5281/zenodo.7335575 and 10.5281/zenodo.7393689). We have now also created a landing page for the entire project at http://lab.gilest.ro/coccinella and linked that landing page in the introduction (L64).
13b) Also your pipeline seems to use three different programming languages/environments... Any chance this could be reduced? Maybe there are R packages that can convert csv to matlab compatible formats, so you can avoid the Python step? (Nothing against using the current pipeline per se, I am just thinking that for usability and adoption by other labs, the fewer languages, the better?)
This is a very important suggestion that highlights a clear limitation of the pipeline. I am happy to say that we worked on this and solved the problem by integrating the Python version of Catch22 into the ethoscopy software. This means the two now integrate, and the entire analysis can be run within the Python ecosystem. HCTSA does not have a Python package, unfortunately, but we still streamlined the process so that one only has to go from Python to Matlab without passing through R. To be honest, Catch22 is the evolution of HCTSA and performs really well, so I think that is what most users will want to use. We provide two supplementary notebooks to guide the reader through the process. One explains how to go from ethoscope data to an HCTSA-compatible mat file. The other explains how ethoscope data integrate with Catch22 and provides many more examples than the ones found in the paper figures.
14. There are two sections named "References" (which are different from each other) in the manuscript I received and also on bioRxiv. Should one of them be a supplementary reference list? Please correct it. I spent a bit of time trying to figure out why cited references in the paper had nothing to do with what was being described...
The second list of references actually applied only to the list of compounds in Supplementary Table 1. When generating a collated PDF, this appeared at the end of the document and created confusion. We have now amended the heading of that list to read more appropriately. https://doi.org/10.7554/eLife.86695.3.sa2
- Hannah Jones
The funders had no role in study design, data collection, and interpretation, or the decision to submit the work for publication.
We thank the Gilestro lab at Imperial College London and Robert Lind at Syngenta for useful discussions. Special thanks to Laurence Blackhurst for compiling the catch22 notebooks. HJ was supported by a BBSRC/CASE studentship in partnership with Syngenta (project reference BB/M011178/1/1958700). Stocks obtained from the Bloomington Drosophila Stock Center (NIH P40OD018537) were used in this study.
- Claude Desplan, New York University, United States
- John Ewer, Universidad de Valparaiso, Chile
You can cite all versions using the DOI https://doi.org/10.7554/eLife.86695. This DOI represents all versions, and will always resolve to the latest one.
© 2023, Jones et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.