Defining host–pathogen interactions employing an artificial intelligence workflow

Abstract

For image-based infection biology, accurate unbiased quantification of host–pathogen interactions is essential, yet often performed manually or using limited enumeration employing simple image analysis algorithms based on image segmentation. Host protein recruitment to pathogens is often refractory to accurate automated assessment due to its heterogeneous nature. An intuitive intelligent image analysis program to assess host protein recruitment within general cellular pathogen defense is lacking. We present HRMAn (Host Response to Microbe Analysis), an open-source image analysis platform based on machine learning algorithms and deep learning. We show that HRMAn has the capacity to learn phenotypes from the data, without relying on researcher-based assumptions. Using Toxoplasma gondii and Salmonella enterica Typhimurium we demonstrate HRMAn’s capacity to recognize, classify and quantify pathogen killing, replication and cellular defense responses. HRMAn thus presents the only intelligent solution operating at human capacity suitable for both single image and high content image analysis.

Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).

https://doi.org/10.7554/eLife.40560.001

Introduction

High content imaging (HCI) has revolutionized the field of host–pathogen interaction by allowing researchers to perform image-based large-scale compound and host genome-wide depletion screens in a high-throughput fashion (Brodin and Christophe, 2011; Mattiazzi Usaj et al., 2016). The majority of these screens assess host–pathogen interactions using bulk colorimetric or automated enumeration of pathogen growth at the population level (Ang and Pethe, 2016; Radke et al., 2018). Additionally, quantification of host–pathogen interaction (e.g. analysis of host protein recruitment to the pathogen) is often performed manually. However, to meaningfully dissect cell-mediated pathogen control, it is imperative to quantify the host response and pathogen fate at the single-cell level. Many open-source (e.g. CellProfiler; Carpenter et al., 2006) and proprietary (e.g. Perkin Elmer Harmony) analysis software packages have been developed and successfully employed for this purpose (Eliceiri et al., 2012; Stöter et al., 2013; Smith et al., 2018). To advance the state of the art in image analysis of host–pathogen interaction, cutting-edge machine intelligence algorithms (Simonyan and Zisserman, 2014; LeCun et al., 2015) need to be incorporated to stratify the image content without requiring users to program complex integrations. HRMAn relies on the same well-established image segmentation strategies as many other programs but offers an intuitive integration of deep learning for more complex image analysis. Solutions existing to date can be split into two major categories: user-friendly turn-key GUI (TK-GUI)-based solutions and script-ensemble (SE) solutions. Because of the large support burden of TK-GUI programs, they lack implementations of the latest engineering advances. SE solutions, by contrast, are easier to update but are far from user-friendly and are difficult to migrate between installations.

Deep neural network-based machine intelligence methods have brought about a revolutionary advance in the field of computer vision by allowing for learning of complex morphologies in a highly generalized fashion (Krizhevsky et al., 2012; LeCun et al., 2015). To date, these methods have not been adapted for the field of host–pathogen interaction. Typically, HCI-based fluorescent imaging data from a host–pathogen interaction experiment are analyzed by classical image segmentation (Osaka et al., 2012; Schmutz et al., 2013; Kühbacher et al., 2015; Ovalle-Bracho et al., 2015). Occasionally, segmentation combined with machine learning based on calculated features has been employed (Kreibich et al., 2015). Most of these analysis pipelines make use of open-source programs tailored with additional coding by the user to suit their specific needs and are not published in their final form as a universal open-source solution. A major shortcoming of these classical image segmentation and machine learning analysis methods is that they fail at the level of quantifying host protein recruitment to the pathogen. This is largely because traditional algorithms quantify phenotypes in a rule-based manner, using bulk statistical properties of microscopy images or their segments. Conversely, deep neural networks make use of complex patterns (e.g. shapes) within the dataset to learn phenotypes and their diversity. The neural network derives these patterns automatically from expert-labelled data. Thus, using pattern complexity to refine classification (Krizhevsky et al., 2012), deep neural networks improve the biological relevance of the phenotypic readouts.

While some proprietary solutions have been employed to extract host protein recruitment data, these are impractical and insufficient for most researchers as they are tied to single, expensive microscopes and do not operate at human capacity (Polajnar et al., 2017; Touquet et al., 2018). To date, artificial intelligence-driven automated analysis of host protein recruitment to pathogens is available neither as an open-source nor as a commercial package. Thus, there remains a need for an open-source, intuitive, flexible, and trainable host–pathogen interaction analysis software that performs at the level of human analytic capacity (Russakovsky et al., 2015; He et al., 2015; Haberl et al., 2018).

Here we present a high-throughput, high-content, single-cell image analysis pipeline that incorporates machine learning and a deep convolutional neural network (CNN) ensemble for Host Response to Microbe Analysis (HRMAn; https://hrman.org/). To assure its broad applicability to infection biology, HRMAn is based on the data integration environment KNIME Analytics Platform (Berthold et al., 2008). The analysis relies on training of machine learning algorithms and deep neural networks that can be tailored to individual researchers’ needs.

Results

Architecture of the high-content image analysis pipeline for analyzing host–pathogen interaction

The HRMAn pipeline (Figure 1) is designed to work with all file types acquired on any HCI platform or fluorescence microscope. Plate maps including experimental layouts, sample groups and replicates can be loaded, enabling HRMAn to automatically cluster results and perform error analysis. Once fed into the HRMAn pipeline, images are automatically pre-processed and clustered based on user-defined parameters (i.e. imaging specifications) and corrected for illumination. The subsequent image analysis proceeds in two stages. In stage 1, HRMAn segments images into pathogen and host cell features for single-cell analysis. It then classifies these features using a decision tree learning algorithm previously trained on an annotated dataset. In stage 2, HRMAn analyzes host cell features associated with the pathogens using a CNN, HRMAlexNet (derived from the AlexNet architecture), trained to distinguish complex phenotypic patterns of host protein recruitment (Krizhevsky et al., 2012). Robust classification is achieved by passing segmented regions of interest through multiple non-linear convolutional filters to identify characteristic phenotypic details.

Figure 1

Overview of the HRMAn pipeline.

Following image acquisition on a high-content imaging platform or any other fluorescence microscope, the images can be loaded into the HRMAn software. First, the data are pre-processed and clustered based on user-defined parameters and provided plate maps. Images then undergo illumination correction and automated segmentation using the Huang algorithm. Segmented images are used by a deep convolutional neural network (CNN) and other machine learning-based algorithms to analyze infection of cells with intracellular pathogens. Depicted is the CNN diagram representing two-dimensional convolutional filters with respective width, height and depth designated on the filter facets. The respective change of stride in the groups of hidden layers is depicted above the diagram, and the respective activation functions are depicted below it. Finally, the data are written as a single file that provides the researcher with more than 15 different readouts describing the interaction between pathogen and host cell during infection. HRMAn is based on the open-source data integration environment KNIME Analytics Platform, making it modular and adaptable to a researcher’s needs. The analysis is based on training of the machine learning algorithms, which gives it high flexibility and allows it to be tailored to the needs of the user.

https://doi.org/10.7554/eLife.40560.002

Finally, data are output as a single spreadsheet file providing the researcher with ≥15 quantitative descriptions of a pathogen and its interaction with host factors at population and single cell levels (Figure 1; Readouts). Importantly, by separating the analysis, HRMAn offers researchers the flexibility to perform fast, simple quantitative analysis of infection parameters using stage 1, without analyzing host protein recruitment.
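To make the stage 1 segmentation step more concrete, the following is a minimal Python sketch that thresholds a small grey-scale image and labels connected foreground regions. It is not HRMAn's implementation: a fixed, hand-picked threshold stands in for the automated Huang threshold, and the names `segment`, `threshold` and `toy_image` are hypothetical.

```python
from collections import deque

def segment(image, threshold):
    """Label 4-connected regions of pixels brighter than `threshold`.

    Returns a list of regions, each a list of (row, col) coordinates.
    The fixed threshold is an assumption standing in for an automated one.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

# Toy 5x6 image with two bright objects (illustrative values only).
toy_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 8, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0],
]
objects = segment(toy_image, threshold=5)
areas = [len(region) for region in objects]
```

Per-object features such as area could then be passed to a classifier, which is the role the decision tree plays in stage 1.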

Machine learning and a convolutional neural network drive classification of pathogen replication and host defense

To train for detection and analysis of host–pathogen interactions, HRMAn was provided an annotated dataset of host cells infected with an eGFP-expressing version of the parasite Toxoplasma gondii (Tg) and stained for various host cell features (Figure 2A) (Seibenhener et al., 2004; Clough et al., 2016). For stage 1 pathogen detection and enumeration training, a simple ML strategy, a decision tree, performed remarkably well. Over 35,000 Tg-vacuoles were analyzed by decision tree, gradient boosted tree, and random forest machine learning classification algorithms and cross-validated (Figure 2B). As all three performed equally well, a simple decision tree with Minimum Description Length (MDL) pruning, to limit overfitting, was employed for speed and accuracy of pathogen detection (>99.5% for Tg). Using these parameters, in addition to the readouts from stage 1 (see Figure 1), HRMAn detected and quantified Tg-containing vacuoles harbouring 1, 2, 4 or >4 fluorescent Tg (Figure 2C).
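As a rough illustration of what a decision tree learns, the sketch below fits the simplest possible tree, a depth-1 "stump", on a single numeric feature. It is a deliberately reduced stand-in and not the classifier used in HRMAn: it is binary rather than multi-class, it has no MDL pruning, and the feature (object area), the labels and the function name `fit_stump` are all assumptions made for the example.

```python
def fit_stump(values, labels):
    """Fit a depth-1 decision tree on one numeric feature.

    Tries every midpoint between consecutive sorted values and keeps the
    threshold with the fewest misclassifications, predicting True for
    values above the threshold.
    """
    ordered = sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors

# Hypothetical object areas (pixels) and expert labels: True = pathogen.
areas = [3, 5, 6, 20, 24, 30]
is_pathogen = [False, False, False, True, True, True]
threshold, errors = fit_stump(areas, is_pathogen)
accuracy = 1 - errors / len(areas)
```

A full decision tree repeats this kind of split recursively on several features; pruning then removes splits that do not justify their added complexity.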

Figure 2 (with 1 supplement)

Decision-tree and convolutional neural network training for pathogen replication and host defense protein recruitment analysis.

(A) Example images from one field of view. A composite image of all channels (blue: nuclei, green: Tg, red: Ubiquitin, grey: p62) and the single channel images are shown. Scale bar indicates a distance of 30 μm. (B) Training and cross-validation of different machine learning classification algorithms to predict parasite replication. (C) Example images of different vacuoles with the resulting classification of a trained decision tree classifier. Scale bar, 5 μm. (D) Resulting classification of the trained deep convolutional neural network (CNN) with example vacuoles. For the recruited classification a class activation map (CAM) is depicted to illustrate the focus of the CNN. (E) Decrease of negative log likelihood (NLL) used as loss function during CNN training over training cycles (epochs) for the *Toxoplasma gondii* model (left) and confusion matrix of *Toxoplasma gondii* model validation illustrating classification accuracy on labelled data unseen by the model (right). Classification accuracy (0 to 1) during validation is colour-coded blue to red and indicated in the figure.

https://doi.org/10.7554/eLife.40560.003

For stage 2, host protein recruitment, the CNN was trained for ubiquitin and p62 recruitment using segmented Tg vacuoles defined in Stage 1. Robust classification of host protein recruitment was achieved by passing these regions of interest through multiple non-linear filters to identify and differentiate between no recruitment, recruitment, and analysis artefacts (Figure 2D). Training over 80 epochs with negative log likelihood as a loss function, the deep CNN achieved 92.1% classification accuracy confirmed by expert-based cross-validation. Precision for ‘no recruitment’, ‘recruitment’, and ‘artefacts’ classes was 0.92, 0.92 and 0.71, while recall was 0.94, 0.89 and 1 respectively, hence achieving the accuracy of a human operator and far exceeding human capacity (Figure 2E).
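Precision and recall per class follow directly from a confusion matrix: precision is the diagonal entry divided by its column sum, and recall is the diagonal entry divided by its row sum. The sketch below shows the arithmetic on made-up counts; the numbers are not the paper's validation data, and the function names are hypothetical.

```python
def per_class_metrics(confusion, class_names):
    """Compute (precision, recall) for each class.

    `confusion[i][j]` counts objects whose true class is i and whose
    predicted class is j.
    """
    metrics = {}
    n = len(class_names)
    for k, name in enumerate(class_names):
        true_positive = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        metrics[name] = (precision, recall)
    return metrics

def overall_accuracy(confusion):
    """Fraction of all objects that lie on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Made-up counts for illustration; these are NOT the paper's data.
classes = ["no recruitment", "recruitment", "artefact"]
toy_confusion = [
    [90, 8, 2],
    [10, 85, 5],
    [0, 0, 20],
]
toy_metrics = per_class_metrics(toy_confusion, classes)
toy_accuracy = overall_accuracy(toy_confusion)
```

In this toy matrix every true artefact is recognised (recall of 1), but some objects from the other classes are also called artefacts, which lowers that class's precision.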

To ensure that uninvaded Tg parasites do not skew the data, stringent synchronization of infection by centrifugation and washing procedures was employed. In a pilot experiment (Figure 2—figure supplement 1), staining with the Tg vacuole marker GRA2 (Figure 2—figure supplement 1A–B) revealed that more than 98% of all parasites captured in the images had successfully invaded and established a PV, irrespective of the Tg strain used for infection (Figure 2—figure supplement 1B). Using a multiplicity of infection (MOI) of 3 for experiments resulted in up to 90% of host cells being infected with type I Tg and up to 80% with type II Tg (Figure 2—figure supplement 1C). In line with this, we often observed that a single host cell can contain more than one PV.

HRMAn allows for accurate high-throughput analysis of the host defense response to Toxoplasma

To demonstrate the ability of HRMAn and to expand how researchers define and classify host–pathogen interactions, the impact of IFNγ on Tg replication and ubiquitin/p62 recruitment to Tg vacuoles was analyzed (Figure 3).

Figure 3 (with 7 supplements)

Analysis of *Toxoplasma gondii* infection in IFNγ-treated HeLa cells.

HeLa cells were stimulated with 100 IU/mL IFNγ, infected with type I (RH) *Toxoplasma gondii* (Tg) and analyzed 6 hr post-infection. (A) Infection parameters depicted as total percent of Tg infected cells, the ratio of Tg vacuoles to cells and the ratio of parasites to cells. (B) Cellular readouts showing the proportion of cells that contain varying numbers of parasite vacuoles, the mean vacuole size of Tg and the vacuole position as the value of the mean Euclidean distance of Tg vacuoles to the host cell nucleus. (C) Replication capacity of Tg shown as the proportion of replicating parasites and the distribution of replicating Tg. (D) Cellular response to infection with Tg measured as the percentage of cells that decorate vacuoles, the average proportion of vacuoles per cell that are being decorated simultaneously, and the overall proportion of ubiquitin and/or p62 decorated Tg vacuoles. N shows the total number of vacuoles analyzed for each condition; percentages are indicated in the legend. (E) Properties of the host protein coat on Tg vacuoles as the average coat distance for ubiquitin and p62 to Tg and mean fluorescence intensity of ubiquitin and p62 at Tg vacuoles. (F) Fate of Tg vacuoles grouped based on host protein recruitment. The proportion of replicating parasites and the replication distribution based on recruitment status of the vacuole are shown. All data shown above represent the mean of N = 3 experiments ± SEM. Significance was determined using unpaired t-tests, n.s. = not significant, *p≤0.05; **p≤0.01, ***p≤0.001, ****p≤0.0001.

https://doi.org/10.7554/eLife.40560.005
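The summary statistics named in the legend (mean ± SEM of N = 3 experiments, unpaired t-test) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the values are invented, the pooled-variance (Student) form of the unpaired t-test is assumed because the legend does not say which variant was used, and only the t statistic and degrees of freedom are computed, not the p-value.

```python
import math
import statistics

def mean_sem(values):
    """Return the mean and the standard error of the mean."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem

def student_t(a, b):
    """Unpaired two-sample t statistic with pooled variance.

    Assumes equal variances in both groups (an assumption of this sketch).
    Returns the t statistic and the degrees of freedom.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, na + nb - 2

# Invented percentages from three hypothetical experiments per condition.
untreated = [60.0, 62.0, 64.0]
treated = [40.0, 42.0, 44.0]
untreated_mean, untreated_sem = mean_sem(untreated)
t_stat, dof = student_t(untreated, treated)
```

The p-value would then be read from the t distribution with `dof` degrees of freedom, for example with a statistics package.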

Previous reports indicate that HeLa cells restrict the growth of Tg through ubiquitination of parasitophorous vacuoles and subsequent non-canonical, p62-dependent autophagy (Selleck et al., 2015; Clough et al., 2016). HeLa cells infected with eGFP Tg ±IFNγ were fixed 6 hr post-infection (hpi) and stained with Hoechst (nuclei) and antibodies directed against ubiquitin and p62. A total of 1,350 4-colour images were acquired on an automated microscope and loaded into HRMAn for analysis.

HRMAn automatically detected and analyzed more than 15,000 HeLa cells resulting in 15 quantitative outputs of host–pathogen interaction (Figure 3). Population level readouts from stage one indicated that IFNγ treatment did not impact the percentage of infected cells but decreased the number of vacuoles within host cells as well as the number of parasites per cell (Figure 3A). As eGFP fluorescence is lost when parasites are killed, a reduction in the ratio between vacuoles and host cells serves as an indirect measurement for parasite killing. At the single cell level, HRMAn found that IFNγ treatment resulted in a significant reduction of vacuoles per cell and a minor reduction in mean vacuole size, without impacting vacuole position (Figure 3B). Concomitant with this reduction in vacuole size, both the percentage of replicating parasites, and the number of parasites per vacuole were significantly reduced by IFNγ treatment (Figure 3C). Thus, IFNγ−mediated control of Tg in HeLa cells involves both parasite-killing and restriction of Tg replication. Importantly, HRMAn offers a wide range of readouts in stage 1 analysis allowing for more detailed information on the dynamics of infection and clearance than typically seen with manual counting. To allow the user to decide which readouts are best suited to answer their specific research question some redundancy has been purposely built in (e.g. mean vacuole size vs. % Replicating). For example, here we focused on parasites per vacuole and the proportion of infected cells, as opposed to the number of individual vacuoles per host cell.
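The population-level readouts discussed above are simple ratios over per-cell records. The sketch below shows one way they could be computed; the data layout (one list of parasite counts per cell), the rule that a vacuole with more than one parasite counts as replicating, and the name `infection_readouts` are assumptions for illustration and are not taken from HRMAn.

```python
def infection_readouts(cells):
    """Summarise per-cell records into population-level readouts.

    Each cell is a list with one entry per vacuole, giving the number of
    parasites in that vacuole; an uninfected cell is an empty list.
    """
    n_cells = len(cells)
    n_infected = sum(1 for vacuoles in cells if vacuoles)
    n_vacuoles = sum(len(vacuoles) for vacuoles in cells)
    n_parasites = sum(sum(vacuoles) for vacuoles in cells)
    # Assumption: a vacuole holding more than one parasite has replicated.
    n_replicating = sum(1 for vacuoles in cells for p in vacuoles if p > 1)
    return {
        "percent_infected": 100 * n_infected / n_cells,
        "vacuoles_per_cell": n_vacuoles / n_cells,
        "parasites_per_cell": n_parasites / n_cells,
        "percent_replicating": (100 * n_replicating / n_vacuoles
                                if n_vacuoles else 0.0),
    }

# Five hypothetical cells: three infected, two uninfected.
toy_cells = [[1], [2, 1], [], [4], []]
readouts = infection_readouts(toy_cells)
```

Because the percentage of infected cells and the vacuole:cell ratio are computed separately, a treatment can lower the latter without changing the former, as reported for IFNγ here.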

In stage 2, analysis of the >25,000 vacuoles identified in stage 1 showed that the number of host cells with ubiquitin/p62-positive vacuoles and the percentage of ubiquitin/p62-positive vacuoles per host cell increased with IFNγ (Figure 3D). Distribution analysis indicated that in untreated cells, only 5.92% of vacuoles were decorated with ubiquitin, p62, or both. This number rose to 27.61% in IFNγ-treated cells, the majority of which (20.92%) were double-positive for ubiquitin/p62 (Figure 3D). By quantifying the radial fluorescence intensity distribution of these host factors, HRMAn revealed that ubiquitin was more closely associated with Tg vacuoles than p62 and that recruitment of both was increased by IFNγ treatment (Figure 3E). This is in agreement with the notion that p62 binds a ubiquitinated vacuole substrate through its UBA domain (Seibenhener et al., 2004; Clough et al., 2016). Finally, by analyzing vacuoles that recruit ubiquitin/p62, HRMAn indicated that restriction of Tg replication occurs in vacuoles decorated with these host defense proteins (Figure 3F). Collectively, these data indicate that in HeLa cells, IFNγ drives both parasite killing and recruitment of ubiquitin/p62 to Tg vacuoles, which acts to restrict parasite replication (Figure 3). The results demonstrate the capacity of HRMAn to provide a quantitative, multi-parametric readout of host–pathogen interaction at population and single-cell levels.
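A radial fluorescence intensity distribution can be sketched as the mean pixel intensity at each distance from the centre of a vacuole. The Python below is an illustrative sketch only: binning by integer-rounded distance and taking the radius of peak mean intensity as a proxy for "coat distance" are assumptions of this example, not a description of how HRMAn computes its readout.

```python
import math

def radial_profile(image, centre):
    """Mean intensity at each integer-rounded distance from `centre`."""
    cy, cx = centre
    sums, counts = {}, {}
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            radius = round(math.hypot(y - cy, x - cx))
            sums[radius] = sums.get(radius, 0) + value
            counts[radius] = counts.get(radius, 0) + 1
    return {r: sums[r] / counts[r] for r in sorted(sums)}

# Toy 5x5 crop: a dim centre surrounded by a bright one-pixel ring.
toy_crop = [
    [1, 1, 1, 1, 1],
    [1, 10, 10, 10, 1],
    [1, 10, 2, 10, 1],
    [1, 10, 10, 10, 1],
    [1, 1, 1, 1, 1],
]
profile = radial_profile(toy_crop, centre=(2, 2))
# Hypothetical proxy for coat distance: radius with the highest mean signal.
coat_radius = max(profile, key=profile.get)
```

Comparing such profiles between two channels would show which signal peaks closer to the vacuole centre.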

As a high-throughput, high-content analysis program, HRMAn removes experimental size constraints imposed by manual quantification. To illustrate this, HRMAn was used to systematically analyze the impact of IFNγ treatment on type I and type II Toxoplasma strains in five human cell lines: HeLa (cervical carcinoma epithelial), PMA-differentiated THP-1 (macrophage-like), A549 (lung carcinoma epithelial), HFF (primary fibroblasts), and HUVEC (primary endothelial cells) (Figure 3—figure supplements 1–7).

First, stage 1 HRMAn was used to ascertain the impact of varying concentrations of IFNγ (50–500 IU/ml) on Tg infection, killing, and replication (Figure 3—figure supplement 1). For each host cell line (Figure 3—figure supplement 1A), a dose-dependent reduction in Tg infection was seen (Figure 3—figure supplement 1B). Assessment of the vacuole:cell ratio and mean vacuole size indicated that THP-1s, HFFs, and HUVECs limit infection by IFNγ-dependent Tg killing, while HeLas and A549s do so by restricting replication (Figure 3—figure supplement 1C–D). Quantification of the number of parasites per vacuole indicated that HeLas and A549s acutely restrict type I and type II Tg replication at all concentrations of IFNγ (Figure 3—figure supplement 2B–C), while THP-1s, HFFs, and HUVECs are far more limited in this capacity (Figure 3—figure supplement 2A,D–E).

Next, HRMAn was employed on all 5 cell lines infected with either type I or type II Tg ±100 IU/ml IFNγ and immuno-stained for ubiquitin and p62. Figure 3—figure supplements 3–7 display the 15 quantitative readouts compiled by HRMAn of 9000 fields of view (~90 GB) and >175,000 vacuoles identified in stage 1. Taking advantage of the large-scale capabilities of HRMAn, we found that all host cell types can mediate IFNγ-dependent type I and II Tg killing (Figure 3—figure supplement 3B–C) and growth restriction (Figure 3—figure supplement 4A–B) to similar levels. Tg vacuoles show strain-dependent (A549, HUVEC) and strain-independent (HFFs) IFNγ-stimulated movement towards the nucleus (Figure 3—figure supplement 4C). HRMAn revealed that type II Tg grew slower than type I Tg in each host cell line and that their growth decreased more upon treatment with IFNγ (Figure 3—figure supplement 5A–B). Consistent with this, stage 2 HRMAn showed that all cell types could recruit ubiquitin and/or p62 equally well (Figure 3—figure supplement 6A), while a greater percentage of type II vacuoles per cell were decorated in response to IFNγ-priming (Figure 3—figure supplement 6B). The exception was THP-1 cells, which did not mount a strain-specific response (Figure 3—figure supplement 6B). Distribution analysis further indicated that THP-1s display a higher intrinsic capacity to decorate Tg vacuoles than other cell lines, even in the absence of IFNγ (Figure 3—figure supplement 6C). While no cell-type dependent differences in ubiquitin or p62 coat distance were observed (Figure 3—figure supplement 7A), THP-1s not only decorate vacuoles with more ubiquitin upon IFNγ stimulation, they also appear to recruit p62 in an IFNγ-independent fashion (Figure 3—figure supplement 7B). Decorated vacuoles in all host cell types displayed a greater ability to restrict the growth of type II versus type I Tg upon IFNγ treatment (Figure 3—figure supplement 7C–D). These results highlight the ability of HRMAn to provide high-throughput and quantitative single-cell analysis of host–pathogen interactions at a scale not achievable by automated bulk or manual quantification.

HRMAn can be adapted for bacteria-host interaction analysis

To demonstrate its flexibility, HRMAn was trained to recognize the bacterium Salmonella enterica Typhimurium (STm), a pathogen 16x smaller than Tg (0.5 μm vs. 8 μm), and then set to analyze the impact of IFNγ on bacterial killing, replication, and ubiquitin recruitment. Stage 1 outputs showed that, similar to Tg (Figure 3), IFNγ treatment in HeLa cells reduced the ratio of STm vacuoles/cell and the bacterial load, without impacting the percent of infected host cells (Figure 4A). At the single-cell level, HRMAn found a significant reduction in the number of STm vacuoles/cell, consistent with a reduction in vacuole size, percent of replicating bacteria, and numbers of STm/vacuole (Figure 4B–C). These results demonstrate that HeLa cells can control infection with STm through IFNγ-dependent bacterial killing and growth restriction.

Figure 4

Analysis of *Salmonella enterica* Typhimurium infection in IFNγ-treated HeLa cells.

HeLa cells were stimulated with 100 IU/mL IFNγ, infected with *Salmonella enterica* Typhimurium (STm) and analyzed 2 hr post-infection. (**A–C**) Stage 1 infection analysis parameters. (A) Infection parameters depicted as total percent of STm infected cells, the ratio of STm vacuoles to cells and the ratio of bacteria to cells. (B) Cellular readouts showing the proportion of cells that contain a certain number of bacterial vacuoles, the mean vacuole size of STm and the vacuole position as the value of the mean Euclidean distance of STm vacuoles to the host cell nucleus. (C) Replication capacity of STm shown as the proportion of replicating bacteria and the distribution of replicating STm. (D) Training of the deep convolutional neural network (CNN) to analyze host protein recruitment to STm vacuoles and bacteria. Left: Example images showing the difference of no recruitment versus ubiquitin (magenta) recruitment to STm. Middle: Decrease of negative log likelihood (NLL) used as loss function during CNN training over training cycles (epochs) for the STm model. Right: Confusion matrix of STm model validation; classification accuracy (0 to 1) during validation is colour-coded blue to red and indicated in the figure. (E) Cellular response to infection with STm measured through the percentage of cells that decorate vacuoles, the average proportion of vacuoles per cell that are being decorated simultaneously, and the overall proportion of ubiquitin decorated STm vacuoles. N shows the total number of vacuoles analyzed for each condition; percentages are indicated in the legend. (F) Properties of the host protein coat on STm vacuoles as the average coat distance for ubiquitin to STm and mean fluorescence intensity of ubiquitin. (G) Fate of STm grouped based on host protein recruitment. Shown is the proportion of replicating bacteria and the replication distribution based on recruitment status of the vacuole. All data shown above represent the mean of N = 3 experiments ± SEM. Significance was determined using unpaired t-tests, n.s. = not significant, *p≤0.05; **p≤0.01, ***p≤0.001, ****p≤0.0001.

https://doi.org/10.7554/eLife.40560.013

For stage 2, we used the Tg recruitment model as input to retrain HRMAn for quantification of ubiquitin recruitment to STm (Figure 4D). This allowed us to achieve 69.9% classification accuracy, confirmed by expert-based cross-validation, in just 40 epochs using 10-fold less non-augmented image data (Figure 4D). It is known that HeLa cells restrict STm growth by maintaining vacuole integrity; the small percentage of bacteria that escape vacuoles are decorated with ubiquitin and subsequently cleared by autophagy (Noad et al., 2017; van Wijk et al., 2017). Interestingly, stage 2 HRMAn showed that the percent of host cells that recruit ubiquitin to STm doubles upon IFNγ treatment, while the percent of decorated vacuoles/cell increases only slightly (Figure 4E). As seen with Tg (Figure 3E, Figure 3—figure supplement 7A), IFNγ does not impact the distance of the ubiquitin coat to STm but increases its thickness (Figure 4F). This indicates that more ubiquitin is recruited to cytosolic STm in the presence of IFNγ; growth of decorated bacteria was restricted (Figure 4G). Consequently, although IFNγ treatment increases the number of host cells that recruit ubiquitin to STm and the intensity of that recruitment, at the single-cell level HeLa cells appear to have reached their capacity for detection and autophagy-mediated clearance of cytosolic/ubiquitinated STm independent of IFNγ treatment (Figure 4E–G).

HRMAn’s versatility allows for rapid adaptation to study pathogen biology

To illustrate the versatility of HRMAn and the advantage of the modular architecture combined with the accessible user interface that comes with the KNIME Analytics platform, we performed experiments to stress HRMAn’s applicability and adaptability to study pathogen-driven parameters of infection. Using transgenic Tg lines expressing different parasite virulence factors, we were able to reproduce and expand upon published data (Virreira Winter et al., 2011). We confirmed that expression of ROP16 from type I Tg or lack of GRA15 in otherwise isogenic type II parasites (PruA7) reduces the recruitment of murine guanylate binding proteins (Gbps) to the vacuoles. Similarly, expression of type I ROP18 in type III parasites (CEP) also reduced recruitment of murine Gbps 1, 2 and 5 compared to isogenic type III parasites (Figure 5A). This analysis shows that HRMAn can be used to study effects of pathogen effector proteins on an established host phenotype.

Figure 5

HRMAn can be adapted to study pathogen biology.

(A) HRMAn-based quantification of Gbp recruitment to Tg vacuoles. Red lines show mean ± SEM of N = 3 experiments. (B) Quantification of Tg protein secretion and translocation to the host cell nucleus. HFF cells were infected with type I Tg expressing GRA16-HA or GRA24-HA or with type II Tg expressing TgIST-HA and fixed after 18 hr. Secreted proteins were visualized by staining with anti-HA (magenta) and Tg was stained with anti-SAG1 (green). Scale bar, 20 μm. Fluorescence signal in the host cell nucleus was quantified, correlated to the number of parasites per cell and normalized to the signal of uninfected cells. Overall number of analyzed cells are indicated in the graphs. Data represented as mean ± SEM of N = 3 experiments. Significance from one-way ANOVA comparing to the respective WT, n.s. = not significant, *p≤0.05; **p≤0.01, ***p≤0.001, ****p≤0.0001.

https://doi.org/10.7554/eLife.40560.014

Next, we asked whether HRMAn can accurately measure parasite effector proteins targeted elsewhere within infected host cells. Tg is known to secrete multiple effector proteins upon invasion and during replication. Examples include the parasite proteins GRA16, GRA24 and TgIST, which are secreted beyond the boundaries of the PV and subsequently translocated to the host cell nucleus (Bougdour et al., 2013; Braun et al., 2013; Gay et al., 2016). Using HRMAn, we were able to visualize accumulation of these three Tg effector proteins (tagged with an HA-tag) in the nucleus of the host cell. HRMAn further indicated that the levels of nuclear accumulation correlated with the number of parasites contained within the infected cell (Figure 5B). Thus, HRMAn can be employed to analyze both host and parasite parameters during infection in an unbiased, accurate and high-content manner.
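The normalisation and correlation described for Figure 5B can be illustrated with a short Python sketch. Everything here is an assumption for illustration: the values are invented, dividing by the mean signal of uninfected cells is one plausible reading of "normalized to the signal of uninfected cells", and a Pearson coefficient is used as a generic measure of correlation rather than the paper's stated method.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented nuclear fluorescence values (arbitrary units).
uninfected_signal = [100.0, 110.0, 90.0]
background = sum(uninfected_signal) / len(uninfected_signal)

parasites_per_cell = [1, 2, 4, 8]
nuclear_signal = [150.0, 200.0, 300.0, 500.0]

# Express each infected cell's signal as a fold change over uninfected cells.
normalised = [signal / background for signal in nuclear_signal]
r = pearson(parasites_per_cell, normalised)
```

In this toy dataset the signal rises linearly with parasite number, so the coefficient is 1; real data would scatter around such a trend.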

Discussion

Recent advances have made deep CNNs a powerful image analysis method (Simonyan and Zisserman, 2014; He et al., 2015; Ioffe and Szegedy, 2015; LeCun et al., 2015; Russakovsky et al., 2015; Haberl et al., 2018). Inspired by an abstraction of animal visual cortex architecture, CNNs are able to generalize patterns independent of minor phenotypic differences (Hubel and Wiesel, 1968; Matsugu et al., 2003). Combining automated image segmentation, machine learning and a deep CNN in an ensemble, HRMAn is a powerful open-source, user-friendly software for the analysis of host–pathogen interaction at the single-cell level. We based HRMAn on the KNIME Analytics Platform (Berthold et al., 2008). Being highly modular, GUI-based and user-friendly, HRMAn can rapidly be updated with the latest technological advances, yet remains transparent for the average user. Furthermore, the ready-to-use DL4J library modules we employed allow for incorporation of the latest advances in the field of artificial intelligence in a click-through manner with zero coding. To date, HRMAn represents the only open-source CNN-driven host–pathogen analysis solution for fluorescent images. While HRMAlexNet is a rather simple architecture, more complex architectures can be easily implemented through the recently introduced KNIME-Keras integration (Chollet, 2015), which may improve classification accuracy. This is important, as it moves the phenotype of host defense protein recognition of pathogens into the realm of HCI at the level of artificial intelligence and thus human accuracy and capacity. Many automated image analysis programs, some of which incorporate machine learning elements, have been developed and are successfully used for classical image segmentation (Supplementary File 1). However, when presented with the problem of classifying host protein recruitment to a pathogen, inaccurate classical image segmentation could lead to erroneous results. Employing an artificial intelligence algorithm, HRMAn circumvents these problems and delivers user-defined, automated and unbiased enumeration of this subset of the host–pathogen interplay.

Using Tg and STm infection models, we demonstrate that HRMAn is capable of detecting and quantifying multiple pathogen and host parameters. Importantly, we show that HRMAn can be adapted easily to two entirely different pathogens that not only differ in size by an order of magnitude, but also display distinct growth rates and infection dynamics. The easy adaptation of HRMAn to different pathogens and research questions will prove useful for any lab working in image-based infection biology. Designed for biologists, HRMAn requires no coding or specialized computer science knowledge. Its modular architecture and the use of KNIME, which provides a graphical representation of the analysis pipeline, allow users to tailor experimental outputs to their own datasets and questions. As the models we have generated can be used as primers to lower the training dataset size, computation power and training time requirements, HRMAn can be rapidly applied to similar large-scale, image-based experimental setups. As such, HRMAn will allow a broad range of researchers to extend into the realm of high-throughput, unbiased, quantitative single-cell analysis of host–pathogen interaction.

Author details

Daniel Fisch (contributed equally), Artur Yakimovich (contributed equally), Barbara Clough, Joseph Wright, Monique Bunyan, Michael Howell, Jason Mercer, Eva Frickel (for correspondence)
