Abstract
Numerous studies have shown that viral variants that elude the host immune response may incur a fitness expense, diminishing the survival of the viral strain within the host, and the capacity of the variant to survive future transmission events. Furthermore, co-occurring mutations outside the epitope regions targeted by the immune response may increase or decrease the likelihood of survival of the variant (known as epistasis). Analysis of viral fitness and epistasis over the non-structural protein regions is lacking for hepatitis C virus (HCV). Here, using a rare cohort of subjects very recently infected with HCV, we build upon our prior investigations by integrating mathematical modelling and experimental data to examine the interplay between the evolving transmitted/founder (T/F) viruses, the adaptive immune response, viral fitness, and co-occurring mutations. We show that viral fitness decreases during the first 90 days post-infection (DPI) associated with the magnitude of CD8+ T-cell responses and the initial level of diversification. Thereafter, viral fitness rebounds in a complex pattern of evolution characterized by multiple sets of co-occurring mutations. Finally, we show that an early and strong CD8+ T-cell response in the absence of neutralizing antibodies (nAbs) imposes a strong selective force on the T/F virus population, enabling the virus to escape and establish chronic infection. Understanding these dynamics is highly relevant for HCV vaccine design and supports a vaccine strategy that induces broad immunity targeting both T and B cell responses.
Introduction
Hepatitis C virus (HCV) is a major cause of chronic liver disease globally (1). Following acute infection, approximately 75% of people fail to clear the virus, resulting in chronic hepatitis with progressive fibrosis and ultimately cirrhosis, liver failure, and an increased risk of hepatocellular carcinoma (2–5). Despite the arrival of direct-acting antiviral agents (DAA) with remarkable efficacy, the unresolved financial and health service challenges in ensuring universal DAA access to those infected globally, and likelihood of reinfection in high risk populations, necessitate a prophylactic vaccine as an essential component for the WHO goal of global elimination of HCV infection as a public health threat(6).
Due to a high mutation rate during replication, the HCV genome is highly diverse, being classified into eight major genotypes (genotypes 1 to 8), and 86 subtypes (labeled alphabetically [a, b, c, etc.])(7, 8). HCV also exists within each infected host as a diverse, rapidly evolving population termed a quasispecies. Nevertheless, transmission of HCV is associated with a strong genetic bottleneck with as few as 1-3 transmitted/founder (T/F) viruses commonly establishing infection in the new host, despite hundreds of individual variants found in the source (9, 10). This is followed by a second genetic bottleneck at ∼100 days post infection (DPI), where T/F viruses become undetectable, but are replaced by new variants in those who progress to chronic infection, which carry mutations within B-cell and CD8+ T-cell epitopes (10). Indeed, clearance of HCV has been associated with an early onset of T/F-specific neutralizing antibodies (nAbs) (11–14), while early and strong CD8+ T-cell responses against the T/F virus have been found to drive the generation of immune escape mutations found within MHC-I restricted epitopes leading to progression to chronic infection (15, 16).
Numerous virological studies have shown that viral variants that elude the host immune response might incur a fitness expense, diminishing the survival of the viral strain, and the reducing the capacity of the variant to survive future transmission events as the variants mutate away from the T/F virus (17, 18). In HCV, the structural envelope proteins, E1 and E2, which are predominantly targeted by B-cell response including neutralizing antibodies (nAbs), have been found to collectively mediate viral fitness (19). This interdependence suggests that mutations in the E1 protein could potentially compensate for fitness deficits, thereby facilitating the virus’ ability to evade antibodies that specifically target the E2 protein (19). Furthermore, comparative analyses of E2 fitness in genotype 1a and genotype 1b sequences found that it may be easier for subtype 1b viruses to evade immune responses (20, 21). By contrast, a comprehensive analysis of the non-structural protein regions (NS1, NS2, NS3, NS4A, NS4B, NS5A, NS5B) in mediating viral fitness is lacking. For the scope of this study, a fitness model developed for HIV (22, 23) and modified for HCV (24) was used to estimate the fitness landscape of HCV subtypes. Sequences in clinical databases were used to represent the circulating viral variants with the assumption that commonly observed viral variants denoted higher fitness. This model was then used to infer the initial fitness landscape by providing the T/F virus and producing a fitness value. The fitness of the subsequent variants was then described as the relative fitness in comparison to the T/F virus, over the course of an infection.
Mutations that can incur a fitness expense can typically be alleviated or compensated by other mutations, making certain combinations of mutations beneficial (25). These co-occurring mutations can arise outside of the epitope regions and facilitate their compensatory effects, thus increasing the likelihood of survival of the variant (26). This phenomenon is called positive epistasis. However, epistasis may also be negative if a combination of mutations provides a smaller fitness gain than anticipated based on the additive effects of the individual mutations. Nevertheless, epistatic interactions in HCV have not been studied over acute and chronic phase sequences to infer HCV evolution over the course of the infection. Understanding viral fitness and epistatic interactions in HCV is critical in anticipating future substitutions that are antigenically significant, thus aiding in not only mapping the potential landscape of evolution, but also an improved strategic approach for the selection of potent vaccine strains.
Previously, we longitudinally deep sequenced HCV in 14 individuals, who were recently infected with HCV, to identify the T/F virus and the evolving quasispecies during the infection (15, 16). We identified Human Leucocyte Antigen class I (HLA-I) epitopes from the T/F viruses and the mutated variants. Experimental testing using IFN-γ ELISpot revealed that a strong CD8+ T-cell response was linked to rapid immune evasion (15, 16). Here, we build upon our prior investigations by integrating mathematical models and experimental data to examine the interplay between evolving T/F viruses, the adaptive immune response, viral fitness, and co-occurring mutations.
Results
Prediction, generation, and validation of HLA class I-restricted epitopes
A total of 14 subjects with primary HCV infection followed from pre-seroconversion until infection outcome were included in this study (Table 1). The first available viremic sample was collected at a median of 68 days post infection (DPI) (range 2 – 337 DPI). Of the 14 subjects included in this study, 8 subjects naturally cleared the infection (termed here clearers) and 6 subjects progressed to chronic infection (termed here chronic progressors). The median time to natural clearance was 163 DPI (range 69–357). Next generation sequencing (NGS) data were available for eight viremic time points from these subjects to study HCV RNA populations in depth. The T/F viruses were estimated from the distribution of variants at the earliest sampling time point as previously described (10, 11, 15). Thirteen of the 14 subjects had a single T/F virus, with only subject 300023 having two T/F viruses identified, as previously published (10, 11, 15). T/F genomes were utilized to predict HLA-I epitopes from the T/F viruses and the mutated variants (16). These epitopes were tested and validated for IFN-γ ELISPOT (Figure S1) (16).
To examine the role of CD8+ T-cell responses in driving viral evolution, longitudinal deep sequenced data was analyzed for fixation events as previously described (16). For those subjects who ultimately spontaneously cleared HCV infection, there were no non-synonymous fixation mutations (defined as having frequency of occurrence greater than or equal to 70% in the sequencing data) occurring within epitope regions (Table S1). Therefore, subjects who cleared infection were not included for further analysis of viral fitness and immune escape throughout this manuscript. From the 6 subjects who developed chronic infection, non-synonymous fixation mutations occurring within epitope regions were observed and peptide epitopes were selected (Table S2). A total of 494 out of 10945 (4.51%) potential epitopes across the six chronic subjects (45 - 100 epitopes per subject) were selected and tested for IFN-γ ELISPOT (Table S2, Figure S1). Of those selected for testing, 30 (6.68%) epitopes induced positive responses in the IFN-γ ELISPOT, and of these, 14 epitopes showed evidence of escape (as indicated by reduced recognition in the IFN-γ ELISPOT assay) (Figure 1, Table S2, Figure S1).
The location of these 14 epitopes associated with escape were mostly within the non-structural regions of the HCV genome, with 7 (50%) epitopes identified in NS3 region, 3 (21.4%) in NS5B and 2 (14.3%) in NS2. One (7%) epitope was identified in the Core region and one (7%) in E2 which were both found in subject 300256 (Table 2). Single fixation events were observed in the majority of these epitopes, with the exception of two epitopes (in subjects HOKD0485FX and 300256) where two fixation events occurred (Table 2).
CD8+ T-cell responses contribute to the diversification of the viral population
In these same subjects, we previously found that high magnitude of IFN-γ responses were associated with rapid viral immune escape (16). To expand on this data and determine whether viral diversity correlated with the rate of immune escape, Shannon entropy (SE) was calculated (10) across each protein region at the preceding time points to those identified to have fixation in T-cell targeted epitopes, and tested (Table 2) against the rate of escape (ɛ) estimated from the distribution of viral mutations in the NGS data. A significant positive correlation was found between SE and the rate of escape (Pearson’s correlation = 0.73, p-value = 0.01), indicating the higher diversity was associated with the occurrence of immune escape (Figure 1D).
To determine whether CD8+ T-cell pressure was driving genomic diversification before and after the genetic bottleneck, the average IFN-γ ELISPOT response within each subject was calculated across multiple epitopes in the same protein region at each time point. This data was then correlated with global SE at two periods, <90DPI and >90DPI. At >90DPI a significant positive correlation was found (Pearson’s correlation = 0.83, p-value = 0.01, Figure 1E), indicating that a stronger IFN-γ ELISPOT response was associated with higher diversity following the second genetic bottleneck. Interestingly, this relationship was not evident when examining time points at <90DPI (Pearson’s correlation = 0.33, p-value = 0.23, Figure 1F). When considering both pre- and post-90DPI periods, diversity appeared to be influenced by distinct selective pressures where the CD8+ T-cell response plays a significant role in shaping viral population diversity, along with the emergence of epitope escape variants at >90DPI.
Viral fitness decreases during the first 90DPI and is associated with the magnitude of CD8+ T-cell responses and the initial level of diversification
To understand how CD8+ T-cell responses influence the evolution of viral fitness in the acute phase of infection and how the viral population adapts to the host, a quantitative analysis of viral fitness in viral haplotypes on longitudinal samples was performed. We utilized a fitness model initially developed for HIV and adapted for HCV to estimate the fitness landscape of various HCV subtypes. This model was then applied to infer the initial fitness landscape by providing the T/F virus, which generated a fitness value. The fitness of subsequent variants was calculated as relative fitness compared to the T/F virus over the course of the infection. Specifically, fitness of the viral population was estimated for the six chronic progressors (Table 2) in regions NS2, NS3 and NS5B and for each epitope, the population average relative fitness of the viral population over the course of infection was measured (Table 3, Table 4, Table S3 – Table S6).
In general, in the six subjects who progressed to chronic infection, there was a decrease in average relative fitness in the period of <90DPI with respect to the T/F virus (Figure 2). This was with the exception of subject 300023 which can be explained by the presence of two T/F viruses (Figure 2A) where the observed mutations responsible for the increase in viral fitness were present in the genome of the second T/F virus detected at 44DPI carrying substitutions H2750Q and T2917A (Table S3, Figure S2, Figure 2A) (10, 11).
To understand the observed reduction in viral fitness within the first 90DPI, a statistical analysis was performed to examine the relationship between population fitness and immune escape (ɛ), IFN-γ ELISPOT responses and SE values (Figure 3). In the <90DPI period, a significant positive correlation (Pearson’s correlation = 0.84, p-value = 0.0004) was identified between population average relative fitness and IFN-γ ELISPOT response (Figure 3A). A significant negative correlation between fitness estimate and SE (Pearson’s correlation = -0.48, p-value = 0.01) was also identified in the <90DPI period (Figure 3B). No significant correlation was observed between ɛ and fitness estimates at <90 DPI (Pearson’s correlation = 0.34, p-value = 0.32) (Figure 3C).
Similar to the analysis performed at <90DPI, correlation tests were performed to understand the change in fitness at >90DPI (Figure 3). No significant correlations were observed between the average relative fitness estimates and IFN-γ ELISPOT (Pearson’s correlation = 0.55, p-value = 0.25, Figure 3D), SE (Pearson’s correlation = 0.26, p-value = 0.26, Figure 3E) and ɛ (Pearson’s correlation = 0.39, p-value = 0.31, Figure 3F).
Overall, these results suggest that in the early phase when the T/F virus was the major variant, a fit viral population was targeted by a strong CD8+ T-cell response. Mutations away from the T/F virus then reduced fitness. Insignificant correlations observed in >90DPI also confirmed that once the chronic phase population had adapted against host immune response, immune pressure showed less impact on viral population fitness and diversity when compared to <90DPI.
Viral fitness rebounds with co-occurring mutations after the second genetic bottleneck at >90DPI
After the second genetic bottleneck at >90DPI, when viral load begins to rise, the distribution of relative fitness continued to decrease in most genomic regions analyzed for all six chronic progressors (Figure 2). However, THDS1086MX and THGS0684MX showed a contrasting pattern where the relative fitness measured for the NS3 region exceeded the fitness measured for the T/F virus while the relative fitness decreased for the viral variants in comparison to the fitness of the T/F virus in the NS5B region (Figure 2E-F). Of note, THDS1086MX and THGS0684MX are purported to be recipients from the same donor and share an identical consensus sequence at <16DPI (11, 27). Furthermore, the pair carry the same HLA-A alleles but different HLA-B alleles (Table 1). We wanted to explore the NS3 regions of THDS1086MX and THGS0684MX to further understand the specific mutations contributing to their relative fitness rebound and to elucidate the mechanisms driving viral evolution and adaptation in these individuals.
Further analysis revealed a complex pattern of evolution characterized by multiple sets of co-occurring mutations (Figure 4, Table 3). In subject THDS1086MX, at 16DPI, viral haplotypes did not carry any co-occurring mutations. However, at 72DPI, a set of three mutations (H1115T, V1458A, C1318Y) were found to be co-occurring within the same viral variant, with its relative fitness estimated to be inferior to the T/F virus (relative fitness = 0.069) (Table 3). At 109DPI, multiple combinations of co-occurring mutations were observed. In particular, immune escape mutation 1406KLVAMG(I\L)NAV1415 was identified co-occurring with H1115S/G/T and V1112A/I/T. Variants carrying I1412L and H1115S had a relative fitness of at least one third of the T/F virus. The combination of 1406KLVAMG(I\L)NAV1415, H1115S and V1112A reached a fitness level nearly equal to the T/F variant (relative fitness of 0.91, Table 3). This suggested a 90% restoration of fitness compared to the T/F virus when H1115S and V1112A were combined with the immune escape mutation. Notably, a variant with only 1406KLVAMG(I\L)NAV1415 and H1115S exhibited positive epistasis, with a relative fitness of 0.384.
Viral variants with higher fitness than the T/F virus of subject THDS1086MX were observed at 198DPI (Figure 4, Table 3). One variant carried the epitope mutation 1406KLVAMG(I\L)NAV1415 with V1112I, H1115S and an additional mutation A1593V, achieving a relative fitness of 2.14 at a frequency of 12.6%. Another variant with only 1406KLVAMG(I\L)NAV1415 V1112I and H1115S, occurring at 73.6%, showed a relative fitness of 0.360. Additionally, a new immune escape mutation, A1409L, co-occurred with V1112P and H1115S at a frequency of 4.70% and exhibited a relative fitness greater than the T/F (relative fitness = 1.68). The combination of 1406KLV(A\T)MGINAV1415, V1112P and H1115S did not co-occur with 1406KLVAMG(I\L)NAV1415, V1112I and H1115S, indicating a diversifying strategy for host adaptation. Overall, the combination of I1412L, V1112I and H1115S emerged as the most advantageous mutation combination, enhancing virus fitness, and reaching fixation in the viral population over the course of infection.
For THGS0684MX, co-occurring mutations in the NS3 region occurred in a much simpler fashion (Table 4), where the immune escape mutation 1325SILGIGT(A\V)L1333 achieved a relative fitness of 1.042 when co-occurring with P1496S. The effect of the synergistic mutations was also reflected in the fact that this variant reached 90.9% frequency of occurrence in the viral population at the last sequenced time point of 184DPI (Figure 4, Table 4).
To determine if the increase in estimated fitness was associated with reversion events, namely mutations towards common circulating variants that are assumed to be a more fit HCV strain (generated as previously described (15)). Comparison of the non-synonymous mutations at the final sequence time point in both subjects THDS1086MX and THGS0684MX to the global Genbank genotype 1a consensus sequence were performed. This revealed that V1112P and A1593V (12.6% frequency) in subject THDS1086MX were both reversions toward the worldwide consensus (Figure 4, Table 3). Similarly, for subject THGS0684MX, the mutation A1332Y (90.9% frequency) also reverted to the global consensus strain (Figure 4, Table 4).
These findings suggest rapid adaptation of chronic HCV variants post-strong CD8+ T-cell responses during acute infection, supported by positive epistasis via compensatory mutations post-second genetic bottleneck. Variations between subjects with identical T/F viruses imply a unique fitness restoration strategy for each individual rather than a universal mutation pattern across subjects sharing the same T/F variant.
In chronic progressors, nAbs emerge after CD8+ T-cells and coincide with the rebound of viral fitness
Previously, we have shown that an early nAb response is associated with clearance (11). In those that develop chronic infection, nAb activity is truly delayed and develops towards longitudinal variants after the T/F is cleared (11). Furthermore, we have shown previously (15, 16) that CD8+ T-cell responses impose a strong selective force on the T/F virus population, thus contributing to its extinction and to the rise of the escape variants with the establishment of chronic infection. Here, a comprehensive analysis integrating nAb responses, CD8+ T-cell responses, and viral fitness was performed to elucidate the dynamic interplay between the adaptive immune system and viral evolution and fitness.
All six chronic subjects included in this study had data for nAb responses (11), IFN-γ ELISPOT responses and viral fitness readily available. A subset of three clearer subjects had data for nAb responses (11) and IFN-γ ELISPOT responses (Figure 5 and Figure S3). As mentioned above, viral fitness could not be estimated for clearer subjects due to a lack of non-synonymous fixation mutations occurring within epitope regions.
The timing of the first nAb responses and the IFN-γ ELISPOT response were assessed to elucidate the dynamics of both CD8+ T-cell and nAb responses and their role in clearance or chronic outcomes. In the subset of three clearer subjects (those tested for both nAb responses (11) and IFN-γ ELISPOT responses) no significant difference was found in the timing of nAb and IFN-γ ELISPOT responses, with both responses emerging in all three subjects at <100 DPI (Figure 5A). However, for chronic progressors, IFN-γ ELISPOT responses were first detectable at an average of 48 DPI (range 30 – 80 DPI) whereas nAb responses occurred significantly later, with an average onset of 465 DPI (range 74 – 945 DPI), indicating a notable delay compared to CD8+ T-cell responses (p = 0.0313, Wilcoxon matched-pairs signed rank test, Figure 5B).
Interestingly, in subjects THDS1086MX and THGS0684MX, we observed escape variants with higher relative fitness emerging concurrently with the onset of nAbs at 198 and 184 DPI, respectively (Figure 5E, F). Additionally, upon the appearance of nAbs, there was a decline in IFN-γ ELISPOT, while nAb responses remained stable. This suggests a transition from CD8+ T-cell dominant immune responses to nAb dominant immune responses. It’s noteworthy that a similar pattern was observed in subject 300023, where an increase in viral fitness coincided with a weak but detectable nAb responses towards the T/F virus at 74 DPI (Figure 5G). Nevertheless, nAbs were undetectable at the subsequent timepoint while the IFN-γ ELISPOT response dominated until 200 DPI when nAb responses reappeared and increased along with IFN-γ ELISPOT. This suggests that nAbs were ineffective in controlling the virus in subject 300023 during this period (<100DPI). For the other three subjects, no consistent pattern was observed (Figure S3). Specifically, for subjects HOKD0485FX and 300256, nAbs emerged well after viral decline, and there was a lack of CD8+ T-cell data at the positive nAb timepoints, while for subject 300240 both IFN-γ ELISPOT and nAb responses increased after 300DPI.
Together, these results show that an early and strong CD8+ T-cell response in the absence of nAbs imposes a strong selective force on the T/F virus population, enabling the virus to escape and establish chronic infection.
Discussion
Using a rare cohort of very recently infected individuals we report the dynamics of evolution of the viral quasispecies and the role of the host cytotoxic T-cell and nAb responses during the acute phase of primary HCV infection. We show that viral fitness decreases during the first 90DPI associated with the magnitude of CD8+ T-cell responses and the initial level of diversification. Thereafter, viral fitness rebounds in a complex pattern of evolution characterized by multiple sets of co-occurring mutations. Finally, within this cohort of very recently HCV infected individuals we show that an early and strong CD8+ T-cell response in the absence of nAbs imposes a strong selective force on the T/F virus population, enabling the virus to escape and establish chronic infection. Understanding these dynamics is crucial for developing effective vaccines for HCV.
We have previously demonstrated in HCV that a genetic bottleneck occurs in the viral population at around three months post-infection, where the T/F is replaced with new viral variants that dominate infection (10, 15). These new variants carry amino acid changes across the viral genome, including HLA-I restricted epitopes, highlighting that T-cell responses, in the absence of nAbs, play a significant role in driving immune escape. (10, 15, 16, 28, 29). Nevertheless, the majority of studies on HCV CD8+ T-cell responses to date focus only on the epitope and its escape variant without analyzing viral fitness or surrounding mutations which co-occur with the immune escape mutation (30). To understand this process, a quantitative analysis of viral fitness relative to viral haplotypes was conducted using longitudinal samples. We observed a decrease in population average relative fitness in the period of <90DPI with respect to the T/F virus in chronic subjects. The decrease in fitness correlated positively with IFN-γ ELISPOT responses and negatively with SE indicating that CD8+ T-cell responses drove the rapid emergence of immune escape variants, which initially reduced viral fitness.
While a recent study examined mutations in the E2 region, revealing that certain co-occurring mutations led to a loss of E2 function in clearers but a gain in function for chronic progressors (31), the impact of co-occurring mutations in the nonstructural regions on HCV outcomes remains largely unexplored (25, 30, 32, 33). Furthermore, previous studies on HCV have not integrated CD8+ T-cell responses, longitudinal tracking of immune escape mutations, co-occurring mutations, and fitness estimation to analyze HCV evolutionary mechanisms (34). Combining these approaches provides a clearer understanding of HCV evolution by showing how co-occurring mutations impact viral survival over time. We observed that in some subjects, combinations of mutations could compensate for the negative fitness cost of immune escape, leading to a rebound in viral fitness during infection. This was not observed in all subjects and indicates that each subject’s mutation strategy is unique, even when there are shared HLA alleles (such as for subjects THDS1086MX and THGS0684MX).
When analysis of nAb responses, CD8+ T-cell responses, and viral fitness was performed, it was interesting to note the opposite trend with regards to the humoral response in these same subjects. While it remains unclear how T- and B-cell components of the immune system might co-operate to confer protection, the findings of the current study strongly suggest that the absence of nAbs in chronic progressors enables viruses to continue to infect new cells, replicate and mutate. This suggests a synergistic interplay between the B and T-cell arms of the adaptive response targeting the virus and the importance of co-occurrence of these responses. Several studies have alluded to this (11, 15, 16) yet to date no study has directly compared B and T-cell responses longitudinally in very recently infected clearers and chronic progressors. Furthermore, in chronic subjects where relative fitness increased after the genetic bottleneck, this increase coincided with the onset of nAbs (11, 31). The fitness model in this study serves as a surrogate for viral fitness, starting with the T/F virus as the baseline and calculating the fitness of subsequent variants relative to this initial measure, enabling us to track fitness changes throughout the infection. Additionally, in these subjects, the appearance of nAbs was associated with a decline in IFN-γ ELISPOT responses, while nAb responses remained stable, suggesting a shift from CD8+ T-cell responses to nAb responses.
In conclusion, this work provides initial insights into how co-evolving mutations contribute to the evolutionary dynamics observed during HCV infection. Furthermore, we show that an early and strong CD8+ T-cell response in the absence of nAbs imposes a strong selective force on the T/F virus population, enabling the virus to escape and establish chronic infection. This could explain the lack of association with a lower incidence of chronic HCV infection compared to placebo observed in the recent Phase I/II trials of the ChAd3-NSmut and MVA-NSmut vaccines, which aimed to induce T cell responses by encoding HCV NS proteins (35). While much work is still required to be able to fully understand the factors contributing to the overall dynamics of HCV infections, this work has the potential to inform design of the next generation of vaccines.
Methods
Subjects and samples
Samples were available from two prospective cohorts of HCV-seronegative and aviremic, high-risk, individuals enrolled in the Hepatitis C Incidence and Transmission Studies (HITS) in prisons (HITS-p), or in the general community (HITS-c), as previously described (11, 27, 36, 37). Risk behavior data and blood samples were collected every six months to screen for HCV RNA positivity and seroconversion. Participants identified as early incident cases were enrolled into a sub-cohort called HITS-incident (HITS-i) upon detection of new onset viremia without antibodies, as described previously (11). These subjects were then frequently sampled for 12 weeks before being offered antiviral treatment at 24 weeks, allowing for the classification of outcomes as either natural clearance or chronic infection. The date of infection was estimated by subtracting the average HCV pre-seroconversion window period, which has been estimated at 51 days from the midpoint between last seronegative and first seropositive time points as previously described (10, 11, 38, 39). Molecular HLA typing was performed at the Institute of Immunology and Infectious Diseases in Perth, Australia, using second-generation sequencing of HLA-A/B/C genes as previously described (40). 14 subjects with stored peripheral blood mononuclear cells (PBMCs) were selected from the HITS-i cohort. This group included 8 subjects with acute HCV infection and 6 subjects who developed chronic infection.
PBMCs were isolated from peripheral blood via density-based centrifugation, and resuspended in RPMI (Sigma-Aldrich, MO) supplemented with penicillin, streptomycin, l-glutamine, and fetal bovine serum at 1 × 106 cells/mL.
Viral sequencing and identification of T/Fs and longitudinal variants
Near full length HCV genome amplification and sequencing datasets of the full-length viral genomes at multiple timepoint have been published previously (10, 11, 15, 16, 41, 42). Briefly, near full length HCV genome amplification was performed using an nRT-PCR (41). Illumina (MiSeq Benchtop sequence, San Diego, USA) sequencing were performed on amplicons from longitudinal time points of the 14 subjects (11, 16, 41). A bioinformatics pipeline was used to clean and align reads generated from next generation sequencing (NGS) (10, 11, 15). To accurately detect single nucleotide polymorphisms (SNPs) from the aligned and cleaned sequences, analysis was performed to correct for random technical errors using the software ShoRAH, LoFreq and Geneious (11, 43, 44). Haplotypes were reconstructed from NGS data across the full genome using ShoRAH and QuasiRecomb (43, 45). As described previously(10, 11, 15, 16), to determine T/Fs, a statistical model, Poisson fitter, as well as phylogenetic analysis was applied to haplotypes from the first available viremic timepoint but prior to the first HCV antibodies (seroconversion) detection, for each subject to determine the T/F (46). Shannon Entropy was determined from the NGS data as previously described (10).
CD8+ T-cell epitope selection
For each of the 14 subjects, non-synonymous fixation events (>70% frequency in the viral population) from the last time point sequenced were substituted into the/TF sequence, thus generating a modified sequence termed the fixation sequence.
Identification of MHC-Class I restricted epitopes (9 to 10 amino acids) was performed by analysis of both HCV/TF and fixation sequences in conjunction with MHC Class I genotyping to predict epitopes and then compared with previous experimentally confirmed epitopes from the Immune Epitope Data Base (IEDB [www.iedb.org]). This resulted in each prediction list containing at least 10,000 epitope predictions.
Due to the large number of epitopes and the lack of available PBMCs, a bioinformatics pipeline, iedb_tool, was developed in Python 2.7 to identify a subset of epitopes for further experimental and bioinformatics analyses. This software package is freely available and can be downloaded at: https://github.com/PrestonLeung/IEDBTool2. The predicted epitopes generated from IEDB were downloaded in plain text format and parsed into iedb_predictionParser.py with options -lower 100.0 to extract relevant subject information (HLA genotypes, predicted epitope sequence, start and end positions of the epitope sequence). The predicted data set was then used with the experimentally validated epitopes from IEDB. This step generated a categorized list of ranked epitopes for which predicted epitopes and HLA-I typing for each subject were matched with experimentally validated epitopes. An 80% homology threshold was set to facilitate small differences between the autologous epitope sequence and those that were previously reported from the IEDB database.
We prioritized epitopes that matched the HLA allele of the subject and classified these into three categories. Category 1 included epitopes for each subject from which the mutated variant was no longer predicted to be recognized, indicating a potential escape variant. Category 2 included epitopes from those that underwent escape, but the variant was still predicted to be recognized by CD8+ T-cells and category 3 included epitopes that did not mutate during infection. A threshold of 10th percentile ranking or higher was used as a cut off for epitope selection. For categories 2 and 3, we selected a maximum of 10 epitopes due to limitations in PBMC samples. An upper limit of 100 epitopes was set for each subject to account for the total PBMCs available for each time point in the ELISpot matrix testing approach (see below). If the above criteria gave a total number of epitopes less than 40, then epitopes from categories 2 and 3 were searched with a lowered threshold of 50th percentile ranking or higher to increase the sample size to at least 40. This refined list of ranked epitopes we refer to throughout the manuscript as “potential epitopes”.
IFN-γ ELISPOT validation
Potential epitopes generated above were analyzed using matrix IFN-γ ELISPOT assays as previously described (15, 16). Briefly, peptides were synthesized by Mimotopes Australia and used in autologous CD8+ T-cell ELISpot assays in pools of ≤ 5 peptides per well in a matrix format. The peptides were pooled such that any two wells would only have one common peptide and each peptide was tested in at least two separate wells.
As previously described by us (15, 16), PBMC were added to a 96-well ELISpot plate (MAIPS; Millipore, USA) precoated with gamma interferon (IFN-γ) (Mabtech, Sweden) at a concentration of 150,000 cells per well and incubated overnight with peptides (final concentration of 10µg/ml). Plates were read and analyzed using an AID plate reader (Autoimmun Diagnostika GmbH, Germany). A peptide/well was concluded to be a positive if it demonstrated an interferon-γ (IFN-γ) response ≥ 25 spot-forming units per million cells (SFU/million) and a negative control count of zero. A confirmation experiment using the IFN-γ ELISpot assay was conducted for individual positive peptides using one peptide per well in a concentration of 200,000 cells per well. At least one well per study subject was allocated as a positive control (anti-CD3 antibody; Mabtech, Sweden), and three wells were used as negative controls. Background was defined as the mean plus 3 standard deviations of the number of spots counted across the three negative controls.
Estimating the rate of CD8+ T-cell epitope escape
The estimation of the rate of CD8+ T-cell epitope escape was performed as previously described(15, 16). Briefly, the kinetics of CD8+ T-cell escape was estimated with a simple population dynamics model (15, 16, 47). The model predicts that the frequency of the escape variant f(t) is: f(t)= f0/f0+(1−f0)e−kt, where k is the rate of escape and it is assumed that the escape variant is present at a low frequency beyond the range of detection during the initial phase of time (t) at zero, and its frequency is given by f0. The escape variant was defined as any observed mutations away from the wild-type sequence that consequently becomes fixed in the population. Therefore, the frequency of the escape variant at individual longitudinal time point was the total frequencies of each epitope carrying the escape mutation regardless of the other amino acid positions. In some cases where the escape variant was not observed in the earlier time points, an estimate of 1/(n+1) replaces a frequency of 0, where n is the average coverage of the corresponding time point from the deep sequencing data. This model was used to estimate the rate of escape only in those cases with experimental evidence (ELISpot data generated in this study) of CD8+ T-cell-driven immune escape. Estimates were performed in R using non-linear least-squares approach for non-linear models (48).
Estimating survival fitness of viral variants
Estimates of the viral fitness were performed on each viral variant reconstructed by QuasiRecomb as previously described (22–24). Specifically, haplotype reconstruction was performed longitudinally for each subject in a region of approximately 600 amino acids surrounding the experimentally confirmed CD8+ T-cell epitopes. The fitness landscape inference is obtained by estimating the parameter of a function representing the energy of the viral population and with :
where is the peptide sequence of length m such that describes individual amino acids in is the prevalence fitness taken under the assumption that the frequently occurring variants represent the fitter strain (i.e. the common circulating virus, in this case the original T/F virus), and this mathematically defined fitness correlate positively to the actual viral fitness (22). is the ‘energy’ of peptide a pseudo measurement for fitness and z is a normalization factor described in the following equation.
The parameter hi(zi) is the contribution of energy of the single amino acid at position i of specifies the energy contribution associated with pairwise interactions between two amino acids in at positions i and j. Under the model assumptions, the association between measurement of energy and fitness estimate showed a negative correlation where minimal maximizes the logarithm of the viral fitness (24).
A relative fitness estimate was also calculated for reconstructed haplotypes by normalizing individual fitness estimates with respect to the fitness of the haplotype corresponding to the T/F variant. These values were used to represent the viral fitness of individual haplotypes to infer viral evolution dynamics with respect to the T/F virus.
This analysis was used to quantify the selection exerted by the host cytotoxic T cell response in driving the evolution of the T/F virus. The analysis also showed the role of specific co-evolving mutations and the identification of epistatic interactions that may determine the evolution of viral fitness and the overall success of the infection.
HCVpp production, infection and neutralization
As previously described by us, E1E2 glycoproteins were cloned and co-transfected with MLV gag/pol and luciferase vectors in 293T-cells to produce HCV-pseudo-particles (pp)(11, 49, 50). Briefly, neutralization assays were performed by incubating HCVpp with heat inactivated plasma for one hour at 37 °C. This mixture was applied directly to Huh-7.5 hepatoma cells (Apath, L.L.C, New York, NY, USA), and incubated for 72 hours after which luciferase activity was measured (11). Neutralization of HCVpp was calculated using the formula: % inhibition = 1 − (inhibited activity)/(‘normal’ activity)] × 100, after subtraction of negative control (pseudo-particle generated without glycoproteins) RLU, where normal activity is HCVpp incubated with plasma from a healthy donor. The 50% ID50 titer was calculated as the nAb concentration that caused a 50% reduction in RLU for each plasma/HCVpp combination tested in neutralization. All samples were also tested for neutralizing activity on control pseudo-particle VSV-G, to determine that nAbs were HCV E1E2 specific. All data were fitted using non-linear regression plots (GraphPad, Prism).
Data visualization and statistical analysis
Data visualization for CD8+ T-cell analyses were performed using ggplot2 package (51) in R (48). Visualization of viral fitness estimates and longitudinal neutralization and CD8+ T-cell data was performed using GraphPad Prism version 10.0 for Windows, GraphPad Software, La Jolla California USA (www.graphpad.com). As were comparisons of timing of nAb and CD8+ T-cell responses where Wilcoxon matched-pairs signed rank tests were performed. Sequence alignments for highlighter plots were performed on the variants generated here using Geneious Prime 2023 (https://www.geneious.com).
Ethics statement
Human research ethics approvals were obtained from Human Research Ethics Committees of Justice Health (reference number GEN 31/05), New South Wales Department of Corrective Services (05/0884), and the University of New South Wales (05094, 08081), all located in Sydney, Australia. Written informed consent was obtained from the participants. All methods were performed in accordance with the relevant guidelines and regulations.
Data availability
Single-cell RNA seq data are available on Accession number GSE196330. All the single-cell RNA-seq data are deposited with GSE196330.
Acknowledgements
We would like to acknowledge Gregory R. Hart and Prof. Andrew L. Ferguson for their assistance with the fitness models.
Supporting information
References
- 1.Hepatitis C: global epidemiology and strategies for controlClin Microbiol Infect 22:833–8
- 2.Natural history of chronic hepatitis CHepatology 36:S35–46
- 3.The global burden of hepatitis CLiver Int 29:74–81
- 4.Spontaneous viral clearance following acute hepatitis C infection: a systematic review of longitudinal studiesJ Viral Hepat 13:34–41
- 5.Acute hepatitis C: current status and remaining challengesJournal of hepatology 49:625–33
- 6.Designing a B Cell-Based Vaccine against a Highly Variable Hepatitis C VirusFront Microbiol 8
- 7.Expanded classification of hepatitis C virus into 7 genotypes and 67 subtypes: updated criteria and genotype assignment web resourceHepatology 59:318–27
- 8.Identification of 19 Novel Hepatitis C Virus Subtypes-Further Expanding HCV ClassificationOpen Forum Infect Dis 6
- 9.Single-Genome Sequencing of Hepatitis C Virus in Donor-Recipient Pairs Distinguishes Modes and Models of Virus Transmission and Early DiversificationJournal of virology 90:152–66
- 10.Sequential bottlenecks drive viral evolution in early acute hepatitis C virus infectionPLoS Pathog 7
- 11.Clearance of hepatitis C virus is associated with early and potent but narrowly-directed, Envelope-specific antibodiesSci Rep. 9
- 12.Selection pressure from neutralizing antibodies drives sequence evolution during acute infection with hepatitis C virusGastroenterology 136:2377–86
- 13.Characterization of host-range and cell entry properties of the major genotypes and subtypes of hepatitis C virusHepatology 41:265–74
- 14.Clearance of hepatitis C infection is associated with the early appearance of broad neutralizing antibody responsesHepatology 59:2140–51
- 15.Transmitted/Founder Viruses Rapidly Escape from CD8+ T Cell Responses in Acute Hepatitis C Virus InfectionJ Virol 89:5478–90
- 16.Identification of human progenitors of exhausted CD8(+) T cells associated with elevated IFN-gamma response in early phase of viral infectionNat Commun 13
- 17.The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virusProc Natl Acad Sci U S A 101:8396–401
- 18.Infecting HIV-1 Subtype Predicts Disease Progression in Women of Sub-Saharan AfricaEBioMedicine 13:305–14
- 19.HCV E1 influences the fitness landscape of E2 and may enhance escape from E2-specific antibodiesVirus Evol 9
- 20.Identifying immunologically-vulnerable regions of the HCV E2 glycoprotein and broadly neutralizing antibodies that target themNat Commun 10
- 21.Evolutionary modeling reveals enhanced mutational flexibility of HCV subtype 1b compared with 1aiScience 25
- 22.Translating HIV sequences into quantitative fitness landscapes predicts viral vulnerabilities for rational immunogen designImmunity 38:606–17
- 23.Relative rate and location of intra-host HIV evolution to evade cellular immunity are predictableNat Commun 7
- 24.Empirical fitness models for hepatitis C virus immunogen designPhys Biol 12
- 25.Coordinated evolution of the hepatitis C virusProc Natl Acad Sci U S A 105:9685–90
- 26.Compensatory mutations restore the replication defects caused by cytotoxic T lymphocyte escape mutations in hepatitis C virus polymeraseJ Virol 85:11883–90
- 27.Incident Hepatitis C Virus Genotype Distribution and Multiple Infection in Australian PrisonsJ Clin Microbiol 54:1855–61
- 28.Viral sequence evolution in acute hepatitis C virus infectionJ Virol 81:11658–68
- 29.Viral escape and T cell exhaustion in hepatitis C virus infection analysed using Class I peptide tetramersImmunol Lett 85:165–71
- 30.Next-generation sequencing reveals large connected networks of intra-host HCV variantsBMC Genomics 15
- 31.Neutralizing antibodies evolve to exploit vulnerable sites in the HCV envelope glycoprotein E2 and mediate spontaneous clearance of infectionImmunity 57:40–51
- 32.Genome-wide hepatitis C virus amino acid covariance networks can predict response to antiviral therapy in humansJ Clin Invest 119:225–36
- 33.Genotype 1 hepatitis C virus envelope features that determine antiviral response assessed through optimal covariance networksPLoS One 8
- 34.A bioinformatics pipeline for the analyses of viral escape dynamics and host immune responses during an infectionBiomed Res Int 2014
- 35.Randomized Trial of a Vaccine Regimen to Prevent Chronic HCV InfectionN Engl J Med 384:541–9
- 36.Ongoing incident hepatitis C virus infection among people with a history of injecting drug use in an Australian prison setting, 2005-2014: The HITS-p studyJ Viral Hepat. 24:733–41
- 37.Opioid substitution therapy protects against hepatitis C virus acquisition in people who inject drugs: the HITS-c studyMed J Aust 201:326–9
- 38.Testing strategy to identify cases of acute hepatitis C virus (HCV) infection and to project HCV incidence ratesJ Clin Microbiol 46:499–506
- 39.Dynamics of viremia in early hepatitis C virus infectionTransfusion 45:994–1002
- 40.Evidence of viral adaptation to HLA class I-restricted immune pressure in chronic hepatitis C virus infectionJ Virol 80:11094–104
- 41.A method for near full-length amplification and sequencing for six hepatitis C virus genotypesBMC Genomics 17
- 42.Limited naturally occurring escape in broadly neutralizing antibody epitopes in hepatitis C glycoprotein E2 and constrained sequence usage in acute infectionInfect Genet Evol 49:88–96
- 43.ShoRAH: estimating the genetic diversity of a mixed sample from next-generation sequencing dataBMC Bioinformatics 12
- 44.LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasetsNucleic Acids Res 40:11189–201
- 45.Probabilistic inference of viral quasispecies subject to recombinationJ Comput Biol 20:113–23
- 46.Genetic identity, biological phenotype, and evolutionary pathways of transmitted/founder viruses in acute and early HIV-1 infectionJ Exp Med 206:1273–89
- 47.Inefficient cytotoxic T lymphocyte-mediated killing of HIV-1-infected cells in vivoPLoS Biol 4
- 48.R: a language and environment for statistical computingVienna, Austria: R Foundation for Statistical Computing
- 49.Envelope-specific IgG3 and IgG1 responses are associated with clearance of acute hepatitis C virus infectionViruses 12
- 50.Clearance of hepatitis C virus is associated with early and potent but narrowly-directed, Envelope-specific antibodiesScientific reports 9
- 51.ggplot2: Elegant Graphics for Data AnalysisNew York: Springer-Verlag
Article and author information
Author information
Version history
- Sent for peer review:
- Preprint posted:
- Reviewed Preprint version 1:
Copyright
© 2024, Walker et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 85
- downloads
- 2
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.