Selection of HIV-1 for resistance to fifth-generation protease inhibitors reveals two independent pathways to high-level resistance

  1. Ean Spielvogel
  2. Sook-Kyung Lee
  3. Shuntai Zhou
  4. Gordon J Lockbaum
  5. Mina Henes
  6. Amy Sondgeroth
  7. Klajdi Kosovrasti
  8. Ellen A Nalivaika
  9. Akbar Ali
  10. Nese Kurt Yilmaz
  11. Celia A Schiffer  Is a corresponding author
  12. Ronald Swanstrom  Is a corresponding author
  1. Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, United States
  2. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, United States
  3. Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, United States
  4. Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, United States

Abstract

Darunavir (DRV) is exceptional among potent HIV-1 protease inhibitors (PIs) in high drug concentrations that are achieved in vivo. Little is known about the de novo resistance pathway for DRV. We selected for resistance to high drug concentrations against 10 PIs and their structural precursor DRV. Mutations accumulated through two pathways (anchored by protease mutations I50V or I84V). Small changes in the inhibitor P1'-equivalent position led to preferential use of one pathway over the other. Changes in the inhibitor P2'-equivalent position determined differences in potency that were retained in the resistant viruses and that impacted the selected mutations. Viral variants from the two pathways showed differential selection of compensatory mutations in Gag cleavage sites. These results reveal the high level of selective pressure that is attainable with fifth-generation PIs and how features of the inhibitor affect both the resistance pathway and the residual potency in the face of resistance.

Editor's evaluation

This work provides a fundamental understanding of the evolution of resistance to Darunavir, an exceptionally potent HIV-1 protease inhibitor. Conclusions are supported by compelling evidence that small changes in the inhibitor lead to two resistance pathways, each anchored by a specific mutation. These results provide the first evidence for de novo pathway selection and provide an atomic basis for designing the next generation of HIV-1 protease inhibitors.

https://doi.org/10.7554/eLife.80328.sa0

Introduction

Highly active antiretroviral therapy against HIV-1 with combinations of drugs effectively block viral replication and preclude the evolution of drug resistance. Three major factors interplay to define the emergence of resistance in vivo: (i) the active drug concentration relative to its inhibitory activity; (ii) the level of resistance conferred by one or more mutations; and (iii) the fitness cost of the resistance mutations. For protease inhibitors (PIs), the number of mutations selected increases with increasing drug concentration (Watkins et al., 2003). Thus, the concentration of this drug class in vivo is an important parameter in determining the genetic barrier to resistance.

The early identification of the retroviral protease as a member of the aspartyl proteinase family and the determination of a number of cleavage site sequences led to the development of first-generation PIs (PIs) that validated the HIV-1 protease as a drug target (Katoh et al., 1987; Richards et al., 1989; Seelmeier et al., 1988). A second generation of PIs was quickly developed for use in humans, becoming the third drug in a three-drug regimen that achieved sustained suppression of viral load with no evolution of resistance (Gulick et al., 1997). The third generation of PIs had improved properties with regard to side effects and efficacy. These properties have been further enhanced with a fourth-generation PI, darunavir (DRV), which achieves a drug level in plasma (>1 µM) that is 1000-fold greater than its inhibitory activity in cell culture (Ali et al., 2010; Kurt Yilmaz et al., 2009; Nalam et al., 2013). The potential efficacy of a fourth-generation PI such as DRV has led to attempts to use this drug in monotherapy (Antinori et al., 2015; Arribas et al., 2012; Paton et al., 2015; Valantin et al., 2012). Although the initial trials of monotherapy showed a modest increase in the loss of viral suppression relative to standard ART, in the cases of virological failure there was no significant resistance to DRV detected in the rebound virus (Antinori et al., 2015; Arribas et al., 2012; Paton et al., 2015; Valantin et al., 2012). Thus, the observed virological failure is most easily attributed to issues of low drug exposure in specific participants. While therapy has not yet moved to single agents, potent two drug therapies are being explored. In this regard, several studies have shown that a PI can be part of a successful two drug therapy regimen (Casado et al., 2020; Di Cristo et al., 2020; Hawkins et al., 2019; Huang et al., 2019; Vizcarra et al., 2020).

Drug resistance to DRV is incompletely understood. The appearance of DRV-resistant viruses during therapy failure has occurred in people who had previously experienced virological failure with other PIs (de Meyer et al., 2008; Descamps et al., 2009; Pellegrin et al., 2008; Tremblay, 2008). In these cases, the pathway to DRV resistance was likely influenced by the preexisting resistance to the previous PI. DRV resistance has been selected in cell culture (Aoki et al., 2018; de Meyer et al., 2008; Koh et al., 2010). High-level resistance was selected starting with either a mixture of highly resistant variants that were allowed to recombine or the selection was started with a single PI-resistant strain; in both cases, an I84V resistance pathway in the protease was observed. However, there are no reports of selection of high-level resistance starting with an unmutated ‘wildtype’ virus. Thus, there is little information on what a do novo pathway for resistance to DRV would be.

An important concept in HIV-1 PI design is to avoid chemical moieties that extend outside of the substrate envelope, the shared volume in the protease active site that is occupied by natural protease substrates (King et al., 2004; King et al., 2002). We have previously designed a series of highly potent PIs based on DRV, UMASS-1 through -10, that still fit within the substrate envelope (Nalam et al., 2013). The designed fifth-generation inhibitors have modified chemical moieties that relative to DRV further fill the substrate envelope at the equivalent of the P1' and P2' positions (Figure 1). All bind tightly to the wild-type HIV-1 protease with a Ki of less than 5 pM. These inhibitors retained robust binding to many multidrug-resistant protease variants and viral strains.

Darunavir (DRV) structure as a PR substrate analog.

The chemical structure of DRV is shown. Protease substrates are labeled following a convention where the scissile bond is flanked upstream by amino acids labeled P1, P2, etc., with the amino acids downstream labeled P1', P2', etc. The cleavage site can thus be written P1/P1'. The side chains of these amino acids extend into subsites in the protease with corresponding labels (i.e., the P1 side chain extends into the S1 subsite). In the UMASS series of analogs, the DRV structure at the equivalent of the P1' position (labeled R1) was modified as was the structure at the equivalent of the P2' position (labeled R2).

In an effort to define the de novo pathway to resistance for DRV and determine the effects of these chemical changes in the inhibitors, we examined the evolutionary path used to attain high-level resistance to DRV and the UMASS-1 through -10 panel of PIs. Under continuous and escalating selective pressure, representing between 50 and 95 passages, the virus evolved to accumulate multiple mutations. In most cases, selection was carried out until the final concentration approximated that achieved by DRV in vivo. Relatively minor modifications in inhibitor structure favored selection of one of two pathways to resistance, anchored by protease mutation I50V or I84V. In addition, the P2'-equivalent chemical structure of the inhibitor affected residual potency against highly resistant strains and in some cases changed the potency of the P1'-equivalent moiety. These results reveal the extremely high genetic barrier to resistance to the fourth-generation PI DRV at inhibitor concentrations that can be achieved in vivo, and the complex evolutionary pathways required to achieve resistance. In addition, these results reveal features of fifth-generation protease inhibitor design that both affect the selected pathway of resistance and determine residual potency against resistant variants.

Results

Panel of highly potent and analogous HIV-1 PIs

HIV-1 PIs were designed by modifications to DRV to increase favorable interactions with the protease within the substrate envelope, thereby increasing potency while minimizing evolution of resistance (Nalam et al., 2013). A panel of 10 DRV analogs were chosen with enzymatic inhibition constants (Ki) in the single or double-digit picomolar range to wild-type NL4-3 protease and the I84V and I50V/A71V drug-resistant variants, respectively (Table 1; Lockbaum et al., 2019; Mittal et al., 2013). These PIs contained modified P1’ positions with (S)-2-methylbutyl or 2-ethyl-n-butyl moieties (R1-1 and R1-2, respectively) in combination with five diverse P2′ phenyl-sulfonamides (R2-1 to R2-5), with the inhibitors named UMASS-1 through -10 (Table 1). These inhibitors and DRV were also tested in a cell culture-based viral inhibition assay. The EC50 values (the effective concentration needed at the time of virus production to reduce infectivity by 50%) for DRV and the UMASS analogues ranged from 2.4 to 9.1 nM, significantly more potent than the second- and third-generation PIs tested (Figure 2—figure supplement 1).

Table 1
Structures, Ki constants, and EC50 values of the UMASS series of HIV-1 protease inhibitors (PIs).
InhibitorStructureKi (pM)EC50 (nM)
WTI84VI50V/A71VWTViral culture (5000 nM)
DRV<5.0*25.6 ± 5.6*74.5 ± 5.6*7.7 ± 1.6>100,000
UMASS-1<5.0*26.1 ± 3.7*110.3 ± 8.8*5.9 ± 1.0
UMASS-2<5.0<5.015.0 ± 2.72.4 ± 0.3
UMASS-3<5.09.9 ± 2.779.9 ± 5.99.1 ± 1.014,800 ± 6800
UMASS-4<5.010.5 ± 1.832.9 ± 3.03.2 ± 0.413,700 ± 6600
UMASS-5<5.07.0 ± 1.77.8 ± 0.94.0 ± 0.5>100,000
UMASS-6<5.0*12.8 ± 3.1*100.0 ± 9.9*5.2 ± 0.8
UMASS-7<5.012.1 ± 4.518.2 ± 3.03.1 ± 0.5
UMASS-8<5.0<5.055.4 ± 4.04.2 ± 0.9
UMASS-9<5.07.6 ± 1.642.3 ± 2.66.4 ± 1.2>100,000
UMASS-10<5.014.3 ± 9.35.8 ± 1.14.1 ± 0.9
  1. *

    Previously reported in Lockbaum et al. (ACS Infect Dis. 2019 Feb 8; 5 (2): 316–325).

There are several patterns worth noting in the Ki and EC50 values. First, all inhibitors had a Ki of less than 5 pM (limit of detection) against the wild-type enzyme, precluding a comparison of potency among them without using partially resistant mutant proteases (I84V and I50V/A71V). Second, the P1'/R1 position change to the larger 2-ethyl-n-butyl moiety (R1-2, UMASS-6) increased potency against the I84V mutant protease relative to DRV and the intermediate sized R1-1 moiety UMASS-1 (all three share the same R2-1 moiety); this enhanced potency of the R1-2 inhibitors over the R1-1 inhibitors was also apparent for UMASS-8 over UMASS-3, which share the same R2-3 structure (Table 1, Figure 2—figure supplement 2), while this pattern of potency was reversed (i.e., R1-1 was more potent than R1-2) for UMASS pairs -2 and -7, and -5 and -10 (with R2-2 and R2-5, respectively; Figure 2—figure supplement 2). Third, four of the inhibitors with the R2 moieties R2-2 or R2-5 were significantly more potent (UMASS-2, -5, -7, -10) than with the other R2 moieties. This trend was most pronounced for activity against the I50V/A71V mutant enzyme in both the R1-1 and R1-2 backgrounds (Figure 2—figure supplement 2). The significant potency of the R2-2 and R2-5 containing inhibitors against the I50V/A71V mutant enzyme while maintaining good potency against the I84V enzyme indicates that both the R1 and R2 constituents contribute to potency, and to enhancing potency over DRV. This potency was also seen in inhibition of viral infectivity as the EC50 values of the R2-2 and R2-5 containing inhibitors were among the lowest measured (Table 1).

Selection for high-level resistance during passage in cell culture follows two pathways

To evaluate the potential of each inhibitor to select for mutations that would confer high-level resistance and to compare these mutations across the inhibitor series, we grew HIV-1 under conditions of escalating inhibitor concentration in cell culture through a lengthy period of selection (50–95 passages). The selection experiments were performed under two separate starting conditions: first when the starting virus was generated from the NL4-3 clone (this clone closely approximates the clade B consensus for the protease amino acid sequence and will be referred to as ‘wild type’), then again when the starting virus was a mixture of 26 isogenic viruses each with a single mutation associated with drug resistance in the NL4-3 background. Notably, in the latter case only about one-half of the mutations that appeared in the culture during selections were in the mixture of starting mutations, indicating that even in the selections that were seeded with the pool of single resistance mutations there was sufficient evolutionary capacity to explore mutations at positions that started as the wild-type sequence. The initial inhibitor concentration started at low nanomolar concentrations and increased by a factor of 1.5 with each subsequent viral passage (the inhibitor concentration was increased only after the virus spread efficiently through the culture). All of the selections starting with wild-type virus reached at least 5 µM of inhibitor concentration (approximating the therapeutic concentration reached by DRV in the blood and in the range of a 1000-fold increase in the starting EC50). For technical reasons, only five of the selections starting with the mixture of mutants reached an inhibitor concentration as high as 400 nM and are reported here (Figure 2—figure supplement 3).

Resistance mutations selected in the protease coding domain during the escalating selective pressure of increasing PI concentration were examined at various timepoints using a next-generation sequencing (NGS) protocol that included Primer ID with the MiSeq platform (Zhou et al., 2015). We first examined the sequence of the most abundant variant in each of the two selection schemes present at the highest inhibitor concentration reached (Figure 2A). For those selections that reached 5 μM, 8–14 mutations were present in the most abundant variant. Typically the next two most abundant variants in the population differed by a single amino acid, and on average the three most abundant variants in the 5 μM cultures accounted for 88% of the total viral population (Figure 2—source data 1). Evolution of resistance followed two pathways, one based on I50V and one based on I84V. In three cultures, V2wt, V5wt, and V2mut (which reached either 4 or 5 μM final drug concentration; V2wt indicates the virus pool selected with UMASS-2 starting with wild-type virus), these two mutations, I84V and I50V, became linked on the same genome (Figure 2A) showing they are not mutually exclusive; it is worth noting that the appearance of the linked I84V/I50V mutations occurred in the selections of the UMASS inhibitors (-2 and -5) that showed the greatest potency against the two mutant enzymes I84V and I50V/A71V (Table 1, Figure 2—figure supplement 2), consistent with inhibitor potency driving the co-selection of these two primary resistance mutations together.

Figure 2 with 4 supplements see all
Features of viruses selected to high-level resistance against darunavir (DRV) and the UMASS-1 through -10 inhibitor series.

Infected cultures were put under drug selection starting with either wild-type virus or a pool of mutant viruses. (A) Most abundant variants present at final timepoint. Resistance mutations in the viruses selected starting with the wild-type virus are in blue, and the selection starting with the pool of mutant viruses in in red. The culture name indicates a virus pool (V), the UMASS PI number (V1), and whether the selection started with wild-type or mutants (V1wt). PR sequences that were used to generate purified recombinant protease are indicated (e.g., PR-3wt). The final protease inhibitor (PI) concentration reached and the R1 and R2 moieties are indicated along with the PI. (B) Number of mutations present in the three most abundant variants of each selection at two different PI concentrations. Cultures containing I84V resistance mutation are in black (n = 4), I50V-containing cultures are in white (n = 7), and cultures with I50V+I84V linked are shown with hatched box (n = 3). Unpaired t-test was used to compare the number of mutations in the 84V vs. the 50V cultures. (C) EC50 values for highlighted selections in panel (A) reaching 100 nM (square) and 5000 nM (triangle). (D) Relative infectivity values for the same selections shown relative to the wild-type virus (circle). Relative infectivity was measured with normalized input amounts of HIV-1 p24 CA protein in replicates (n=3). Error bars show range of values. (E) Enzyme inhibition constants (Ki) of end point PR variants versus catalytic efficiency (kcat/Km). Open circle represents wild-type NL4-3. Trendline is for visualization purposes only. * p<0.05, **p<0.01, ***p<0.001.

Figure 2—source data 1

Source sequence data for top abundant variant in panel A.

https://cdn.elifesciences.org/articles/80328/elife-80328-fig2-data1-v3.xlsx
Figure 2—source data 2

Source sequence data for top 3 most abundant variants in panel A.

https://cdn.elifesciences.org/articles/80328/elife-80328-fig2-data2-v3.xlsx

In addition to the sequence of the terminally selected virus, we validated that selection occurred over the entire course of the selection protocol. First, as can be seen in Figure 2B, the number of mutations present at 100 nM drug concentration was approximately half the number present after selection to greater than 1 μM. In comparing the number of mutations present in cultures using the I84V versus the I50V pathway, we noted that on average there were more mutations in the I50V viruses at the intermediate drug concentration compared to the I84V viruses (100 nM, p=0.03), and this trend continued at the higher drug concentration. Next we compared the EC50 of five virus cultures representing the I84V or the I50V pathways (highlighted in yellow in Figure 2), and including one that had both mutations linked, testing the level of resistance to all of the UMASS-1 through -10 inhibitors after selection to 100 nM drug concentration or to 5 μM. As can be seen in Figure 2C, resistance (measured as an increase in the EC50) was apparent after selection to the 100 nM inhibitor concentration, and in each case the EC50 increased an average of 100-fold after selection to the final inhibitor concentration of 5 μM (EC50 values of the final virus pool are shown in Table 1).

We were also interested in the fitness cost of these mutations. To measure this effect, we compared the infectivity of the virus in 10 culture supernatants (five virus pools at each of two levels of selection), normalizing infectivity on a reporter cell line for a given amount of the virion p24 CA protein in the culture supernatant. This was compared to the unmutated NL4-3 parent (given a value of 100%). As can be seen in Figure 2D, selection to 100 nM inhibitor concentration resulted in a reduction in fitness/relative infectivity by between 100- and 1000-fold. Those cultures with the biggest decrease in fitness rebounded to around 1% relative infectivity after selection to an inhibitor level of 5 μM. This suggests that these high levels of resistance are linked to maintaining a residual level of relative infectivity around 1% (as defined under these culture conditions).

We used another approach to explore the interplay between resistance and fitness; we examined the effect of resistance on catalytic efficiency of the protease by measuring Ki and the catalytic efficiency (kcat/Km) for a set of mutant proteases against the entire panel of UMASS inhibitors and DRV. The average Ki value of each mutant protease (sequences shown in Figure 2—source data 1) is plotted against the catalytic efficiency (kcat/Km) for that enzyme in Figure 2E. As can be seen, there is a strong relationship between higher Ki values (i.e., resistance) and dramatic reductions in catalytic efficiency, consistent with the loss of fitness associated with these viral populations.

Features of the selection process assessed by NGS of longitudinal samples

Each of the viral cultures started with wild-type virus showed an accumulation of protease mutations with increasing selective pressure. NGS analysis revealed very few fixed variants in the cultures started with wild-type virus until the inhibitor concentration reached approximately 3–5 nM (approaching the EC50 value for the wild-type virus, Table 1); in contrast, the cultures that were started with the mutant library selected for the outgrowth of a subset of those mutants by 1 nM inhibitor concentration (Figure 3A). In all cultures, the transition through the EC50 concentration of the wild-type virus provided significant selective pressure for fixing mutations. Multiple unfixed mutations were observed in each culture after the drug concentration exceeded the EC50 values above 3 nM, highlighting the high genetic diversity in the culture. Additional mutations became linked on each viral genome at higher drug concentrations.

Figure 3 with 2 supplements see all
Evaluation of sequence diversity in the viral population during increasing selective pressure with protease inhibitor (PI) inhibitors.

(A) Shannon entropy was calculated using sequence diversity in the protease region. Abundant mutations are shown and fixed mutations are underlined. Selections from wild type are in orange (n = 6) and from the mutant pool (n = 11) in blue. Two selections from wild-type virus are shown in orange (UMASS-9) and red (UMASS-10). (B) Compiled Shannon entropy values from selections starting with wild-type virus that reached concentrations of >1000 nM. (C) Compiled Shannon entropy values from selections starting with the pool of mutant viruses that reached concentrations of at least 400 nM. (D) Selections starting with the pool of 26 mutants showed higher entropy than the pool starting with WT (n=10, n=10) when averaged over all analyzed drug concentrations (p<0.001 using the unpaired t-test). Error bars show standard deviation.

Deep sequencing revealed that mutations accumulated in complex patterns. We assessed the sequence complexity of each culture by calculating the Shannon entropy to allow comparison of changes in the diversity in each culture over time and as a function of increasing selective pressure (Figure 3A). In the cultures that showed the early appearance of the I84V mutation, this was associated with a peak in entropy, reflecting high genetic diversity, followed by a decrease in entropy when the I84V mutation became fixed (Figure 3—figure supplement 1A). The introduction of the I50V mutation was generally not associated with a drop in entropy, rather these populations maintained high genetic diversity even at higher drug concentrations (Figure 3—figure supplement 1B). We interpret these patterns as indicative of I84V conferring some level of resistance without a dramatic loss in fitness, allowing a more homogeneous culture (i.e., less entropy). In contrast, I50V may confer a higher level of resistance but at a greater fitness cost, thus supporting greater diversity in the culture either as compensatory mutations or as other combinations of mutations with lesser resistance but higher fitness. In this regard, we previously showed I50V significantly reduces the fitness of the virus relative to the fitness loss of a virus with I84V as single mutations (Henderson et al., 2012).

When we examined the sequence diversity as assessed by Shannon entropy for all selections that reached at least 1 µM in inhibitor concentration, we found that cultures starting with the mixture of resistant viruses averaged a nearly twofold higher entropy value compared to the cultures where the selection started with just the virus generated from the NL4-3/wild-type clone (3.0 vs. 1.6, p<0.0001 Mann–Whitney test; Figure 3B–D). This was unexpected as both sets of selections passed through many genetic bottlenecks. This result is most easily explained if the rates of recombination were fairly high throughout the culture period so that most sequence variants were maintained at least at a low level. However, at the end of the selection period no wild-type variants could be detected in the cultures by deep sequencing.

The chemical nature of R1 and R2 determines the resistance pathway

We next examined whether the inhibitor structure influenced the resistance pathway chosen. We found that the R1 group, that is, the (S)-2-methylbutyl (R1-1) or the 2-ethyl-n-butyl (R1-2), largely defined the resistance pathway observed; for these inhibitors, the R1 group takes the position of P1' in the protease substrate analog, occupying the S1' subsite. With the UMASS-1 through -5 series (the smaller R1-1 group), the I84V mutation appeared first in six of seven cultures. In contrast, the UMASS-6 through -10 series with the larger R1-2 group, the I50V mutation appeared first in eight of nine cultures (p=0.009, Fisher’s exact test; in this analysis, we included two cultures that did not reach at least 400 nM inhibitor concentration but did fix an initial set of mutations to increase our sample size [V3mut, V5mut], and we did not include two cultures where both I84V and I50V were initially fixed together [V2wt, V5wt]). We considered the possibility that the starting mixture of viruses in the mutant selection might skew the pathway selected, especially since the mutant pool included I84V but not I50V. However, in only one of the eight cultures with sufficient data from both selections was there a switch from the I84V pathway to the I50V pathway between the first and second selections (cultures of UMASS-6 with an R1-2 group). Thus, we conclude that the P1’-equivalent chemical structure of the inhibitor is a strong determinant of the resistance pathway selected. It is notable that one inhibitor could select for different pathways in two separate selections (also seen with DRV, see Figure 2—figure supplement 4) even when both major mutations are maintained in the viral population (Figure 3—figure supplement 2). This suggests that either pathway can provide some level of resistance to most if not all of these inhibitors, and that the chance addition of the initial compensatory mutations may determine which pathway becomes the major resistant population.

To examine the potential for linked mutations and infer the order in which mutations accumulated in the protease gene to confer high-level resistance, the abundance data from multiple selections that ended in one or the other pathway were pooled and compared. In this analysis, shown in Figure 4A, summary data for the selections resulting in the I84V pathway point up, with I84V reaching 100% penetrance by definition. Similarly, the summary data for those selections that fixed I50V are shown pointing down, with I50V reaching 100% penetrance. Several mutations are uniquely linked or at least strongly favored in each pathway, with I84V being linked to V32I, and I50V being linked to I47V, F53L, and I13V. A number of mutations appear in both pathways, although not with equal frequency: L10F, L33F, M46I, I54L, A71V, and V82I. Finally, other mutations appear less frequently, making it difficult to assign them to one of these categories. These results show that while some mutations are largely linked to one pathway, other mutations are often shared between the two pathways. Also, the variation in frequency of appearance of shared mutations in the two pathways suggests different levels of impact on resistance and/or fitness in the I84V vs. the I50V background for these mutations (e.g., L10F, M46I and A71V). In particular, the larger impact of I50V on fitness compared to I84V mutations (Henderson et al., 2012) is consistent with the earlier appearance and greater levels of fixation of the shared compensatory mutations L10F, L33F, M46I, and A71V in the cultures that followed the I50V pathway. In Figure 4B, we show a summary timeline of the ordered addition of mutations for each pathway and whether they are shared or unique to the pathway (in addition to their relative penetrance/final prevalence among the different cultures). In Figure 4—figure supplement 1, we show phylogenetic trees of the viral protease sequence from these longitudinal selections annotated to show where in the tree each of the fixed final mutations entered the viral population. These results emphasize that there are stochastic elements in the selection process that make each culture different in detail but that by pooling the data across the cultures underlying patterns are apparent.

Figure 4 with 1 supplement see all
Patterns of accumulation of amino acid substitutions associated with resistance during selection.

(A) Abundance data from selections for each indicated amino acid substitution were pooled and examined sequentially at different levels of drug concentration. Selections that reached the maximum concentration of at least 3000 nM were assessed longitudinally to allow for all timepoints to be averaged equally. Mutations from selections resulting in the I84V pathway (V1wt, V4wt, and V6mut) point up, with I84V reaching 100% penetrance by definition. Similarly, those selections that fixed I50V only (V3wt, V6-10wt) are shown pointing downward, with I50V reaching 100% penetrance. (B) Schematic representations of the addition of mutations found in I84V and I50V pathways. Each mutation’s position on the timeline of drug concentrations was estimated using the pools of selections from each I84V and I50V pathways. This summary does not reflect the exact order and linkage of mutations for each individual selections in that pathway but rather the generalizable pattern inferred by comparing across the multiple cultures. The size of the arrows represents how prevalent that mutation was among the cultures at the end of the selection process.

We next considered the possibility that the R2 constituent would provide additional selective pressure in the form of additional resistance mutations. We did not detect any novel mutations associated with any of the R2-1 through R2-5 inhibitors (Figure 2). However, as noted earlier, the inhibitors with R2-2 and R2-5 in the smaller R1-1 background (UMASS-2 and UMASS-5) resulted in the selection of resistant virus where both of the primary resistance mutations (I84V and I50V) were present and linked. The comparable R2 inhibitors in the R1-2 background (UMASS-7 and UMASS-10) selected for just the I50V primary mutation consistent with the larger R1-2 structure. Thus, for inhibitors with three of the R2 moieties (R2-1, R2-3, R2-4) the selection of the primary resistance pathway was determined by the size of the R1 moiety. This was also true for the R2 moieties R2-2 and R2-5 with the larger R1-2 moiety, selecting for I50V; however, these same two R2 moieties are able to change the interaction of the smaller R1-1 moiety to drive selection of both primary resistance mutations demonstrating a role for R2 in influencing the resistance pathway.

The chemical nature of R1 and R2 determines residual potency among the resistant variants

We considered several variables in examining the nature of the interaction between inhibitors and the resistant proteases: the extent to which the inhibitor structure (either R1 or R2) affected residual potency; and the extent to which the pathway (I84V or I50V) conferred the greatest resistance. As shown in Figure 5A, the inhibitors with the smaller R1-1 group showed a trend to be less potent than the inhibitors with the larger R1-2 group when tested against all of the resistant proteases. Inhibitors with the R2-2 group showed a trend toward greater residual potency against the mutant enzymes compared to the other inhibitors (Figure 5B). In looking at the individual pathways, the highly resistant enzymes with the I84V mutation remained more sensitive to the entire group of inhibitors than the enzymes with the I50V mutation (Figure 5C).

Analysis of Ki values for mutant enzymes.

Ki values were determined against all UMASS inhibitors (Table 1). Brackets above the bars represent significant p-values between the two groups using the unpaired t-test. Data were pooled in different ways for the following analyses: (A) the Ki values against the inhibitors with the larger R1-2 moiety (n = 34) were more potent against resistant proteases compared to the inhibitors with the smaller R1-1 moiety (n = 35) (B) Ki values for the inhibitors with the R2-2 group (n = 10) showed a trend toward being more potent against the highly mutated proteases compared to the other R2 groups (n = 40). (C) Ki values for enzymes with the I50V mutation (n = 33) showed greater resistance to the inhibitors compared to enzymes with the I84V mutation (n = 22). (D) The R1-2 moiety provided increased potency to enzymes with the I84V mutation (n = 20) but not the I50V mutation (n = 20). (E) The R2-2 moiety was more potent against the enzymes with I84V mutation (n = 10) compared to the enzymes with the I50V mutation (n = 10). The unpaired t-test was used to assess differences in Ki values. Error bars show standard deviation.

We next linked the two pathways to the specific structural features of the inhibitors. The resistant proteases with I50V had similar Ki values to both the R1-1 and R1-2 inhibitors, while the proteases with I84V were more sensitive (lower Ki) to the larger R1-2 inhibitors (Figure 5D). Similarly, the increased potency of the R2-2 inhibitors over the rest of the inhibitors was seen against the enzymes carrying I84V but not those with I50V (Figure 5E). These results are consistent with the smaller R1-1 inhibitors selecting for the I84V pathway and the larger R1-2 inhibitors selecting for the I50V pathway, and with the R2-2 inhibitors adding additional potency that is retained even after selection for resistance, most apparent with the I84V pathway.

Residual potency dependence of inhibitor structure could also be seen in the EC50 values of the resistant virus pools. When the EC50 values against the virus pools using the I84V or the I50V pathway were compared for all of the inhibitors, the inhibitors with the larger R1-2 group were more potent, retaining on average lower EC50 values compared to the EC50 values for the inhibitors with the smaller R1-1 group (Figure 6A). Inhibitors with the R2-2 group showed a trend toward greater potency compared to the other R2 groups (Figure 6B), consistent with what was observed in enzymatic assays. When we considered the effects based on the selection pathway, we observed that viruses in the I50V pathway had a higher level of resistance to these inhibitors than the viruses in the I84V pathway (Figure 6C). The higher level of resistance for the viruses using the I50V pathway was due to the fact that these viruses were similarly resistant to the inhibitors with either the R1-1 or the R1-2 groups; in contrast, the viruses using the I84V pathway conferred a greater level of resistance to the inhibitors with the smaller R1-1 group while the inhibitors with the larger R1-2 group retained a higher level of potency (Figure 6D). Finally, there was a trend for viruses in either pathway to be less resistant to the inhibitors with the R2-2 group (Figure 6E).

Analysis of EC50 values for mutant virus cultures.

EC50 values were determined for a subset of the selected virus cultures against a panel of inhibitors (n = 50). For (A–E), the EC50 data were pooled using the same methods and sample numbers as in Figure 5. The highest level of resistance recorded was 100 µM. The Mann–Whitney rank-sum test was used to assess differences in EC50 values. Error bars show standard deviation.

DRV favors the I84V pathway

DRV has a butyl group at the R1 position, smaller than the R1-1 methylbutyl in the UMASS-1 through -5 series. We carried out five selections with DRV where the final inhibitor concentration reached greater than 1 μM. One DRV selection was carried out in parallel with each of the two different selections with the UMASS inhibitors (Figure 2A); both of these selections resulted in the appearance of the I84V mutation as defining the resistance pathway. Three additional selections were done in parallel with DRV and starting with the mixture of the 26 isogenic mutants; all three selections reached the level of 5 μM as the highest drug concentration. Two of these selections used the I84V pathway to resistance, while one of the selections used the I50V pathway to resistance (Figure 2—figure supplement 4). Thus, four of five independent selection with DRV favored the resistance pathway I84V, associated with a smaller R1 constituent. These results add further evidence for the size of the R1 moiety strongly influencing the resistance pathway.

The I84V pathway and the I50V pathway differentially select for Gag cleavage site mutations

Cleavage site mutations are seen during selection for resistance to protease inhibitors (Su et al., 2019). This can be viewed as protease-substrate coevolution, and the effect is most apparent at the cleavage sites flanking the spacer peptide SP2 in Gag (NC/SP2 and SP2/p6) (Kolli et al., 2006; Kolli et al., 2014; Kolli et al., 2009; Lee et al., 2012; Özen et al., 2011; Özen et al., 2012; Özen et al., 2014; Prabu-Jeyabalan et al., 2004). We sequenced the protease cleavage sites encoded in the viral gag gene in the pools of selected viruses where the inhibitor concentration had reached a level of greater than 1 µM (Figure 7A). An analysis of four cultures that had I84V as the major resistance mutation showed they all had a mutation at the NC/SP2 cleavage site at position P2, with a change from the wild-type alanine amino acid to either of the larger aliphatic amino acids valine or isoleucine. In addition, three of the four I84V cultures had a mutation at the adjacent SP2/p6 cleavage site, either at P1' (leucine to phenylalanine) or P5' (proline to leucine). Conversely, all seven cultures where the I50V mutation was the major resistance mutation there was a mutation in the SP2/p6 cleavage site, but not in the NC/SP2 site. Finally, in the three cultures where the protease evolved both the I50V and I84V mutations, Gag mutations were observed only at the SP2/p6 cleavage site (the presence or absence of the NC/SP2 mutation in the I84V cultures but not in the I84V/I50V cultures has a p value of 0.03 in a Fisher’s exact test). Two of three cultures with both I84V and I50V had both the P1' (leucine to phenylalanine) and P5' (proline to leucine) mutations. In contrast, mutations at both P1' and P5' together were underrepresented in the cultures with only I50V. An examination of the modeling (Figure 7B) suggests that the NC/SP2 mutation at P2 in the presence of the I84V mutation may engage I50 to replace the lost interaction with I84V as it moves away from the P2 sidechain by shortening (I to V); this would explain the absence of this cleavage site mutation in the double protease mutant (I50V/I84V) since the shortening of both protease sidechains provides too little interaction with the longer P2 sidechain. The P1' and P5' mutations in the SP2/P6 site appear to act in a complementary way as typically only one is seen with either the I50V or the I84V mutant proteases (Figure 7C). The SP2/p6 P1' and P5' mutations engage the protease by different mechanisms (Özen et al., 2014) suggesting their effects are additive; thus in these selections the single mutation at P1' or P5' may be sufficient to recover an appropriate rate of cleavage with either the I84V or the I50V mutant protease, but the additive effects of the P1' and P5' mutations may be needed to rescue cleavage by the I50V/I84V mutant protease.

Protease cleavage site mutations observed after selection for high-level resistance.

(A) Culture names, final inhibitor concentration, and resistance pathway are shown. Changes in amino acid sequence are shown for the NC/SP2 and SP2/p6 cleavage sites. (B) P2 substitution in the NC/SP2 cleavage site with the I84V mutation modeled. Subsite structure was modeled using the original structure (PDB: 1KJH) and mutating the P2 position in the subsite pocket from alanine to valine. (C) P1' change in the SP2/p6 cleavage site with either the I84V mutation or the I50V mutations modeled. Subsite structure was modeled using the original structure (PDB: 1KJF) and mutating the P1’ position from leucine to phenylalanine.

Discussion

DRV is a notable PI both for its tight binding to the HIV-1 protease and for its ability, with boosting doses of ritonavir, to reach micromolar levels of drug concentration in the blood. These features are important as resistance to PIs typically requires multiple mutations affecting both resistance and the rescue of enzyme activity/fitness lost with primary resistance mutations (Arribas et al., 2012; Clemente et al., 2004; Henes et al., 2019; Mahalingam et al., 2002; Muzammil et al., 2003; Ragland et al., 2014; Wensing et al., 2010). While we were successful in selecting for resistance to DRV in cell culture, the spectrum of mutations observed largely overlap those seen in previous studies based on testing individual mutations for their effect on DRV sensitivity (de Meyer et al., 2008; King et al., 2004; Rhee et al., 2003). In this work, we have explored chemical modifications to the DRV scaffold. We found that modification of the P1'/R1 chemical structure to be a larger aliphatic group favored the use of the I50V resistance pathway compared to the smaller DRV structure or an intermediate-sized structure, forcing the virus to use the more deleterious I50V mutation compared to the I84V mutation used with the smaller P1'/R1 structures. Conversely, we identified a subset of the P2'/R2 groups that contributed to residual potency even in the resistant proteases but did not select for novel resistance mutations; however, several of the R2 groups that conferred higher potency selected both major primary mutations but only with the smaller R1-1 methylbutyl group. Thus, chemical modifications at both of these inhibitor positions yielded improvements over DRV and an unexpected linkage between the R1 and R2 groups in that anchoring the R2 group to the protease backbone enhanced the potency of the intermediate-sized R1 group, at least as assessed by its selective pressure.

The potency of these inhibitors can be inferred by the high genetic barrier to high-level resistance. The maximum drug concentrations of DRV achieved in the blood are nearly 1000-fold above the EC50 in cell culture (Kurt Yilmaz et al., 2009), and therapy failure with resistance mutations for therapies involving DRV are rare and most often occur in people who had previously failed therapy with other protease inhibitors with drug resistance. An examination of the Stanford University HIV Drug Resistance Database (https://hivdb.stanford.edu/) shows a catalog of approximately 25,000 reported HIV-1 sequences with at least one protease resistance-associated mutation, with the majority of these linked to some level of phenotypic resistance to DRV. For sequences associated with a greater than 200-fold change in resistance to DRV, there were an average of 5.4 resistance-associated mutations (with a maximum of seven mutations). Thus, these levels approach the levels of resistance we have obtained and highlight the potential for creating an even higher genetic barrier to resistance in vivo.

P1'/R1 modification

DRV has a relatively small butyl group at this position that most often selected for an I84V mutation. When we extended this moiety by a single methyl group to (S)-2-methylbutyl, we obtained a similar resistance selection profile. However, by extending it with an additional methyl group to (S)-2-methylbutyl or 2-ethyl-n-butyl create an inhibitor that switched the preferred selection to the I50V pathway. In Figure 8A–C are shown representations of structures of a subset of the inhibitors with the different P1'/R1 groups with the wild-type protease, the I84V mutant protease, and the I50V mutant protease. As we have previously discussed (Lockbaum et al., 2019; Mittal et al., 2013), in these structures it is apparent that the P1'/R1-1 group is directed at the protease I84 sidechain, consistent with the shortening of this sidechain as a major resistance mechanism. Conversely, the longer P1'/R1-2 group is oriented into the space between I84 and I50, thus drawing in I50V as the pathway to resistance. As noted in the longitudinal analysis of the selection pathway, the I50V resistance pathway presents additional challenges to the virus in the more rapid accumulation of compensatory mutations (Figure 4). Under the circumstances of rapidly declining viral load during the initiation of therapy, this need for additional mutations would represent an enhanced genetic barrier.

Structural interpretation of protease inhibitor resistance and of residual inhibitor potency.

(A–C) Hydrophobic packing in the S1’ subsite in complex with R1 structural groups; darunavir is shown in cyan, UMASS-1 in magenta, and UMASS-6 in salmon. (A) The two forms of R1 and darunavir (DRV) are shown packing against wild-type protease at I84 and I50 DRV, UMASS-1, UMASS-6 in wild-type protease (PDB: 6DGX [cyan], 6DGY [magenta], 6DGZ [salmon]). (B) Those same inhibitors packing against the I84V mutant (PDB: 6DH0 [cyan], 6DH1 [magenta], 6DH2 [salmon]). (C) Those same inhibitors packing against the I50V protease variant (PDB:, 6DH6 [salmon], 6DH7 [magenta], 6DH8 [salmon]). (D–F) Binding interactions with Asp29 and Asp30 of the protease S2' subsite in complex with R2 structural groups. Inhibitor/enzyme interactions are shown with black dashed lines representing hydrogen bonds (<3.0 Å). (D) UMASS-1, UMASS-3, UMASS-4, and UMASS-5 in WT protease (PDB: 3O99 [salmon], 3O9B [purple], 309C [orange], 309D [slate blue]). Water molecules are shown as red dots. (E) UMASS-2 in WT protease (PDB: 3O9A). (F) SP1/NC S2’ subsite representation of peptide substrate complexed with an inactive form of wild-type protease (PDB: 1KJ7) showing an interaction between the P2' glutamine and the protein backbone at Asp29 and Asp30.

PR2'/R2 modification

The inhibitors with two of the R2 modifications (R2-2 and R2-5) showed greater potency against the mildly resistant proteases (Figure 2—figure supplement 2) and against the wild-type virus (Table 1). This effect was also evident as a trend for inhibitors with the R2-2 group against the highly resistant proteases (Figure 5E) and the highly resistant viruses (Figure 6E). When we examined the interactions between protease and inhibitor, there is a clear structural explanation (Figure 8D–F). The less potent inhibitors interact with the protease side chains and backbone at D29 and D30 through a network of water molecules. In contrast, with the more potent R2-2 group the water molecules are replaced with direct interactions between inhibitor and protease in the S2' subsite (Figure 8E). This direct interaction more closely mimics the interaction made by the glutamine P2' side chain in the optimized protease cleavage site at the SP1/NC boundary (Figure 8F). Thus, the ability to replace the water-mediated interactions with a more rigid framework that is interacting with the protease backbone deep in the S2' pocket appears to be a unique feature of the inhibitors with R2-2. Improved potency in another series of PIs was also reported for this R2-2 P2' moiety (Zhu et al., 2020).

While the different R2 structures did not select for any novel resistance mutations, we did note that the two inhibitors with the greatest potency as enzymes (UMASS-2 and UMASS-5) were the only inhibitors that selected for I84V and I50V linked on the same genome (Figure 2A). This represents a third distinct outcome since in these cases the smaller methylbutyl R1-1 is also able to engage I50. As noted above, the R2-2 group is directly anchored to the protease backbone. We examined the effect this has on the R1-1 orientation. However, we did not see any difference in the placement of the R1-1 group when comparing UMASS-1 (which preferentially selects for I84V) and UMASS-2 (which selects for linked I84V/I50V). It is possible that anchoring to the backbone with R2-2 for UMASS-2 makes the R1-1 group less mobile, reducing the magnitude of the effect of the I84V mutation allowing the inhibitor to maintain some level of interaction with I50, although this inference was not tested directly.

Protease mutation networks and compensatory mutations in the resistant pathways

In most of the selections, either the I84V or the I50V mutation was largely fixed by the time the inhibitor concentration reached 10 nM in the culture (Figure 3, Figure 3—figure supplement 2). The viral cultures went through cyclical changes in population diversity as subsequent mutations were added (Figure 3A). Because of the large number of selections, it was possible to pool data and see trends in the way mutations were added in these two pathways. Resistance mutations that were detected in multiple selections summed to at least 16 positions (Figure 2A). While these mutations are well known, their relationships to each other, and as a function of selective pressure, are less well understood. The data in Figure 4 show strong linkage between I84V and V32I, while I50V shows linkage with I47V and F53L. These linked compensatory mutations are each close to the primary resistance mutation, suggesting they interact directly with the primary resistance mutation to adjust its position within the subsite to limit interaction with the inhibitor and/or improve interaction with the substrate.

Other mutations appeared often in both pathways but could be found earlier and/or more frequently in one pathway: in the I84V pathway G16E and I54L were favored; in the I50V pathway L10F, I13V, L33F, M46I, L63P, and A71V were favored. In contrast to the linked compensatory mutations, the shared compensatory mutations are often more distant in the structure from the primary resistance mutation. Mutations distal from the active site modulate the enzymatic activity and fitness by altering the dynamic ensemble (Foulkes-Murzycki et al., 2007; Henes et al., 2019; Leidner et al., 2021; Ragland et al., 2014; Ragland et al., 2017; Whitfield et al., 2020) of the enzyme (i.e., conformations sampled). While these changes can sometimes enhance enzymatic activity, they also often significantly contribute to high levels of resistance (Henes et al., 2019; Matthew et al., 2021) as is observed in these selections. As can be seen in Figure 2E, there is a strong correlation between increasing resistance (lower Ki) and lost catalytic efficiency (larger kcat/Km). This suggests that compensatory mutations may attenuate fitness loss but do not restore it, or they impact resistance, or both. The role of mutations outside of the active site has been of interest for a long time (Foulkes-Murzycki et al., 2007; Kovalevsky et al., 2006; Louis et al., 2011; Ode et al., 2006; Shen et al., 2010). In Figure 4B, it can be seen that the same compensatory mutation is part of two different resistance pathways but with differing selective pressure to add the mutation, hinting at the complexity of the role of such mutations and their pathway-specific contributions.

In summary, using a number of lengthy selections for resistance to a series of structurally related PIs we have been able to identify the size of the hydrophobic chain at the P1'-equivalent position of the inhibitor as the major determinant for selection of the I84V pathway with a smaller P1'-equivalent chain, including for DRV, versus a larger P1'-eqivalent hydrophobic chain driving selection of the I50V pathway. Also, one of the P2' groups tested gave a higher level of potency even in the face of high-level resistance by creating a direct interaction with the protease backbone. This also led to coselection of I84V and I50V as linked mutations. These chemical changes in the inhibitor increase the genetic barrier for the evolution of resistance and emphasize the potential utility of what a fifth-generation HIV-1 PI could add to regimens with reduced drug complexity.

DRV has been used as the starting point for other approaches in enhancing potency (Matthew et al., 2021). These other studies have examined similar, but not identical, structural changes to DRV as described here. There is a cautionary note in comparing between different inhibitors that can be seen in Figure 2—figure supplement 3, where the nature of the chemical structure at one site (R2) reverses the order of potency of structures at another site (R1) in the inhibitor. Thus, comparisons between DRV derivatives where there are two or more differences may be problematic in terms of trying to infer parallels or differences. The original description of DRV included a compound with the same R2-2 that was also highly potent (Surleraux et al., 2005), and R2-2 has been tested in the context of other changes in DRV (Delino et al., 2018). R2-4 has been tested on the DRV backbone (GRL-98065) that was further modified at the P1-equivalent position to give the inhibitor brecanavir (Amano et al., 2007; Hazen et al., 2007). The R2-5 P2'-equivalent has previously been shown to have interactions with PR amino acids 29/30 in the S2' subsite (Bulut et al., 2020). The DRV backbone has also been used as the basis for more extensive chemical changes giving rise to a number of potent HIV-1 protease inhibitors (Aoki et al., 2017; Cihlar et al., 2006; Ghosh et al., 2008; Ghosh et al., 1998; Ghosh et al., 2020; Ghosh et al., 2018a; Ghosh et al., 2006; Ghosh et al., 2018b; Ghosh et al., 2017; Hazen et al., 2007; Miller et al., 2006; Miller et al., 2005; Miller et al., 2004; Nalam et al., 2013; Rusere et al., 2019).

DRV represents a potent antiviral due in large part to its ability to achieve drug levels in the blood that are far above the EC50 and the fact that high-level resistance requires many mutations (a high genetic barrier). However, we have shown that resistance usually follows the less deleterious I84V pathway. The UMASS-2 inhibitor anchors the R2-2 group to the protease backbone and allows the intermediate-sized R1-1 group to co-select both the I84V and I50V pathways, creating an even higher genetic bar. These improvements could become important if protease inhibitors are moved to dosing using a long acting depot where drug mass becomes limiting. The enhanced potency of UMASS-2 could potentially be traded for reduced maximum drug levels allowing a given mass of drug to provide antiviral coverage for a longer period of time.

Materials and methods

Cell lines and viruses

Request a detailed protocol

CEMx174 cells were maintained in RPMI 1640 medium with 10% fetal calf serum and penicillin-streptomycin. TZM-bl and 293T cells were maintained in Dulbecco’s modified Eagle-H medium supplemented with 10% fetal calf serum and penicillin-streptomycin. A wild-type virus stock NL4-3 was prepared by transfection of the pNL4-3 plasmid (purified using the QIAGEN Plasmid Maxikit) into HeLa cells. For the mixture of isogenic mutant viruses, the following NL4-3 variants were created, each with a single mutation in the protease with this mixture forming the virus pool for the initiation of selection with mutant viruses: L10I, K20R, K20I, L24I, D30N, V32I, M36I, M46I, M46L, I47V, G48V, F53L, I54V, I62V, L63P, A71T, A71V, G73S, V77I, V82A, V82T, I84V, N88D, N88S, L90M, I93L (Henderson et al., 2012). The cell lines used in this study came from the NIH HIV Reagent Program or ATCC, and were initially expanded and frozen down. Cells from these low-passage frozen stocks were thawed and used in experiments, and typically replaced within 1 y. Authentication of cell lines was based on assessments of the providers. Cell phenotypes were monitored for their ability to generate virus after transfection (293Ts), support viral replication (CEMx174), or report viral infection (TZM-bl). The following cell lines were obtained through the NIH HIV Reagent Program, Division of AIDS, NIAID, NIH: 174xCEM Cells, ARP-272, contributed by Dr. Peter Cresswell and TZM-bl Cells, ARP-8129, contributed by Dr. John C. Kappes, Dr. Xiaoyun Wu and Tranzyme Inc. The HEK-293Ts cells were obtained from ATCC, CRL-11268, and used directly upon receipt to created an expanded stock. Certificates of analysis are available on ATCC to include HEK293Ts. Cells obtained from the frozen stock were periodically tested for mycoplasma and were consistently negative.

Selections

Request a detailed protocol

An aliquot of 3 × 106 CEMx174 cells was incubated at 37°C for 2–3 hr with 250 µl of a virus stock generated from the HIV-1 infectious molecular clone pNL4-3. The culture volume was then brought to 10 ml with RPMI medium. Each flask received one of the following inhibitors at escalating concentrations: UMass1, UMass2, UMass3, UMass4, UMass5, UMass6, UMass7, UMass8, UMass9, UMass10, DRV, and no drug (ND). After 48 hr and every 48 hr after, the cells were pelleted by centrifugation and 10 ml of fresh medium and inhibitors were added. When the culture had undergone extensive cytopathic effect (CPE) indicative of viral replication, the supernatant medium and the cells were harvested separately and stored at –80°C. The virus-containing supernatant was used to start the next round of infection, and after several rounds at the initial concentration, the inhibitor concentration was increased 1.5-fold at each subsequent round of virus passage. The level of resistance (50% inhibitory concentration [EC50]) of the single inhibitor-selected virus pools was determined by a TZM infection assay in which the PI is added to productively infected cells and the titers of supernatant virus made in the presence of the inhibitor are determined.

TZM infection assay

Request a detailed protocol

PI dilutions were prepared by taking 10 µM stocked and performing a fivefold serial dilution using RPMI media (final drug concentration is 100 µM). One dilution of drug was added to each well of a 24-well plate and repeated so each virus would have a full set of dilutions. Viruses for the assay were made by seeding 3 × 106 CEM cells in a 24-well plate and incubating with 250 µl of virus at 37°C for 2–3 hr before bringing the culture to 10 ml with RPMI media. After 48 hr, the medium was changed and repeated every 48 hr after until the culture had undergone CPE. Infected CEM cells were collected and diluted so that 1 ml of cells could be plated in each well containing a unique drug dilution. Then 24 hr later the virus supernatant was collected from each well followed by filtering through a 0.45 µM filter then placed in –80°C. Viruses were thawed and added to 96-well plates in triplicate. TZM-bl cells were collected and diluted to a concentration of 2 × 105 cells/ml, 100 µl were added on top of the pre-plated viruses. Plates were kept in 37°C, 5% CO2 in an incubator for 48 hr. After 48 hr, the cells in the plates were lysed by removing the medium, washing two times with 100 µl PBS, and then lysed with 1× lysis buffer (made from 5× Promega Firefly Lysis Buffer). Plates were frozen for at least 24 hr and then thawed for 2 hr before analyzing with Promega Firefly Luciferase Kit on a luminometer. Data was analyzed with Prism 7 to fit sigmoidal dose–response curves.

DNA preparation and amplification of the protease-coding region

Request a detailed protocol

Total cellular DNA was isolated from infected cell pellets by using the QIAamp blood kit (QIAGEN). The protease-coding domain of viral DNA was amplified by nested PCR. The PCR conditions are available upon request. PCR products were purified by using QIAquick PCR purification kit (QIAGEN) and directly sequenced or cloned into the pT7Blue vector (Novagen) and sequenced.

Primer-ID deep sequencing of viral RNA

Request a detailed protocol

We used the PID protocol to prepare MiSeq PID libraries with multiplexed primers. Viral RNA was extracted from plasma samples using the QIAamp viral RNA mini kit (QIAGEN, Hilden, Germany). Complementary DNA (cDNA) was synthesized using a cDNA primer mixture targeting protease (PR) with a block of random nucleotides in each cDNA primer serving as the PID, and SuperScript III RT (Thermo Fisher). After two rounds of bead purification of the cDNA, we amplified the cDNA using a mixture of a forward primer that targeted the upstream coding region, followed by a second round of PCR to incorporate the Illumina adaptor sequences. Gel-purified libraries were pooled and sequenced using the MiSeq 300 base paired-end sequencing protocol (Illumina). The sequencing covered the HIV-1 PR region (HXB2 2648-2914, 3001-3257).

We used the Illumina bcl2fastq pipeline for the initial processing and constructed template consensus sequences (TCSs) with TCS pipeline version 1.33 (https://github.com/SwanstromLab/PID) (Zhou, 2019). We then aligned TCSs to an HXB2 reference to remove sequences not at the targeted region or that had large deletions. We used the Entropy tool at LANL (https://www.hiv.lanl.gov/content/sequence/ENTROPY/entropy.html) to calculate entropy for each specimen. The sequencing data is available at NIH Sequencing Read Archive (SRA) under BioProject ID PRJNA853351.

Protease expression and purification

Request a detailed protocol

The highly mutated, resistant, protease variant genes were purchased on a pET11a plasmid with codon optimization for protein expression in Escherichia coli (Genewiz). A Q7K mutation was included to prevent autoproteolysis (Rosé et al., 1993). The expression, isolation, and purification of WT and mutant HIV-1 proteases used for enzymatic assays were carried out as previously described (Henes et al., 2019; King et al., 2002; Özen et al., 2014). Briefly, the gene encoding the desired HIV-1 protease was subcloned into the heat-inducible pXC35 expression vector (ATCC) and transformed into E. coli TAP-106 cells. Cells grown in 6 l of Terrific Broth were lysed with a cell disruptor twice, and the protein was purified from inclusion bodies (Hui et al., 1993). Inclusion bodies, isolated as a pellet after centrifugation, were dissolved in 50% acetic acid followed by another round of centrifugation at 19,000 rpm for 30 min to remove insoluble impurities. Size-exclusion chromatography was carried out on a 2.1 l Sephadex G-75 Superfine (Sigma Chemical) column equilibrated with 50% acetic acid to separate high molecular weight proteins from the desired protease. Pure fractions of HIV-1 protease were refolded using a tenfold dilution of refolding buffer (0.05 M sodium acetate at pH 5.5, 5% ethylene glycol, 10% glycerol, and 5 µM DTT). Folded protein was concentrated to 0.5–3 mg/ml and stored. The stored protease was used in KM and Ki assays.

Enzymatic assays

Km assay

Request a detailed protocol

Km values were determined as previously described (Henes et al., 2019; Lockbaum et al., 2019; Matayoshi et al., 1990; Windsor and Raines, 2015). Briefly, a 10-amino acid substrate containing the natural MA/CA cleavage site with an EDANS/DABCYL FRET pair was dissolved in 8% DMSO at 40 nM and 6% DMSO at 30 nM. The 30 nM substrate was 4/5 serially diluted from 30 nM to 6 nM. HIV-1 protease was diluted to 120 nM and, and 5 µl were added to the 96-well plate to obtain a final concentration of 10 nM. Fluorescence was observed using a PerkinElmer Envision plate reader with an excitation at 340 nm and emission at 492 nm, and monitored for 200 counts. A FRET inner filter effect correction was applied as previously described (Liu et al., 1999). Data corrected for the inner filter effect was analyzed with Prism7.

Ki assay

Request a detailed protocol

Enzyme inhibition constants (Ki values) were determined as previously described (Henes et al., 2019; Lockbaum et al., 2019; Matayoshi et al., 1990; Windsor and Raines, 2015). Briefly, in a 96-well plate, inhibitors were serially diluted down from 2000 to 10,000 nM depending on protease resistance. All samples were incubated with 5 nM protein for 1 hr. A 10-amino acid substrate containing an optimized protease cleavage site (Windsor and Raines, 2015), purchased from Bachem, with an EDANS/DABCYL FRET pair was dissolved in 4% DMSO at 120 µM. Using a PerkinElmer Envision plate reader, 5 µl of the 120 µM substrate were added to the 96-well plate to a final concentration of 10 µM. Fluorescence was observed with an excitation at 340 nm and emission at 492 nm and monitored for 200 counts. Data was analyzed with Prism7.

Data availability

The sequencing data (Figure 2, 3, and 4) is available at NIH Sequencing Read Archive (SRA) under BioProject ID PRJNA853351.All source data files for enzymatic Ki and Km (Table 1, Figure 2 and 5) have been uploaded to the Carolina Digital Repository: Swanstrom, Ron, and Ean Spielvogel. Km and Ki Dataset for Selection of Hiv-1 for Resistance to Fifth Generation Protease Inhibitors Reveals Two Independent Pathways to High-level Resistance. 2022.All source data files for EC50 inhibition curves (Figure 2 and 6) have been uploaded to the Carolina Digital Repository: Swanstrom, Ron, and Ean Spielvogel. Ec50 Dataset for Selection of Hiv-1 for Resistance to Fifth Generation Protease Inhibitors Reveals Two Independent Pathways to High-level Resistance. 2022.

The following data sets were generated
    1. Henes M
    2. Kosovrasti K
    3. Nalivaika EA
    4. Kurt Yilmaz N
    5. Schiffer CA
    (2022) Carolina Digital Repository
    Km and Ki Dataset for Selection of HIV-1 for Resistance to Fifth Generation Protease Inhibitors Reveals Two Independent Pathways to High-Level Resistance.
    https://doi.org/10.17615/rb7r-tx62
    1. Spielvogel E
    2. Sondgeroth A
    3. Swanstrom R
    (2022) Carolina Digital Repository
    Display all details of EC50 Dataset for Selection of HIV-1 for Resistance to Fifth Generation Protease Inhibitors Reveals Two Independent Pathways to High-Level Resistance.
    https://doi.org/10.17615/3697-3v58
    1. Spielvogel E
    2. Zhou S
    3. Swanstrom R
    (2022) NCBI BioProject
    ID PRJNA853351. Sequencing Data for: Selection of HIV-1 for Resistance to Fifth Generation Protease Inhibitors Reveals Two Independent Pathways to High-Level Resistance.

References

    1. Pellegrin I
    2. Wittkop L
    3. Joubert LM
    4. Neau D
    5. Bollens D
    6. Bonarek M
    7. Girard P-M
    8. Fleury H
    9. Winters B
    10. Saux M-C
    11. Pellegrin J-L
    12. Thiébaut R
    13. Breilh D
    14. ANRS Co3 Aquitaine Cohort
    (2008)
    Virological response to darunavir/ritonavir-based regimens in antiretroviral-experienced patients (PREDIZISTA study)
    Antiviral Therapy 13:271–279.
    1. Rosé JR
    2. Salto R
    3. Craik CS
    (1993)
    Regulation of autoproteolysis of the HIV-1 and HIV-2 proteases with engineered amino acid substitutions
    The Journal of Biological Chemistry 268:11939–11945.

Decision letter

  1. Owen Pornillos
    Reviewing Editor; University of Virginia, United States
  2. Sara L Sawyer
    Senior Editor; University of Colorado Boulder, United States
  3. Dmitry Lyumkis
    Reviewer; Salk Institute for Biological Studies, United States

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Selection of HIV-1 for Resistance to Fifth Generation Protease Inhibitors Reveals Two Independent Pathways to High-Level Resistance" for consideration by eLife.

Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor Owen Pornillos and Sara Sawyer as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Dmitry Lyumkis (Reviewer #1).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

For publication in eLife, the paper needs to be revised in order to improve readability (for both experts and non-experts) and to address a number of issues brought up by the Reviewers.

1) Readability. (a) Please include in the introduction a brief "guide" to the P1', P2' positions. (b) Please utilize the eLife system of figures and figure supplements (see published examples) to consolidate the main figures and supplementary figures. (c) Presentation of selection data in e.g., Figures1, 3, and S4 need to be improved, for example using a mutational "tree" diagram that illustrates relationships between the primary and accessory mutations.

2) Discussion and data interpretation. (a) The Reviewers are all in agreement that there is more to be said about the accessory variants, how these arise, and how these could contribute to resistance, beyond describing their linkages to the primary mutations. This is key to the paper, but unfortunately, the data are presented in a way that limits comprehension, even by experts. Please determine a more effective way to illustrate (perhaps something similar to an evolutionary tree) and discuss the concepts of key pathways (I84V and I50V), their establishment during selection, and the further addition of and roles of accessory mutations. (b) Address how insights from this study explain the clinical utility and limitation of DRV. (c) Going beyond DRV, discuss design strategies for minimizing drug resistance, including limits on absolute sizes of R1 and R2. (d) Address the potential roles of cleavage site mutations in Gag.

Reviewer #1 (Recommendations for the authors):

My comment would be to provide some broader set of rules through which resistance develops, starting with the WT, beyond the 184V vs 150V variants themselves and the pathway switch mediated by modifications at the P1' region, in particular for the accessory mutations. The authors have generated an extensive amount of data from the passaging experiments, using multiple different drugs. Is it possible to devise some conglomerate ordering for resistance onset? Or place the accessory mutations into a broader context? For example, looking at the data in Figure 2, it appears that V82I and M46I somewhat consistently emerge in the initial or mid-stages of the passaging experiments, and they're almost universally present in Figure 1. I47V and M71V, which are also nearly universally present in Figure 1, appear to be more variable in Figure 2, occasionally arising toward the end of the passaging experiment. Are there any other patterns that emerge?

Reviewer #2 (Recommendations for the authors):

– The authors report an extended analysis of the development of resistance mutations in HIV protease sequence variation during long-term treatment with darunavir. Long–term drug therapy may also select for mutations in cleavage sites in gag/pol where protease cleaves. Have the authors investigated the presence of cleavage site mutations in the gag sequence using single genome amplification (SGA) or the like? Are all of the changes responsible for resistance attributable to changes in protease only?

– In Figure 1A, the authors use NGS with primer ID to quantify the abundance of drug-resistant variants. These studies are well done and represent the gold standard for measuring levels of different variants. As they show, the highly resistant variants may predominate but other variants are present as well, in some cases seems like these other variants are at substantial levels, and additional discussion of these data would be useful. For instance, in studies of UMASS–2, the most abundant variant is only 37% of the total; this observation seems consistent with the data in figure 2A, where the authors find relatively high Shannon entropy for UMASS-2, reflecting considerable genetic diversity. In contrast, in studies of UMASS-8, where the most abundant variant is also low (only 55.4%), figure 2A shows a relatively low shannon entropy. Were there other closely related variants without DRMS that would explain both the relatively low abundance of resistant variants, but also explain the low entropy levels? Additionally, in these long–term experiments, were there any WT variants present after prolonged drug exposure?

Elsewhere, the authors suggest the presence of widespread recombination in these cultures – might recombination also explain some of the variations in Figure 1A? Considering several log decreases in infectivity demonstrated in Figure 1 D, is recombination contributing to ongoing transmission during drug exposure?

– Figure 2A – Regarding the Shannon entropy determinations – were parallel studies done with darunavir? if so, how do those shannon entropy profiles compare with those measured for UMASS compounds and can they be included in the figure?

– The authors have important new insights in their structural studies – can they place boundaries on the absolute sizes of R1 or R2 for future drug design?

– Previously these authors have provided useful models of genetic variation in vitro (Boucher et al., 2019), including estimating selection coefficients. Can the authors discuss their new data in light of these modeling studies and identify selection coefficients for these variants?

– Do the data from these studies provide insight to explain the clinical utility and limitations of Darunavir as an initial or simplification therapy? For instance:

a) the two–drug combination of Darunavir and Raltegravir is highly effective in suppressing HIV, but this combination is not recommended in individuals with HIV viral loads exceeding 100,000 copies/ml plasma, because of the increased risk for the development of resistance.

b) individuals with baseline protease resistance simplifying their regimens to darunavir+raltegravir did not result in increased frequency of protease resistance (Vizcarra et al., 2019).

– Can you clarify the tinting in Figure S2?

As written:

"UMASS-1, -3, -6, -8 show less potency towards these variants and is shaded in a red tint matching R2 groups. Are the tints supposed to be the same for R2 groups?"

But in my version:

UMASS 1 and 3 have an identical tint, but different R2 groups, and

UMASS 6, and 8 have identical tint but different R2 groups.

In table 1

UMASS-1 and -6 have identical R2 groups

UMASS-3 and -8 have identical R2 groups

also

UMASS-2 and 5 have identical tint but different R2 groups

UMASS-7 and 10 have identical tint, but different R2 groups

UMSS 10 and 5 have identical R2 groups

UMASS 2 and 7 have identical R2 groups

It's not clear whether this could be an ADOBE issue or something else is meant by this text

and figure.

–Line 474 "demonstrate" should be "demonstrated".

Reviewer #3 (Recommendations for the authors):

The manuscript is comprehensive, well–written, and informative, yet dense and with some figures that readers may not find informative. Comments are discussed below.

1. Perhaps it was just me, but I found that figure layout and organization difficult to get the necessary take–home points. In table 1, it might be helpful for the reader to have viral EC50s for the initial mutations, as shown, and the endstage predominates viral populations as well. I am aware that in some cases the final population is a viral mixture of two genotypes and this should be noted.

2. Given that a major focus of the report is mutational selections starting from wildtype from DRV and UMASS compound series to the ending of the selection. The information for the viral mutations arising over time and their respective abundance over time ( = increased drug conc.) is scattered (Figure 1, Sup Figure 4) throughout the Figures. Readers might appreciate a timeline figure documenting when the first mutations arose and their genotypic abundance over time. The primer ID strategy should allow relative quantification of sequences. This is brought up to give the interesting mutational, and somewhat distinct, population pathways in some of the compound selections. Getting at the question of the history of the mutational series is informative. Perhaps a mutational tree?

Figure 1 In the case when both I84V and I50V mutational series are present for (say UMASS–2) do the mutational series occur at the same time? The abundance is for the predominant genotype, what makes up the rest of the genotypic abundance? The data in Figure 2 (and Figure S6) helps to get to this point, but perhaps a single selection figure might be more informative.

3. The authors make the point of the differences in the fitness competency of the I84V and I50V viruses. Were the end viral products of these 2 mutational pathways compared in a head–to–head replication assay in the reported study? This should be addressed.

4. The authors discuss the potential importance of accessory mutations in the I84V and I50V mutational pathway series. Perhaps I missed it, did the authors directly address the contributions of the accessory mutations to resistance, enzymatic function, and improved replication? This should be addressed.

5. The authors point out that given the reported viral fitness loss with the I50V mutation, vs I84V mutation, mutational selection should be skewed towards I50V. Based on the authors' findings UMASS compounds 3, 6–10 appear to favor I50V mutation over I84V, and the I50V accessory series selection. Yet, UMASS 2, 5, 7, and 10 appear the most potent (Table 1 and Figure S2) with resistant viruses. (1). How efficient are the accessory mutations in the I50V mutation series in restoring replication? UMASS compounds 2 and 5 have both I50V and I84V (and accessory mutations) populations, yet the compounds remain potent. Please discuss in the context of next–generation inhibitors.

6. Given the reports of Gag compensatory sites improving viral replication, is there any hint that either of the 2 pathways uses not protease mutational pathways to restore viral fitness? Please discuss.

7. The authors propose using DRV as a template to improve its efficacy. A recent report by Bulut et al. PMID: 32606378 also derives new compounds from DRV. Please discuss.

https://doi.org/10.7554/eLife.80328.sa1

Author response

Essential revisions:

For publication in eLife, the paper needs to be revised in order to improve readability (for both experts and non-experts) and to address a number of issues brought up by the Reviewers.

1) Readability.

(a) Please include in the introduction a brief "guide" to the P1', P2' positions.

This has been done in the Introduction with a new figure of the structure of darunavir, pointing out the equivalent of each position (P2P2'), the corresponding subsites, and the R1 and R2 ligand positions. Text in the figure legend addresses all of these points of nomenclature at the beginning.

(b) Please utilize the eLife system of figures and figure supplements (see published examples) to consolidate the main figures and supplementary figures.

Done.

(c) Presentation of selection data in e.g., Figures1, 3, and S4 need to be improved, for example using a mutational "tree" diagram that illustrates relationships between the primary and accessory mutations.

For Figure 1 (now Figure 2) we have added the wild type amino acid sequence at the positions relevant to each mutation. This has also been done for Figure S4. Figures 1 (now 2) and S4 are meant to show the endpoint selection and not the intermediate time points so they are otherwise unchanged.

We have added a second panel to Figure 3 (now Figure 4B) that shows the idealized timeline of mutation addition for each pathway. We have also added a supplemental figure to show a "mutation tree" for one selection that used the I84V pathway and one that used the I50V pathway. We have not done this for all of the selections individually as they sometimes fix multiple mutations at one time making the trees not very informative. We would prefer to keep the terminal culture composition and the pathways as two separate discussions as each is complicated. We have reviewed the text and tried to provide small changes to give the reviewer guidance on these points.

2) Discussion and data interpretation.

(a) The Reviewers are all in agreement that there is more to be said about the accessory variants, how these arise, and how these could contribute to resistance, beyond describing their linkages to the primary mutations. This is key to the paper, but unfortunately, the data are presented in a way that limits comprehension, even by experts. Please determine a more effective way to illustrate (perhaps something similar to an evolutionary tree) and discuss the concepts of key pathways (I84V and I50V), their establishment during selection, and the further addition of and roles of accessory mutations.

As noted above, we have provided a more accessible visual of when the accessory mutations are added in the new Figure 4B. This figure also separates the mutations that are linked to either I50V or I84V as opposed to those that are common. We now note that the linked mutations are close spatially to the primary mutation while the shared mutations are on average more distal.

We have also added text on this topic based on Dr. Schiffer's published work with several of the sequences identified in these selections (two paragraphs starting at line 689). However, a comprehensive discussion of accessory mutations is a job for a review article. These have been studied using modeling, molecular dynamics, biophysical, biochemical, genetic, and virologic approaches in a myriad of sequence combinations. We agree that understanding accessory mutations is of intrinsic interest in terms of protein function and important in terms of understanding mechanisms of resistance. Our work has clarified the relationship of some of the accessory mutations with the two specific pathways but to go beyond that in a way that would represent a rigorous discussion is beyond the scope of this publication. We appreciate the interest in this subject but we also hope everyone understands the complexity where each accessory protein may work by a different mechanism with common and specific accessory mutations and with larger or smaller effects even when shared between pathways.

(b) Address how insights from this study explain the clinical utility and limitation of DRV.

This point is now covered in the last paragraph of the text where we discuss issues of drug potency and the potential to trade enhanced potency for lower drug levels that could allow dosing in a depot format to allow a protease inhibitor to be used in a long acting formula.

(c) Going beyond DRV, discuss design strategies for minimizing drug resistance, including limits on absolute sizes of R1 and R2.

Our additional analysis, prompted by the instruction not to use "data not shown", has led to an insight we had not appreciated that addresses this question (issues that Reviewer 3 thoughtfully raised). We did not detect new/unique mutations based on the R2 group. However, we (finally) noticed that two of the R2 groups (2 and 5) both reversed the potency associated with size in the R1 group (R1-1 was more potent than R1-2) and with the R1-1 group co-selected for both primary resistance mutations (I50V and I84V). We initially thought this co-selection was a stochastic feature of the data but now see that it is linked both to the most potent inhibitors and to the reversal of R1 potency providing strong evidence of its significance even if the number of observations is low (N=3). Thus it is not the size that appears to be significant but the anchoring of R2 to the protease backbone which enhances the potency of the intermediate-sized R1-1. These points have now been added as a major conclusion in the paper and we are a bit embarrassed we did not see this connection before the first submission but are grateful to have been prompted to do so before publication. We now see that the intermediate size of R1 (bigger than DRV and smaller than R1-2) is optimal when anchoring R2 to the protease backbone. This seems like a good place to leave the interpretation of our data without further speculation.

(d) Address the potential roles of cleavage site mutations in Gag.

We have added text at the end of the Results covering this topic. There is also a new figure covering both the data and the structural interpretation (Figure 7). This analysis had been done but had been cut from the submitted manuscript to save length.

Reviewer #1 (Recommendations for the authors):

My comment would be to provide some broader set of rules through which resistance develops, starting with the WT, beyond the 184V vs 150V variants themselves and the pathway switch mediated by modifications at the P1' region, in particular for the accessory mutations. The authors have generated an extensive amount of data from the passaging experiments, using multiple different drugs. Is it possible to devise some conglomerate ordering for resistance onset? Or place the accessory mutations into a broader context? For example, looking at the data in Figure 2, it appears that V82I and M46I somewhat consistently emerge in the initial or mid-stages of the passaging experiments, and they're almost universally present in Figure 1. I47V and M71V, which are also nearly universally present in Figure 1, appear to be more variable in Figure 2, occasionally arising toward the end of the passaging experiment. Are there any other patterns that emerge?

The reviewer has raised an important point and one we want very much to convey. Since we apparently did not do a good job of that we are thankful to the reviewer for raising this issue. We had designed Figure 3 (now Figure 4A) to show this point since it shows both the timing of appearance and the frequency with which each mutation appears when normalized by drug concentration across the different selections; clearly this was not an effective way to convey this information. We have retained this figure since it represents the interpretation of the primary data but we now reproduce this information in two ways. First, there is now a Figure 4B where we present this same information as a timeline with the mutations shown as sequentially entering with arrows, with the size of the arrow representing the frequency with which this happens among the different selections. Also, the figure is drawn to easily distinguish the common versus the unique mutations in the two pathways. Second, in the new supplemental figure linked to Figure 4 we show a phylogenetic tree following the accumulations of mutations in a culture that followed each pathway. While we don't feel this is the most accessible way to present the data we have included it since this is an analytical approach that is familiar to most readers.

Reviewer #2 (Recommendations for the authors):

– The authors report an extended analysis of the development of resistance mutations in HIV protease sequence variation during long-term treatment with darunavir. Long–term drug therapy may also select for mutations in cleavage sites in gag/pol where protease cleaves. Have the authors investigated the presence of cleavage site mutations in the gag sequence using single genome amplification (SGA) or the like? Are all of the changes responsible for resistance attributable to changes in protease only?

We had minimized our data on this point in the initial submission given the length requirements. We now present our data on cleavage sites at the end of the Results (Figure 7). Because we did so many selections we have robust genetic data on this point and are able to link protease mutations to cleavage site mutations with models of structures to interpret the results. In addition, we recently "decoded" the sequence preferences of protease cleavage sites and can also interpret both the wild type and mutant sequences in that context.

– In Figure 1A, the authors use NGS with primer ID to quantify the abundance of drug-resistant variants. These studies are well done and represent the gold standard for measuring levels of different variants. As they show, the highly resistant variants may predominate but other variants are present as well, in some cases seems like these other variants are at substantial levels, and additional discussion of these data would be useful. For instance, in studies of UMASS–2, the most abundant variant is only 37% of the total; this observation seems consistent with the data in figure 2A, where the authors find relatively high Shannon entropy for UMASS-2, reflecting considerable genetic diversity. In contrast, in studies of UMASS-8, where the most abundant variant is also low (only 55.4%), figure 2A shows a relatively low shannon entropy. Were there other closely related variants without DRMS that would explain both the relatively low abundance of resistant variants, but also explain the low entropy levels? Additionally, in these long–term experiments, were there any WT variants present after prolonged drug exposure?

We appreciate that the reviewer delved into the large amount of data associated with these selections to bring out this point. First, we did not detect any wild type sequence at the end of the selections. Second, there is no simple "this is it" answer to questions about all specific details of the selections. Each selection is different with what are probably some stochastic bottlenecks that are carried forward. In terms of the differences in Shannon Entropy that likely resides in the fact that the top two variants in the UMASS-2 selection (starting with wild type) add to only 71% of the population while for the UMASS-8 selection (starting with wild type) the top two variants account for 97% of the population. It is possible that some of the variability at the end could have been further reduced by several more passages at the highest inhibitor concentration. We assume that the presence of several variants at significant levels represent variants with similar fitness and resistance levels such that longer passage in the presence of drug is needed to allow one of the otherwise similar variants to outgrow the other. The minor variants always have the core mutations that define the pathway and differ by minor mutations that are probably present for a variety of reasons. While some of these minor mutations can sometimes also be present as fixed mutations (such as I13V), other mutations may be deleterious such as the R87K mutation within the highly conserved GRN α helix motif; this mutation likely occurs through APOBEC3G mutagenesis and is carried in the population at a very low level but contributes to low level variability. It is difficult to discuss the data at this level within the text so we are happy that an especially interested reader will be able to read about these points in the published review/response format.

Elsewhere, the authors suggest the presence of widespread recombination in these cultures – might recombination also explain some of the variations in Figure 1A? Considering several log decreases in infectivity demonstrated in Figure 1 D, is recombination contributing to ongoing transmission during drug exposure?

We are unclear as to what the reviewer is asking with this question. We assume recombination is always available to the virus under our culture conditions where all cells are ultimately killed, giving the opportunity for dual infection and packaging of heterodimeric RNA (the precursor to recombination during the subsequent infection). Although we have not measured recombination, we assume it is one of the mechanisms at work in generating genetic diversity on which the drug selective pressure is working.

– Figure 2A – Regarding the Shannon entropy determinations – were parallel studies done with darunavir? if so, how do those shannon entropy profiles compare with those measured for UMASS compounds and can they be included in the figure?

The same deep sequencing was done for the DRV selections so the same information on Shannon entropy is available. This information is now included a supplemental figure.

– The authors have important new insights in their structural studies – can they place boundaries on the absolute sizes of R1 or R2 for future drug design?

We appreciate this comment as it caused us to look again at our data with an eye toward "what's next." However, in doing this we noted that two of the R2 groups reversed the order of potency of the R1 groups and changed the selection pathway to co-select both primary resistance mutations. Thus the anchoring of the R2 group to the protease backbone appears to create an optimal intermediate-sized R1 group. Thus size is not the issue but specific interactions. Since R1-1 and R2-2 appear to interact in an unexpected way it is hard to speculate on how to further improve this outcome.

– Previously these authors have provided useful models of genetic variation in vitro (Boucher et al., 2019), including estimating selection coefficients. Can the authors discuss their new data in light of these modeling studies and identify selection coefficients for these variants?

We appreciate the reviewer challenging us to think more broadly about this question. However, in trying to think through this specific example it seems problematic. In Boucher et al., the metric of fitness was always relative to wild type. In our studies we are continuously changing target fitness sequence with increasing drug concentration. Furthermore, it is likely that we could passage virus forever at 100 nM drug concentration and the virus would settle in a pattern of sequence changes that would be different (i.e. fewer mutations) than present in virus taken to 1 uM. We capture the root of this idea in a qualitative way in the new Figure 4B but the approach that we used in virus passage and the variable way we raised drug concentrations in the many different cultures precludes a meaningful mathematical analysis.

– Do the data from these studies provide insight to explain the clinical utility and limitations of Darunavir as an initial or simplification therapy? For instance:

a) the two–drug combination of Darunavir and Raltegravir is highly effective in suppressing HIV, but this combination is not recommended in individuals with HIV viral loads exceeding 100,000 copies/ml plasma, because of the increased risk for the development of resistance.

b) individuals with baseline protease resistance simplifying their regimens to darunavir+raltegravir did not result in increased frequency of protease resistance (Vizcarra et al., 2019).

Our work does not speak directly to these points. There has long been an unexplained relationship between reduced drug efficacy and high viral load. It seems likely that this has to do with the population size of the virus in a person, with the larger populations having a higher probability of generating resistance. Therapy failure without resistance is a complex issue. Often this is the result of poor adherence. However, resistance to one of the drugs in a regimen may allow breakthrough replication in the face of the other drug without resistance (at least initially). Our poor understanding of these issues is an obstacle in interpreting data from drug simplification clinical studies.

– Can you clarify the tinting in Figure S2?

As written:

"UMASS-1, -3, -6, -8 show less potency towards these variants and is shaded in a red tint matching R2 groups. Are the tints supposed to be the same for R2 groups?"

But in my version:

UMASS 1 and 3 have an identical tint, but different R2 groups, and

UMASS 6, and 8 have identical tint but different R2 groups.

In table 1

UMASS-1 and -6 have identical R2 groups

UMASS-3 and -8 have identical R2 groups

also

UMASS-2 and 5 have identical tint but different R2 groups

UMASS-7 and 10 have identical tint, but different R2 groups

UMSS 10 and 5 have identical R2 groups

UMASS 2 and 7 have identical R2 groups

It's not clear whether this could be an ADOBE issue or something else is meant by this text

and figure.

Sorry, we were not very coherent in explaining this figure. The red tinted data points/inhibitors behave differently from the green tinted ones. They come in pairs based on the R1 group (-1 and -6) etc., but two pairs are tinted red and two pairs green because the pairs act similarly by color. We have tried to clarify this language.

–Line 474 "demonstrate" should be "demonstrated".

Corrected.

Reviewer #3 (Recommendations for the authors):

The manuscript is comprehensive, well–written, and informative, yet dense and with some figures that readers may not find informative. Comments are discussed below.

1. Perhaps it was just me, but I found that figure layout and organization difficult to get the necessary take–home points. In table 1, it might be helpful for the reader to have viral EC50s for the initial mutations, as shown, and the endstage predominates viral populations as well. I am aware that in some cases the final population is a viral mixture of two genotypes and this should be noted.

We have added the endpoint EC50s to Table 1 for those virus pools where it was tested.

2. Given that a major focus of the report is mutational selections starting from wildtype from DRV and UMASS compound series to the ending of the selection. The information for the viral mutations arising over time and their respective abundance over time ( = increased drug conc.) is scattered (Figure 1, Sup Figure 4) throughout the Figures. Readers might appreciate a timeline figure documenting when the first mutations arose and their genotypic abundance over time. The primer ID strategy should allow relative quantification of sequences. This is brought up to give the interesting mutational, and somewhat distinct, population pathways in some of the compound selections. Getting at the question of the history of the mutational series is informative. Perhaps a mutational tree?

As noted above we have now presented this information in two ways:

As an idealized selection timeline in Figure 3b (which we feel is more accessible) and as a "mutational tree" in a supplemental figure for a typical selection.

Figure 1 In the case when both I84V and I50V mutational series are present for (say UMASS–2) do the mutational series occur at the same time? The abundance is for the predominant genotype, what makes up the rest of the genotypic abundance? The data in Figure 2 (and Figure S6) helps to get to this point, but perhaps a single selection figure might be more informative.

The reviewer gets at an interesting point although the sample size for making conclusions is getting too small to speak with a lot of confidence. In the entropy figure (now Figure 3) it appears there are two answers to this question. When we started the culture with many mutants then I84V/V32I come in early and I50V comes in late (UMASS-2). However, when we start with wild type then I84V and I50V come in first together with I47V coming in before V32I, suggesting the I50V is driving the selection for the linked mutation (UMASS-2 and -5). While this is an interesting nuance in the data (and it is present for someone patient enough to go look for it), we feel it is based on too few selections to be able to make a strong conclusion.

3. The authors make the point of the differences in the fitness competency of the I84V and I50V viruses. Were the end viral products of these 2 mutational pathways compared in a head–to–head replication assay in the reported study? This should be addressed.

We have not done a head-to-head comparison. As noted, we have virus pools, not clones. While these pools are sufficient for getting an EC50 in short term replication, assessing fitness requires multiple rounds of replication and should be done with a clone. Realistically, this is another study to tear apart these pathways for the effect of individual mutations and differences in the pathways.

4. The authors discuss the potential importance of accessory mutations in the I84V and I50V mutational pathway series. Perhaps I missed it, did the authors directly address the contributions of the accessory mutations to resistance, enzymatic function, and improved replication? This should be addressed.

We have noted that to the extent there are mutations that are just present for "fitness" they do not restore fitness but rather attenuate the loss of fitness due to resistance (as seen in now Figure 2E). There is a large body of literature on accessory/compensatory mutations, not all of which is consistent and that has been generated using a number of not easily compared experimental approaches. As shown in Figure 4B we can now start to order their appearance and linkage. Dr. Schiffer has started to look at these mutations in biochemical, structural, and molecular dynamics studies and we now more clearly reference that work. However, it is beyond the scope of this study to add further clarity into the roles of these extra mutations.

5. The authors point out that given the reported viral fitness loss with the I50V mutation, vs I84V mutation, mutational selection should be skewed towards I50V. Based on the authors' findings UMASS compounds 3, 6–10 appear to favor I50V mutation over I84V, and the I50V accessory series selection. Yet, UMASS 2, 5, 7, and 10 appear the most potent (Table 1 and Figure S2) with resistant viruses. (1). How efficient are the accessory mutations in the I50V mutation series in restoring replication? UMASS compounds 2 and 5 have both I50V and I84V (and accessory mutations) populations, yet the compounds remain potent. Please discuss in the context of next–generation inhibitors.

The reviewer correctly identifies a feature of our data we had not adequately considered. We have added extensive discussion of this point as we believe UMASS-2 and -5 are more potent due to their R2 group and even reversing the size advantage of R1-2 over R1-1. This enhanced interpretation of our data is now an important part of the paper.

6. Given the reports of Gag compensatory sites improving viral replication, is there any hint that either of the 2 pathways uses not protease mutational pathways to restore viral fitness? Please discuss.

Yes, we had that data but left it out for length considerations. It is now included at the end of the Results section.

7. The authors propose using DRV as a template to improve its efficacy. A recent report by Bulut et al. PMID: 32606378 also derives new compounds from DRV. Please discuss.

We have included a discussion of this (and related) work near the end of the Discussion. It was an oversight on our part not to have included these relevant references.

https://doi.org/10.7554/eLife.80328.sa2

Article and author information

Author details

  1. Ean Spielvogel

    1. Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, United States
    2. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Data curation, Formal analysis, Validation, Investigation, Methodology, Writing - original draft
    Competing interests
    No competing interests declared
  2. Sook-Kyung Lee

    Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Data curation, Formal analysis, Validation, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  3. Shuntai Zhou

    Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Data curation, Formal analysis, Validation, Investigation
    Competing interests
    No competing interests declared
  4. Gordon J Lockbaum

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Present address
    Accent Therapeutics, Lexington, United States
    Contribution
    Data curation, Formal analysis, Validation, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  5. Mina Henes

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Present address
    Emory University School of Medicine, Atlanta, United States
    Contribution
    Data curation, Validation, Investigation, Methodology
    Competing interests
    No competing interests declared
  6. Amy Sondgeroth

    1. Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, United States
    2. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Data curation, Validation, Investigation, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  7. Klajdi Kosovrasti

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Data curation, Formal analysis, Validation, Investigation, Writing – review and editing
    Competing interests
    No competing interests declared
  8. Ellen A Nalivaika

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Resources, Data curation, Formal analysis, Investigation, Visualization, Methodology
    Competing interests
    No competing interests declared
  9. Akbar Ali

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Resources, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  10. Nese Kurt Yilmaz

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Resources, Data curation, Formal analysis, Supervision, Validation, Investigation, Visualization, Methodology, Writing – review and editing
    Competing interests
    No competing interests declared
  11. Celia A Schiffer

    Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Medical School, Worcester, United States
    Contribution
    Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Writing - original draft, Project administration
    For correspondence
    celia.schiffer@umassmed.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2270-6613
  12. Ronald Swanstrom

    1. Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, United States
    2. Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Funding acquisition, Writing - original draft, Project administration, Writing – review and editing
    For correspondence
    ron_swanstrom@med.unc.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7777-0773

Funding

National Institute of General Medical Sciences (1P01GM109767-01A`)

  • Ean Spielvogel
  • Celia A Schiffer
  • Gordon J Lockbaum
  • Shuntai Zhou
  • Mina Henes
  • Akbar Ali
  • Ellen A Nalivaika
  • Klajdi Kosovrasti
  • Nese Kurt Yilmaz
  • Ronald Swanstrom
  • Amy Sondgeroth
  • Sook-Kyung Lee

National Institute of General Medical Sciences (R01-GM135919)

  • Ean Spielvogel
  • Celia A Schiffer
  • Gordon J Lockbaum
  • Shuntai Zhou
  • Mina Henes
  • Akbar Ali
  • Ellen A Nalivaika
  • Klajdi Kosovrasti
  • Nese Kurt Yilmaz
  • Ronald Swanstrom
  • Amy Sondgeroth
  • Sook-Kyung Lee

NIAID (R01-AI140970)

  • Ean Spielvogel
  • Celia A Schiffer
  • Gordon J Lockbaum
  • Shuntai Zhou
  • Mina Henes
  • Akbar Ali
  • Ellen A Nalivaika
  • Klajdi Kosovrasti
  • Nese Kurt Yilmaz
  • Ronald Swanstrom
  • Amy Sondgeroth
  • Sook-Kyung Lee

UNC CFAR (P30-AI050410)

  • Ean Spielvogel
  • Celia A Schiffer
  • Gordon J Lockbaum
  • Shuntai Zhou
  • Mina Henes
  • Akbar Ali
  • Ellen A Nalivaika
  • Klajdi Kosovrasti
  • Nese Kurt Yilmaz
  • Ronald Swanstrom
  • Amy Sondgeroth
  • Sook-Kyung Lee

UNC Lineberger Comprehensive Cancer Center (P30-CA016086)

  • Ean Spielvogel
  • Celia A Schiffer
  • Gordon J Lockbaum
  • Shuntai Zhou
  • Mina Henes
  • Akbar Ali
  • Ellen A Nalivaika
  • Klajdi Kosovrasti
  • Nese Kurt Yilmaz
  • Ronald Swanstrom
  • Amy Sondgeroth
  • Sook-Kyung Lee

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was supported by grants from NIGMS (P01-GM109767, R01-GM135919) and NIAID (R01-AI140970) of the NIH. This research received infrastructure support from the UNC CFAR (P30-AI050410), and the UNC Lineberger Comprehensive Cancer Center (P30-CA016086). We also acknowledge receipt of reagents from the NIH HIV Reagent Program: TZM-bl cells (J Kappes and X Wu, contributors) and CEMx174 cells (P Cresswell, contributor). The support of the UNC High Throughput Sequencing Facility is also acknowledged.

Senior Editor

  1. Sara L Sawyer, University of Colorado Boulder, United States

Reviewing Editor

  1. Owen Pornillos, University of Virginia, United States

Reviewer

  1. Dmitry Lyumkis, Salk Institute for Biological Studies, United States

Version history

  1. Preprint posted: November 9, 2019 (view preprint)
  2. Received: May 16, 2022
  3. Accepted: March 14, 2023
  4. Accepted Manuscript published: March 15, 2023 (version 1)
  5. Version of Record published: March 31, 2023 (version 2)
  6. Version of Record updated: April 5, 2023 (version 3)

Copyright

© 2023, Spielvogel et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 693
    Page views
  • 98
    Downloads
  • 5
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Ean Spielvogel
  2. Sook-Kyung Lee
  3. Shuntai Zhou
  4. Gordon J Lockbaum
  5. Mina Henes
  6. Amy Sondgeroth
  7. Klajdi Kosovrasti
  8. Ellen A Nalivaika
  9. Akbar Ali
  10. Nese Kurt Yilmaz
  11. Celia A Schiffer
  12. Ronald Swanstrom
(2023)
Selection of HIV-1 for resistance to fifth-generation protease inhibitors reveals two independent pathways to high-level resistance
eLife 12:e80328.
https://doi.org/10.7554/eLife.80328

Share this article

https://doi.org/10.7554/eLife.80328