Introduction

Both the activity and mobility of proteins within the crowded cellular environment are profoundly influenced by interactions with their surroundings1–3. Under these conditions, where diffusion is no longer well described by the Stokes radius of the protein monomer4, changes in protein motion might be expected to correlate closely with changes in the activity of these proteins. The development of increasingly sophisticated live-cell microscopy techniques, including early ensemble methods like fluorescence recovery after photobleaching (FRAP) and fluctuation correlation spectroscopy (FCS), has informed our understanding of protein dynamics in cellular biology5. A myriad of technical improvements, such as enhanced labeling methods6,7, better live-cell compatible fluorophores8, new forms of light microscopy9, dramatic increases in computational power10, and the addition of machine learning approaches to data analysis10,11, have together enabled a new era of imaging-based studies across biological contexts3,5,12.

Applying any microscopy technique at scale presents challenges, but recent advances have shown the power of high content imaging techniques both to address mechanistic biological questions and to generate leads for new chemical matter in drug discovery13. Even these conceptually simple experiments involving fixed cells stained with well-characterized commercial reagents take careful experiment design and sophisticated computational approaches to execute14. It is no wonder, then, that attempts to combine high content imaging workflows with more advanced super-resolution microscopy methods have thus far been limited. Recent efforts have nonetheless enabled the development of systems for fixed-cell STORM imaging at an impressive 10,000 cells per day15, though the appropriate application of this increase in scale remains an open question.

In single molecule tracking (SMT), a fluorescently labeled protein of interest is imaged at high spatiotemporal resolution to track its motion in a live cell10. The information embedded in these trajectories has been used to investigate diverse cellular phenomena including protein oligomerization state and function15–17, inter-organelle communication18, nuclear organization19, and transcription regulation20–23. Of particular utility are “fast-SMT” approaches which use high frame rates and stroboscopic illumination to minimize motion-induced blurring, and hence can measure diffusive states over a large dynamic range24,25. Specifically, proteins that diffuse rapidly throughout the cell are often missed in alternative tracking approaches, biasing the resulting data. In spite of the potential biological discoveries that depend on the application of SMT on a large scale, SMT in general (and fast-SMT in particular) has not been adapted to a high-throughput setting that would enable the analysis of complex, multi-component systems, or the identification of compounds that affect protein motion.

Steroid hormone receptors (SHRs) are a class of transcription factors that play crucial roles in normal human development and in disease pathogenesis. SHRs like the estrogen receptor (ER; genes ESR1 and ESR2), androgen receptor (AR), and progesterone receptor (PR) contribute decisively to the acquisition of secondary sex characteristics, while the glucocorticoid receptor (GR) helps to orchestrate both metabolism and inflammation26. In their ligand-free state, SHRs are kept sequestered in multiprotein complexes by the chaperone HSP9027. Canonically, in the presence of hormone they dimerize and bind their cognate genomic response elements, recruiting epigenetic modifiers and transcription machinery26–28. At the same time, steroid hormone receptor-derived signals impose a large disease burden by promoting the growth of breast cancers (ER)26,29 or prostate cancers (AR)26, or by imposing immune and metabolic dysfunction (GR)26,30. SHRs therefore provide an excellent proof-of-concept system to study the relationship between protein dynamics and protein function, owing to the wealth of information and reagents29 already available for these systems as well as previous reports characterizing some aspects of their cellular dynamics20,22,31–33.

Here, we present the first industrial-scale, high-throughput, fast-SMT (htSMT) platform capable of measuring protein motion from more than 13,000 individual assay wells (>1,000,000 individual cells) per day. Using ER as a test system, we demonstrate that chemical screening using htSMT is specific, robust, and reproducible. The increase in throughput enables classical drug discovery activities, including compound library screening and the elucidation of structure-activity relationships (SAR), yielding accurate and reproducible results that are inaccessible or unmeasurable with other techniques or using SMT on a smaller scale. Importantly, we demonstrate that htSMT can be used to characterize both known and novel pathway contributions to the ER protein interaction network. More than a proof-of-concept for the htSMT platform, these data confirm that analysis of protein motion itself on a large scale reveals detailed information about pathway interactions and signaling.

Results

Creation and validation of a high throughput SMT platform

We developed a robotic system capable of handling reagents, collecting high-quality fast SMT image series, processing time-ordered raw images to yield molecular trajectories, and extracting features of biological interest within defined cellular compartments (Figure 1 – figure supplement 1A, B). Samples start as cells seeded into 384-well plates in a hotel incubator. A central robotic arm retrieves the plates and delivers them to an Echo 650 acoustic dispenser to add dye. After incubation, excess dye is washed away and the Echo 650 is used again to administer compound treatment. Stained and compound-treated plates are then delivered to any of up to four identical SMT microscopes for imaging. Both SMT and accompanying Hoechst images are collected and automatically processed to identify individual molecule positions, reconnect the spot coordinates into trajectories, and then associate each trajectory with a nuclear mask. Finally, the processed SMT data are subjected to quality control to omit aberrant fields of view using a convolutional neural network trained to identify technical errors in the images (Figure 1 – figure supplement 1C), and are then stored for downstream analysis.

To examine htSMT system performance across a broad spectrum of diffusion coefficients, we generated three U2OS cell lines ectopically expressing HaloTag6 fused proteins with well-established behaviors in the cell. These HaloTag fusions allow the subsequent addition of bright and photostable organic fluorophores like JF5498 which produce high signal spots to detect and track. Histone H2B-Halo, which is predominantly incorporated into chromatin and therefore effectively immobile over short timescales25, was employed to estimate localization error. A prenylation motif (Halo-CaaX) embedded in the plasma membrane exhibits moderate diffusion34. Unfused HaloTag was chosen to represent the upper limit of cellular “free” diffusion. Single-molecule trajectories measured in these cell lines yielded diffusion coefficients for CaaX similar to published results34, and the diffusion for H2B-HaloTag was consistent with the theoretical lower bounds that can be approximated from the localization error and 10 msec frame interval (Figure 1A). Localization error can be measured directly from the single molecule trajectories using the jump covariance of slow or immobile particles35. Using the immobile H2B-Halo trajectories, we found the localization error of the htSMT system to be 39 nm (Figure 1 – figure supplement 2A), comparable to other benchmark stroboscopic illumination datasets25,35. The diffusion coefficient for free HaloTag is consistent with previous SMT reports25, but it also lies within the theoretical upper bounds of an MSD estimator of diffusion coefficient for a 10 msec frame interval and 1.25 µm search radius. We therefore consider the distribution recovered from the free HaloTag to represent the upper limit of trackable particles with this assay configuration.
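As an illustration of the two quantities discussed above, the following minimal Python sketch estimates localization error from the covariance of consecutive jumps and computes the single-jump ceiling of an MSD estimator. It assumes negligible motion blur (so that the covariance of consecutive one-dimensional displacements of an immobile particle equals minus the localization variance) and simulated data; the function names are hypothetical and this is not the platform's analysis code.

```python
import random
import statistics


def localization_error_1d(positions):
    """Estimate localization error (same units as positions) for one immobile track.

    Assumes negligible motion blur, so consecutive displacements satisfy
    E[dx_n * dx_(n+1)] = -sigma^2.
    """
    jumps = [b - a for a, b in zip(positions, positions[1:])]
    products = [j0 * j1 for j0, j1 in zip(jumps, jumps[1:])]
    covariance = statistics.fmean(products)
    return max(-covariance, 0.0) ** 0.5


def msd_ceiling(search_radius_um, frame_interval_s):
    """Largest 2D diffusion coefficient one jump can report: r_max^2 / (4 * dt)."""
    return search_radius_um ** 2 / (4.0 * frame_interval_s)


# Simulated immobile particle: every position is pure localization noise.
rng = random.Random(0)
track = [rng.gauss(0.0, 0.039) for _ in range(20000)]  # 39 nm, in micrometers
sigma_hat = localization_error_1d(track)

# 1.25 um search radius and 10 ms frame interval, as stated in the text.
d_max = msd_ceiling(1.25, 0.010)
```

With the stated search radius and frame interval, a single jump cannot report more than about 39 µm²/sec under this simple estimator, which is one way to read the upper bound described above.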

Benchmarking a high throughput single molecule tracking platform

a) Diffusion state probability distributions from three cell lines expressing Histone H2B-Halo, Halo-CaaX, or Halo alone. Shaded bins represent the diffusive states characteristic of each cell line.

b) Example field of view. Equal mixture of H2B-Halo, Halo-CaaX and free Halo cell line single molecule images (top) and reference Hoechst image (bottom). Insets show zoom-ins to individual cells and to sequential frames of individual molecules. Image intensities are equivalently scaled across panels.

c) Single-cell diffusive states extracted from (c), colored based on similarity to the H2B-, CaaX-, or free-Halo dynamics reported in (b).

d) Heatmap representation of 103757 cell nuclei measured from a mixture of Halo-H2B, Halo-CaaX, or Free Halo mixed within each well over 1540 unique wells in five 384-well plates. Each horizontal line represents a nucleus. Cells were clustered using k-means clustering and labels assigned based on the diffusive profiles determined in (b).

e) Ensemble state occupation of all trajectories recovered from a mixture of Halo-H2B, Halo-CaaX, free Halo cells. Mean state occupation from 308 assay wells.

We then tested whether our htSMT platform can extract accurate molecular trajectories at scale. We employed 384-well plates where free Halo, Halo-CaaX, and H2B-Halo cell lines were mixed in equal proportions in each well. Imaging with a 94 µm by 94 µm field-of-view (FOV), we achieved an average of 10 nuclei simultaneously (Figure 1B, Figure 1 – figure supplement 2B, Supplemental video 1), enough that most FOVs contained cells from each cell line. To limit ambiguity in cell assignment, we considered only the trajectories that fell within nuclear segmented regions. The probability distribution of diffusion states cleanly distinguishes between the three cell types (Figure 1C). More importantly, by looking at the single-cell state distribution profiles of 103,757 cells from five separate 384-well plates, grouped by their distribution profile, we recovered highly consistent estimates of protein dynamics at the single cell level, comparable to the pure populations (Figure 1D, Figure 1 – figure supplement 2C).

While single-cell measurements are powerful, the number of trajectories in one cell is limited, and so estimates of diffusive states can be broad. Combining trajectories from multiple cells, however, provides the expected distribution of diffusive states (Figure 1E). Moreover, combining trajectories derived from many cells makes model-based25 or model-agnostic35 state analysis possible, where as few as 10³ trajectories permit satisfactory inference of the underlying diffusion states. We determined that imaging six fields of view (FOV), for 1.5 seconds each, yielded enough trajectories (>10,000) to accurately estimate protein dynamics, bringing the overall throughput of the platform to more than 13,000 individual wells (>90,000 FOVs; >1,000,000 cells/day), a rate of data acquisition that enables compound screening on a feasible timescale (Figure 1 – figure supplement 2D).

Using htSMT to measure protein dynamics of SHRs

Equipped with an htSMT system capable of measuring protein dynamics broadly, we sought to understand how measuring protein motion can be used to characterize protein activity. SHRs transition between inactive and active states via ligand binding (Figure 2 – figure supplement 2A), a phenomenon that has been previously observed at the single molecule level20,22, and we hypothesized that the large dynamic range and orders-of-magnitude increase in throughput of our platform could capture these differences in the context of compound screening. We generated HaloTag fusions to ER, AR, and PR through ectopic expression, and GR through endogenous knock-in. Similar to previous approaches20,36, we used clonal cell lines in a U2OS cell background to minimize effects of comparing dynamics in different cell types. Clones were carefully selected such that the HaloTag fusion SHRs were comparable to each other in transcript abundance, and not higher than transcript levels in tissue-specific cell lines like MCF7 and T47d, which are both ER and PR positive (Figure 2 – figure supplement 2B).

In the absence of hormone, all four SHR proteins exhibit similar dynamic profiles: a small immobile fraction and a large freely diffusing fraction with a 3.4 – 4.3 µm²/sec average diffusion coefficient (Figure 2A, Figure 2 – figure supplement 2C). No correlation between diffusion and protein molecular weight (138 kDa for Halo-AR, 102 kDa for ER-Halo, 122 kDa for Halo-GR, and 135 kDa for Halo-PR) was observed, highlighting the differences between cellular protein dynamics and purified systems. Upon addition of agonist, a dramatic increase in immobile trajectories is observed, which we attribute to chromatin binding. Using a conservative upper bound of chromatin mobility in the nucleus and chromatin-associated transcription factors35, we define the bound fraction (fbound) for each SHR as the fraction of trajectories diffusing less than 0.1 µm²/sec (Figure 2A). Using this threshold, fbound of histone H2B is 0.92 on average, consistent with previous reports25. We defined Dfree as the occupation-weighted average diffusion coefficient of the non-bound states (Figure 2 – figure supplement 2C). Some SHRs had a higher proportion of bound molecules than others; the ligand-induced effect is most pronounced for ER, with 34% bound in basal conditions and 87% bound after estradiol treatment (Figure 2A, Supplemental video 2 and 3).
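The two summary statistics defined above can be sketched as follows. This is a minimal Python illustration, assuming the diffusive state distribution is available as a grid of diffusion coefficients with an occupation for each state; the grid values and function names are hypothetical, and only the 0.1 µm²/sec threshold comes from the text.

```python
BOUND_THRESHOLD = 0.1  # um^2/sec, the threshold stated in the text


def bound_fraction(d_values, occupations, threshold=BOUND_THRESHOLD):
    """Fraction of total occupation in states slower than the threshold."""
    total = sum(occupations)
    bound = sum(o for d, o in zip(d_values, occupations) if d < threshold)
    return bound / total


def d_free(d_values, occupations, threshold=BOUND_THRESHOLD):
    """Occupation-weighted mean diffusion coefficient of the non-bound states."""
    free = [(d, o) for d, o in zip(d_values, occupations) if d >= threshold]
    weight = sum(o for _, o in free)
    if weight == 0:
        raise ValueError("no occupation in non-bound states")
    return sum(d * o for d, o in free) / weight


# Hypothetical six-state distribution (diffusion coefficients in um^2/sec).
d_grid = [0.01, 0.05, 0.5, 2.0, 5.0, 10.0]
occupation = [0.20, 0.14, 0.06, 0.20, 0.30, 0.10]

f_bound = bound_fraction(d_grid, occupation)
d_free_value = d_free(d_grid, occupation)
```

Normalizing by the total occupation means the functions accept either probabilities or raw trajectory counts per state.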

Chromatin binding by steroid hormone receptors is affected by compound treatment.

a) Distribution of diffusive states for Halo-AR, Halo-ER, Halo-GR, and Halo-PR in U2OS cells before and after stimulation with an activating ligand. The area in the shaded region is fbound. Shaded error bands represent S.D.

b) Selectivity of individual SHRs to their cognate ligand compared with other steroids, as determined by fbound. Error bars represent SEM across three biological replicates.

SHRs are highly selective for their cognate agonists in biochemical binding assays, which we confirmed by measuring the dose-dependent change in dynamics as a function of agonist concentration. The maximal change in fbound (Figure 2B) and decrease in Dfree (Figure 2 – figure supplement 2D) differed between SHRs. The dose titration curves also showed variable potencies (EC50) for each SHR/hormone pair, with ER-estradiol being both the most potent and most selective pair. RNA-seq after estradiol stimulation showed a marked induction of hallmark ER-dependent gene sets37, confirming that the increase in chromatin binding we observed by SMT has a functional effect in promoting ER-responsive gene programs, even in the ectopic expression setting (Figure 2 – figure supplement 2E, F). Thus, SMT can detect functionally-relevant changes in transcription factor dynamics and accurately differentiate the ligand/target specificity directly within the cellular environment.
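For readers less familiar with dose titration analysis, the sketch below shows one common way to describe such curves: a four-parameter Hill model, plus a crude EC50 read-out by log-linear interpolation of the half-maximal crossing. The model choice, the interpolation shortcut, and all numerical values are assumptions made for illustration; the fitting procedure actually used in this work is not described in this section.

```python
import math


def hill(conc, bottom, top, ec50, slope=1.0):
    """Four-parameter logistic (Hill) response at a given concentration."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)


def ec50_by_interpolation(concs, responses):
    """Crude EC50: log-linear interpolation of the half-maximal crossing.

    Assumes concs are ascending and that the first and last responses
    sit on the lower and upper plateaus.
    """
    half = (responses[0] + responses[-1]) / 2.0
    points = list(zip(concs, responses))
    for (c0, r0), (c1, r1) in zip(points, points[1:]):
        if r0 != r1 and min(r0, r1) <= half <= max(r0, r1):
            frac = (half - r0) / (r1 - r0)
            log_c = math.log10(c0) + frac * (math.log10(c1) - math.log10(c0))
            return 10.0 ** log_c
    return None


# Hypothetical 12-point, 3-fold titration from 1 uM downward (molar units).
concs = sorted(1e-6 / 3 ** k for k in range(12))
responses = [hill(c, 0.0, 0.5, 1e-9) for c in concs]
ec50_hat = ec50_by_interpolation(concs, responses)
```

Interpolation is only a quick check; a least-squares fit of all four parameters is the more usual choice when the plateaus are not well sampled.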

Screening a diverse bioactive chemical set identifies known and novel modulators of ER dynamics

Our characterization efforts of ligand selectivity for AR, ER, GR, and PR collectively suggested that we could use SMT to interrogate the effects of compounds on protein dynamics at a throughput conducive to high throughput screening. We first identified a structurally diverse set of 5,067 molecules with heterogeneous biological activities as a useful screening set. Having determined that the same data acquisition parameters (6 FOVs imaged for 1.5 sec each) were sufficient to recover more than 10,000 trajectories per well, we could achieve a throughput of 15,000 wells per day which would permit us to interrogate the whole library in a single day (Figure 3 – figure supplement 1A). For cells at steady state, such a brief acquisition time also means that the dynamical state of the cell remains largely constant within an FOV (Figure 3 – figure supplement 1B), allowing all of the trajectories from an FOV to be considered largely representative of the same cellular state. We chose to screen this bioactive compound set against ER, assessing change in fbound at 1 µM compound versus the vehicle DMSO (Figure 3, Figure 3 – figure supplement 1C, D, Figure 3 – supplemental table 1). The screen was run twice to assess reproducibility, showing a high degree of agreement between replicates for ER-active molecules (Figure 3 – figure supplement 2A). This screen illustrates some important advantages of our htSMT platform over more manual lower-throughput approaches.

Bioactive screen results of change in estrogen receptor fbound for two biological replicates, each with 2 well replicates per compound of 5067 compounds. Select inhibitors are grouped and uniquely colored by pathway or target. Error bars are SEM of all replicates.

From plate to plate, the assay window for the screen was robust38 (average Z’-factor = 0.79 over 72 plates), the measured potency of the control estradiol in each instance remained within three-fold of the mean (Figure 3 – figure supplement 2B-D), and the distribution of negative control wells centered tightly on zero (Figure 3 – figure supplement 2E, F). Each compound measurement was averaged from SMT trajectories of between 94 and 161 cells (25th to 75th percentiles; Figure 3 – figure supplement 2G). Of the 30 compounds we identified from the bioactive set that we expected to modulate ER, either as agonists or antagonists29,39, all significantly increased fbound measured by SMT. This includes agonistic molecules like estradiol, but also notable examples such as 4-hydroxytamoxifen, fulvestrant, and bazedoxifene (Figure 3, Extended Data 3B, H).
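The Z’-factor quoted above is conventionally computed from the means and standard deviations of the positive and negative control wells on a plate. The following minimal Python sketch applies that standard definition to hypothetical control values; the well values are invented for illustration and are not data from the screen.

```python
import statistics


def z_prime(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    sd_sum = statistics.stdev(positive_controls) + statistics.stdev(negative_controls)
    window = abs(statistics.fmean(positive_controls) - statistics.fmean(negative_controls))
    return 1.0 - 3.0 * sd_sum / window


# Hypothetical change in fbound for control wells on one plate.
estradiol_wells = [0.50, 0.52, 0.48, 0.51, 0.49]
dmso_wells = [0.00, 0.01, -0.01, 0.02, -0.02]

plate_z_prime = z_prime(estradiol_wells, dmso_wells)
```

Values above 0.5 are generally taken to indicate a screening-quality assay window, which places the reported plate average of 0.79 comfortably in that range.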

With a large dataset to work from, collected across multiple assay plates on multiple independent microscopes, we examined the sources of variability within the screening assay. Using random sampling of individual jumps within the screening dataset while holding constant the source of the jumps (for example, sampling jumps within a specific assay well), we could estimate the relative contributions of microscope-to-microscope, plate-to-plate, well-to-well, FOV-to-FOV and cell-to-cell variability. Cell-to-cell variability was the single largest contributor to overall assay variability, especially when considering only vehicle treated controls (Figure 3 – figure supplement 3B), and stabilized after ∼1000 jumps (Figure 3 – figure supplement 3C). These results support the use of a short acquisition time per FOV and multiple FOVs to stabilize the dynamical state estimate.
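The resampling idea can be illustrated with a simplified Python sketch: draw a fixed number of jumps at random from one source (here a single pooled list) many times, and track how the spread of the resulting estimate shrinks as the number of jumps grows. Labeling each jump as bound or not is a hypothetical simplification made for this example, as are the pool composition and function name; the actual variance decomposition across microscopes, plates, wells, FOVs, and cells is more involved.

```python
import random
import statistics


def spread_of_bound_fraction(jump_is_bound, n_jumps, n_draws=200, seed=0):
    """Standard deviation of the bound fraction across repeated random subsamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_draws):
        sample = rng.choices(jump_is_bound, k=n_jumps)  # sampling with replacement
        estimates.append(sum(sample) / n_jumps)
    return statistics.stdev(estimates)


# Hypothetical pool of jumps from one source, 34% of them labeled as bound.
pool = [True] * 340 + [False] * 660

spread_10 = spread_of_bound_fraction(pool, n_jumps=10)
spread_1000 = spread_of_bound_fraction(pool, n_jumps=1000)
```

Repeating the same procedure while restricting the pool to one cell, one FOV, one well, and so on gives a rough sense of how much each level contributes to the overall spread.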

The somewhat counter-intuitive finding that either strong agonism or antagonism can lead to an increase in chromatin binding has been reported for ER31, but this appears not to be a general feature of SHRs. While the PR antagonist mifepristone40 behaves similarly to ER antagonists (Figure 3 – figure supplement 4A), antagonists of AR like Enzalutamide and Darolutamide41, and antagonists of GR like AL082D0642, cause a decrease in chromatin binding. This decrease occurs when administered singly or when co-administered in competition with the cognate agonist (Figure 3 – figure supplement 4B, C). These results show how the cellular context and interaction partners are critical to understanding the effect of a compound on its intended target. To underscore this point, in addition to binders of the ER ligand-binding domain, we also identified a number of active compounds targeting diverse nodes in the ER interaction network, including modulators of the proteasome, chaperones, kinases, and others (Figure 3).

Cellular ER dynamics elucidate structure-activity relationships (SAR) of ER modulators

Our screen revealed that, surprisingly, all the known ER modulators (both agonists like estradiol and potent antagonists like fulvestrant) caused an increase in fbound (Figure 4A). We therefore characterized a subset of selective ER modulators (SERMs) and selective ER degraders (SERDs) in more detail. These molecules all bind competitively to the ER ligand binding domain29. As in the bioactive screen, both SERDs and SERMs increased fbound (Figure 4A) and slightly decreased measured Dfree (Figure 4 – figure supplement 1A), with potencies ranging from 9 pM for GDC-0927 to 4.8 nM for GDC-0810 (Figure 4B). To understand how quickly these compound effects take place, we measured the change in fbound as a function of time, collecting timepoints roughly every 2 minutes beginning immediately after compound addition. Despite different physicochemical properties, all five compounds increased fbound within minutes of compound addition (Figure 4C, Figure 4 – figure supplement 1B), with no evidence of transient states distinct from the free diffusion and chromatin bound peaks (Figure 4 – figure supplement 1C). Presumably, ER dissociation from the chaperone complex, dimerization, and chromatin binding occur on rapid and seemingly comparable timescales. Since we cannot distinguish individual steps in these transitions, we consider the on rate of the entire process to have the effective rate constant k*on. Importantly, selective antagonists of AR and GR did not induce significant modulation of ER dynamics, further highlighting the utility of htSMT in characterizing the specificity of interactions between small molecule modulators of protein function and their cognate targets (Figure 4 – figure supplement 1D).
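A time course of this kind is often summarized by a single rising exponential, as in the fits shown in Figure 4C. The Python sketch below writes out that model and recovers the rate from noise-free synthetic data by linearization. It assumes the plateau amplitude is known, and the rate, amplitude, and sampling times are hypothetical values chosen for illustration rather than fitted parameters from this work.

```python
import math


def single_exponential(t, amplitude, rate):
    """Change in fbound at time t for a single-exponential approach to a plateau."""
    return amplitude * (1.0 - math.exp(-rate * t))


def estimate_rate(times, values, amplitude):
    """Least-squares slope through the origin of -ln(1 - y/A) versus t.

    Assumes the plateau amplitude A is known; points at or above 99% of the
    plateau are skipped because the logarithm becomes unstable there.
    """
    pairs = [(t, -math.log(1.0 - y / amplitude))
             for t, y in zip(times, values) if 0.0 <= y < 0.99 * amplitude]
    return sum(t * z for t, z in pairs) / sum(t * t for t, _ in pairs)


# Hypothetical time course sampled every 2 minutes (rate in 1/min).
times_min = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
observed = [single_exponential(t, amplitude=0.5, rate=0.3) for t in times_min]

rate_hat = estimate_rate(times_min, observed, amplitude=0.5)
half_time_min = math.log(2.0) / rate_hat
```

With real, noisy measurements a nonlinear least-squares fit of both amplitude and rate is preferable, since the linearized form amplifies noise near the plateau.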

Selective ER modulators and degraders induce DNA binding measurable through htSMT

a) Diffusion state probability distribution for ER treated with 100 nM of exemplified SERMs and SERDs. Each state distribution is generated from 10,000 randomly sampled nuclear trajectories per assay well. Shaded regions represent the S.D.

b) Change in fbound as a function of a 12-pt dose titration of fulvestrant (5), 4-OHT (6), GDC-0810 (8), AZD9496 (9), or GDC-0927 (10) with fitted curve. Compounds colored as in (a). Error bars represent the SEM across three biological replicates.

c) Change in fbound as a function of time after agonist or antagonist addition, fitted with a single exponential. Compounds colored as in (a). Estradiol (1, green) and DMSO added for comparison. Error bars represent SEM across three biological replicates.

d) Maximum effect of SERMs and SERDs on fbound. Each box represents quartiles while whiskers denote the 5-95th percentiles of single well measurements, measured over a minimum of four days with at least 8 wells per compound per day.

e) Fluorescence Recovery After Photobleaching of ER-Halo cells, treated either with DMSO alone or with 100 nM SERM/D. Curves are the mean ± SEM for 18-24 cells, colored as in (a).

f) Quantification of FRAP recovery curves to measure recovery 2 minutes after photobleaching. Whiskers denote the 5-95th percentiles of single cell measurements.

g) Quantification of long-lived tracks, each point representing the fraction of trajectories longer than 10 seconds for a single biological replicate consisting of 3-10 wells per condition. Dashed line represents the median fraction of trajectories lasting longer than 10 sec for Histone H2B-Halo, which is the upper limit of measurement sensitivity. * indicates sample with p < 0.05 as measured by t-test.

Interestingly, SERMs 4-hydroxytamoxifen (4OHT) and GDC-0810 show lower maximal increases in fbound compared with the SERDs fulvestrant and GDC-0927 (Figure 4D). Similar effects have been described previously using fluorescence recovery after photobleaching31 (FRAP), which we confirmed using our Halo-ER cell line (Figure 4E). The delay in ER signal recovery after two minutes in FRAP was consistent with the changes in fbound measured by SMT (Figure 4F). Although FRAP was used to measure these fbound differences, the technique suffers from challenges in scalability and depends heavily on prior assumption of the underlying dynamical model in the sample. SMT, alternatively, enables detailed characterization of the potency of 4OHT and GDC-0810 relative to other ER ligands in their ability to increase ER chromatin binding on a tractable timescale.

Neither FRAP nor htSMT can discriminate between an increase in residence time (decreasing k*off) and an increase in the rate of chromatin binding (increasing k*on), either of which would result in increasing fbound. By changing SMT acquisition conditions to reduce the illumination intensity and collect long, 250 msec continuous frame exposures, only immobile proteins form spots. Under these imaging conditions, the distribution of track lengths provides a measure of relative residence times20,21,43,44. Both agonist and antagonist treatment led to longer binding times compared to DMSO, suggesting that ligand binding decreases k*off (Figure 4G, Figure 4 – figure supplement 1E). Consistent with FRAP, estradiol, GDC-0927, and fulvestrant show longer binding times compared with other ER modulators. Using fbound and k*off measurements, one can infer the k*on. In all cases the changes in dissociation rate are not proportional to the increase in fbound, and so ligand-imposed increases in k*on likely contribute to the observed change in the chromatin associated ER fraction (Figure 4 – figure supplement 1F). These data are consistent with a model wherein ER rapidly binds to chromatin irrespective of which molecule occupies the ligand binding domain, but some ligands induce a conformation that can be further stabilized on chromatin by cofactors. Consequently, these data support the hypothesis that ER may engage chromatin in mechanistically different ways31. An efficacious ER inhibitor may promote rapid and transient chromatin binding that fails to effectively recruit necessary cofactors to drive transcription33.
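The inference of k*on from fbound and k*off can be made concrete with a minimal Python sketch. It assumes a simple two-state steady-state model in which fbound = k*on / (k*on + k*off); that model is an assumption made here for illustration and may differ from the calculation behind Figure 4 – figure supplement 1F. The sketch also includes a track-length survival function (1 - CDF) of the kind used to compare residence times. Function names and the example track lengths are hypothetical.

```python
def infer_k_on(f_bound, k_off):
    """Effective on-rate under an assumed two-state steady state.

    f_bound = k_on / (k_on + k_off)  =>  k_on = k_off * f_bound / (1 - f_bound)
    """
    if not 0.0 <= f_bound < 1.0:
        raise ValueError("f_bound must be in [0, 1)")
    return k_off * f_bound / (1.0 - f_bound)


def survival_curve(track_lengths):
    """Return (length, fraction of tracks strictly longer than length) pairs."""
    n = len(track_lengths)
    return [(x, sum(1 for t in track_lengths if t > x) / n)
            for x in sorted(set(track_lengths))]


# Bound fractions reported for ER in the text: 34% basal, 87% with estradiol.
# k_off is held at an arbitrary 1.0 to isolate the effect on k_on.
k_on_basal = infer_k_on(0.34, k_off=1.0)
k_on_estradiol = infer_k_on(0.87, k_off=1.0)
fold_change = k_on_estradiol / k_on_basal

# Hypothetical track lengths in frames.
curve = survival_curve([1, 2, 2, 5])
```

Under this assumed model, moving from 34% to 87% bound with an unchanged k*off would require roughly a 13-fold increase in k*on, which shows why changes in both rates are plausible contributors.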

htSMT reveals a relationship between ER dynamics and efficacy in ER-dependent cell toxicity

As the name implies, next-generation ER degraders like GDC-0927, AZD9833, and GDC-9545 were optimized to enhance degradation of ER31,45. We indeed observed compound-induced ER degradation via immunofluorescence, both in established breast cancer model lines and our U2OS ectopic expression system (Figure 5 – figure supplement 1A, B). Structural analogs of GDC-0927 have been reported and optimized for ER degradation; however, the correlation between ER degradation and cell proliferation is poor29,46 (Figure 5A, Figure 5 – figure supplement 1C-E). We hypothesized that by measuring protein dynamics we might obtain more precise measurements of inhibitory activity than can be achieved by assessing protein degradation. We therefore determined the potency and maximal effect of structural analogues of GDC-0927 using htSMT.

htSMT can be used to determine chemical structure-activity relationships.

a) Correlation of potency measured by ER degradation and cell proliferation in MCF7 cells (black) and T47d cells (magenta) for compounds in the GDC-0927 structural series. Cell proliferation and ER degradation-derived potencies are the mean of 4 and 3 biological replicates, respectively.

b) Change in fbound across a 12-pt dose titration of compounds 11 through 16, colored by structure. Points are the mean ± SEM across three biological replicates.

c) Correlation of potency measured by change in fbound and cell proliferation in MCF7 cells (black) and T47d cells (magenta) for compounds in the GDC-0927 structural series. Cell proliferation and SMT-derived potencies are the mean of 4 and 3 biological replicates, respectively.

Overall, these analogues exhibited a potency range of 15 pM to 12 nM and increased ER fbound by 0.4 to 0.56 (Figure 5B). Small changes in the chemical structure produced measurable changes in both compound potency and maximal efficacy as determined using SMT, a critical feature if an assay is to be used to iteratively optimize a compound for potency or efficacy.

We compared the potencies of GDC-0927 and analogues determined either via ER degradation or SMT, with the ability of each of these compounds to block estrogen-induced breast cancer cell proliferation. Potency assessed by ER degradation was not a good predictor of potency in the cell proliferation assay (Figure 5A). By contrast, SMT measurements of fbound strongly correlate with cell viability (Figure 5C; R² of 0.83 for T47d and 0.84 for MCF7). Intriguingly, SMT EC50 values were on average 10-fold lower than those observed in the cell growth assay, suggesting that SMT may be sensitive enough to identify chemical series that would not show effects in other cellular assays, enabling the identification of starting points for medicinal chemistry that could not be obtained by other methods. This correlation between effects on protein dynamics (fbound) and protein function (suppression of cell proliferation), coupled with the throughput of the SMT system, makes this an attractive approach for the identification of protein modulators with novel properties.
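To illustrate how a constant potency offset and a strong correlation can coexist, the Python sketch below computes the squared Pearson correlation between log-transformed potencies from two assays. Working in log10 units is an assumption made here (it is the usual convention for potencies), and the EC50 values are hypothetical numbers spanning the range reported above, not the measured data.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical EC50 values in molar units; proliferation is exactly 10-fold weaker.
smt_ec50 = [1.5e-11, 1.0e-10, 1.0e-9, 1.2e-8]
proliferation_ec50 = [10.0 * value for value in smt_ec50]

log_smt = [math.log10(v) for v in smt_ec50]
log_proliferation = [math.log10(v) for v in proliferation_ec50]
r2 = r_squared(log_smt, log_proliferation)
```

A uniform 10-fold shift moves every point by one log unit and leaves the correlation untouched, so the offset and the R² values describe different aspects of the comparison.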

Screening of a diverse chemical library using htSMT enables unbiased pathway interaction analysis by monitoring protein dynamics

In addition to known ER active modulators, many other compounds in our bioactive library provoked easily measurable changes in fbound. To define a threshold for calling a molecule from the screen “active,” we selected 92 compounds with different magnitudes of change in fbound to retest in a dose titration (Figure 6 – figure supplement 1A, B). We found that a 5% change in fbound was sufficient to reproducibly distinguish active compounds. Using this approach, we identified 239 compounds in the bioactive library that affected ER mobility (Figure 6 – figure supplement 1B). Among these compounds, the correlation between the two screen replicates was high (R² = 0.92) and the level of activity was reproducible (the slope for active molecules was 0.94). Some active compounds could be clustered based on scaffold homology, but most clusters consisted of one or only a few members (Figure 6 – figure supplement 1C, D). Structural clustering was employed to identify known ER modulators where the vendor-provided annotation was poorly defined (Figure 6 – figure supplement 1C). Our results demonstrate that htSMT is reproducible and robust when screening large collections of molecules.
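The thresholding step can be sketched in a few lines of Python. The example assumes that the 5% threshold refers to an absolute change of 0.05 in fbound relative to vehicle and that replicate measurements are averaged before calling; both are assumptions made for illustration, and the compound names and values are hypothetical.

```python
from statistics import fmean

ACTIVE_THRESHOLD = 0.05  # assumed absolute change in fbound versus DMSO


def call_actives(delta_fbound_by_compound, threshold=ACTIVE_THRESHOLD):
    """Label each compound by the mean change in fbound across replicates."""
    calls = {}
    for name, replicates in delta_fbound_by_compound.items():
        mean_change = fmean(replicates)
        if mean_change >= threshold:
            calls[name] = "increases fbound"
        elif mean_change <= -threshold:
            calls[name] = "decreases fbound"
        else:
            calls[name] = "inactive"
    return calls


# Hypothetical change in fbound from two screen replicates per compound.
screen = {
    "compound_A": [0.41, 0.39],
    "compound_B": [-0.08, -0.07],
    "compound_C": [0.01, -0.02],
}
calls = call_actives(screen)
```

Keeping the sign of the change in the label preserves the distinction between compounds that immobilize ER and those that increase its mobility.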

Bioactive molecules targeting pathways associated with ER affect its dynamics

a) Bioactive screen results with select inhibitors grouped and uniquely colored by pathway. Change in fbound across a 12-pt dose titration of three representative compounds targeting each of HSP90, mTOR, CDK9 and the proteasome. Individual molecules are denoted by specific shapes. Error bars represent SEM across three biological replicates.

b) Change in fbound as a function of time after compound addition. Estradiol treatment (black) is compared to ganetespib (blue circles) and HSP990 (blue squares). Points are bins of 4 minutes. Error bars represent SEM across three biological replicates.

c) Change in fbound as a function of time after compound addition. Estradiol treatment (black) is compared to HSP90 inhibitors ganetespib (blue circles) and HSP990 (blue squares); proteasome inhibitors bortezomib (red circles) and carfilzomib (red squares). Points are bins of 7.5 minutes of htSMT data, marking the mean ± SEM. The shaded region denotes the window of time used during htSMT screening.

d) Track length survival curve of ER-Halo cells treated either with DMSO alone, with estradiol stimulation, or with 100 nM HSP90 or proteasome inhibition. Track survival is plotted as the 1-CDF of the track length distribution; faster decay means shorter binding times.

e) Quantification of the number of long-lived trajectories as a function of treatment condition. All conditions were normalized to the median number of trajectories in DMSO.

f) Diagram summarizing pathway interactions based on htSMT results for ER, AR, and PR.

Most active molecules from the screen were not structurally related to steroids (Figure 6 – figure supplement 1C, D). On the other hand, many compounds could be grouped based on their reported biological targets or pathways (Figure 3, Figure 6 – figure supplement 2A, B). For example, heat shock protein (HSP) and proteasome inhibitors consistently increased fbound, whereas cyclin dependent kinase (CDK) and mTOR inhibitors decreased fbound. Though many CDK inhibitors lack within-family specificity47 (Figure 6 – figure supplement 2A, pan-CDK), we found that CDK9-specific inhibitors more strongly affected ER dynamics than did CDK4/6-specific inhibitors. Furthermore, as with selective AR and GR antagonists, inhibitors targeting ALK, BTK, and FLT3 kinases that have not been shown to interact with ER have no impact on ER dynamics when assessed using SMT (Figure 3).

For the inhibitors of cellular pathways we identified, we used a dose titration to better characterize the effect of each on ER dynamics. Potencies ranged from the sub-nanomolar to low micromolar (Figure 6B), similar to the reported potencies of these compounds against their cellular targets48–59. Additionally, we tested these molecules against AR (Figure 6 – figure supplement 2C) and PR (Figure 6 – figure supplement 2D). Each SHR differed meaningfully from the others in terms of the response to compounds identified through an ER-focused screening effort. Again, the magnitude of ER SMT effect was largely consistent within a target class (Figure 6A, Figure 6 – figure supplement 2A-D). The finding that structurally distinct compounds exhibited similar effects based on their biological targets favors the view that these biological targets must themselves interact with ER, and that the compounds therefore affect ER dynamics indirectly. HSP90 is a chaperone for many proteins, including SHRs. In the canonical model, hormone binding releases the SHR from its complex with HSP90. Indeed, HSP90 inhibitors increased fbound for ER, AR, and PR, consistent with the hypothesis that one function of the chaperone may be to adjust the equilibrium of SHR binding to chromatin (Figure 6 – figure supplement 2B-D). Proteasome inhibition also leads to ER immobilization on chromatin60, which aligns with the results that we obtained in our htSMT screen of bioactive compounds. ER has been shown to be phosphorylated by CDK61, Src62, or GSK-3 through MAPK and PI3K/AKT signaling pathways63, and therefore inhibition of these pathways could reasonably be expected to affect ER dynamics measured using SMT. While CDK inhibition led to an increase in ER mobility, inhibition of PI3K, AKT, or other upstream kinases showed no effect (Figure 3).

Interestingly, SMT dynamics of an ER triple point mutant engineered to lack previously defined phosphorylation sites important for transactivation63 (S104A/S106A/S118A) were affected by CDK and mTOR pathway inhibitors (Figure 6 – figure supplement 3), suggesting that additional phosphorylation sites can mediate the effects of CDK9 and PI3K/AKT signaling, or that other molecular targets of CDK and PI3K/AKT can act indirectly to alter the motion of ER. The change in ER protein dynamics for characterized pathway inhibitors such as those targeting CDK and mTOR is subtle but consistent across compounds, suggesting biological meaning in these observations and highlighting the need for accurate and precise SMT measurements. Hence htSMT screening offers the promise of providing comprehensive pathway interaction information or revealing novel interaction mechanisms.

Kinetic htSMT Facilitates Evaluation of Small Molecule Mechanism of Action

Since SMT can identify compounds that act either directly on a fluorescent target, or through some intermediary process, we sought to distinguish between these alternative modes of action. We hypothesized that by investigating the rate at which changes in protein motion emerge, SMT could be used to distinguish direct versus indirect effects on ER activity. Given the live-cell setting of SMT, we configured a data collection mode that allows for measurement of protein dynamics in set intervals after compound addition (kinetic SMT or kSMT). Both ER agonists and antagonists rapidly induce ER immobilization on chromatin when measured in kSMT (t1/2 = 1.6 minutes for estradiol; Figure 4C). On the other hand, HSP90 inhibitors like ganetespib and HSP990 exhibit a delay of 5 to 7 minutes before alterations in ER dynamics appear, after which we observed an increase in fbound with a t1/2 of 19.3 and 17.5 minutes, respectively. The overall effect of these compounds reached a plateau after an hour (Figure 6B). Proteasome inhibitors, e.g. bortezomib and carfilzomib, acted even more slowly, with changes in ER dynamics emerging only after 40 minutes, and slowly increasing over the four-hour measurement window (Figure 6C). Hence this exploration of SMT kinetics represents an important tool that can facilitate differentiation between on-target and on-pathway modulators. We believe that this approach will permit, for example, rapid mechanistic characterization of active compounds in a drug discovery setting.
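To make the kinetic comparison concrete, the sketch below evaluates a single-exponential association curve with an optional onset delay. It is an illustration under stated assumptions, not the fitting procedure used for the figures: the delayed-onset form and the function names are hypothetical, and only the half-times (1.6, 19.3 and 17.5 minutes) and the 5 to 7 minute delay come from the text.

```python
import math


def rate_from_half_time(t_half_min):
    """Convert a half-time in minutes into a first-order rate constant (1/min)."""
    return math.log(2.0) / t_half_min


def association_curve(t_min, plateau, t_half_min, delay_min=0.0):
    """Single-exponential rise toward `plateau`, starting after `delay_min`.

    Returns 0 before the delay has elapsed (assumed behavior for this sketch).
    """
    if t_min <= delay_min:
        return 0.0
    k = rate_from_half_time(t_half_min)
    return plateau * (1.0 - math.exp(-k * (t_min - delay_min)))


def fraction_of_plateau(t_min, t_half_min, delay_min=0.0):
    """Fraction of the final change in f_bound reached at time `t_min`."""
    return association_curve(t_min, 1.0, t_half_min, delay_min)
```

Calling `fraction_of_plateau(10.0, 1.6)` for an estradiol-like response and `fraction_of_plateau(10.0, 19.3, delay_min=6.0)` for an HSP90-inhibitor-like response (with an assumed 6 minute delay) shows how far apart a direct and an indirect modulator are expected to be ten minutes after compound addition.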

To further differentiate the effect of pathway inhibitors on ER protein dynamics, we sought to characterize relative ER residence times for each such molecule. In contrast to the SERDs and SERMs, although HSP90 inhibition by HSP990 and ganetespib resulted in an increase in fbound, we observed a decrease in the total number of long binding events by two- and four-fold, respectively, while the binding times were similar to that observed with DMSO alone (Figure 6D-E). These results suggest HSP90 inhibition primarily increases k*on while leaving k*off largely unaffected. On the other hand, inhibition of the proteasome led to an increase in both the number and duration of long binding events. These results demonstrate that ER-chromatin binding can be modulated by changing the rate of association or dissociation, and that the inhibition of specific cellular partners can affect these rates differentially. Taken together with the different kinetics for direct ER, HSP90, and proteasome modulators, our data suggest that each class of molecule alters ER dynamics through distinct mechanisms.
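The relationship between bound fraction and the two rates can be illustrated with a simple two-state (free and bound) equilibrium. This is a sketch under an assumed model, not necessarily the equation used by the authors: at steady state fbound = k*on / (k*on + koff), so a pseudo-first-order association rate can be inferred from a measured fbound and dissociation rate. The function names are hypothetical.

```python
def bound_fraction(k_on_star, k_off):
    """Steady-state bound fraction for a two-state free <-> bound model."""
    return k_on_star / (k_on_star + k_off)


def inferred_k_on_star(f_bound, k_off):
    """Invert the two-state relation to obtain k*on from f_bound and k_off.

    Assumes every bound molecule dissociates with the same rate k_off,
    so the result is best read as an upper-limit style estimate.
    """
    if not 0.0 <= f_bound < 1.0:
        raise ValueError("f_bound must be in the interval [0, 1)")
    return k_off * f_bound / (1.0 - f_bound)
```

Under this model, a rise in fbound at an unchanged dissociation rate implies a higher k*on, whereas a rise in fbound accompanied by slower dissociation can occur with little or no change in k*on.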

Discussion

Many pathways that regulate the fundamental biochemistry of cells depend upon the interaction of protein “sensors” with distinct protein “effectors” that engage transiently to trigger a change in cell physiology. Although the fundamentals of this process have long been appreciated, biochemical investigation of these protein interactions has typically required in vitro reconstitution or has been interrogated through pull-down assays after cell permeabilization. Here we report the combination of SMT, a type of super-resolution microscopy, with high content microscopy as a means of visualizing individual protein motion in millions of live cells, and under circumstances where the effect of small molecule inhibitors can be assessed quantitatively. We demonstrate the capabilities of the htSMT platform by analyzing the behavior of SHRs, a class of sensors that mediate hormone-induced modulation of gene expression, and in particular the dynamics of the ER.

To validate our htSMT platform, our analysis initially focused on the very rapid immobilization of SHRs on chromatin observed in cells exposed to their established, cognate steroid ligands. The technique proved to be highly quantitative, effectively evaluating ligands whose potencies differ by more than four orders of magnitude, and readily characterizing differences in both the sensitivity and the selectivity of the steroid receptor family. These observations prompted us to apply htSMT to screen thousands of bioactive compounds, which we hypothesized would reveal new chemical matter as well as enable comprehensive pharmacologic dissection of ER pathway interactions. As further validation of the technique, automated screening of ER dynamics using htSMT identified all 30 known steroid ligands from a library of 5,067 bioactive compounds. The potency of these steroid ligands with respect to alterations in ER dynamics varied across a thousand-fold range, demonstrating the dynamic range of this single experimental setup. Among molecules known to behave as ER signaling inhibitors, the change in ER dynamics correlated closely with the ability of these compounds to block estrogen-induced proliferation of estrogen dependent breast cancer cells, demonstrating the ability of htSMT to document structure-activity relationships in a chemical series across disparate and biologically relevant readouts. Since our analysis relies only on detection of changes in protein dynamics, this unbiased readout will prove broadly useful in screening libraries of compounds to identify starting points for the development of new therapeutics. In fact, our analysis identified 209 non-steroidal molecules that affect ER dynamics, which are likely to act elsewhere in the network of ER-interacting proteins.

Our characterization of the htSMT platform and subsequent screen highlighted some important considerations for future screening efforts using single molecule tracking. Notably, the observation that cell-to-cell variability is the dominant driver of assay variance, when compared to other sources like the microscope or the assay plate, suggests that an even larger field of view would sample cells more effectively and result in a more stable dynamical estimate (Figure 3 – figure supplement 3). This could be particularly important for detecting subtle dynamical changes such as those seen with the mTOR inhibitors where the maximal change in fbound was only around 5% (Figure 6). Similarly, for our screening assay we chose a frame interval of 10 msec, which proved very sensitive to detecting a wide range of compound effects but is necessarily limited in the types of perturbations it can detect. The diffusion of the most mobile ER population was well below the upper limit of detection for 10 msec, suggesting that faster frame rates were not necessary, but this may not be the case for other protein targets. On the other hand, ER has been reported to have multiple low-mobility chromatin binding states (Wagh 2023), but these low mobility states are below the assay lower bound set by our localization error and would require a slower frame interval to differentiate.
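The interplay between frame interval and localization error can be estimated with a common rule of thumb for two-dimensional tracking, in which static localization error σ adds 4σ² to the mean squared displacement, giving an apparent diffusion floor of roughly σ²/Δt. The sketch below applies that approximation to the 10 msec frame interval and the 39 nm median localization error reported for H2B-Halo cells. The exact definition of the assay bounds used by the authors is not given in the text, so this should be read as an order-of-magnitude illustration with hypothetical function names.

```python
import math

FRAME_INTERVAL_S = 0.010       # 10 msec frame interval used in the screen
LOCALIZATION_ERROR_UM = 0.039  # 39 nm median localization error (H2B-Halo)


def apparent_diffusion_floor(sigma_um, dt_s):
    """Apparent D (um^2/s) produced by localization error alone in 2D.

    From MSD(dt) = 4*D*dt + 4*sigma^2, an immobile molecule appears to
    diffuse with D_app = sigma^2 / dt (motion blur is ignored).
    """
    return sigma_um ** 2 / dt_s


def rms_jump_um(diffusion_um2_s, dt_s):
    """Root-mean-square 2D displacement per frame for pure Brownian motion."""
    return math.sqrt(4.0 * diffusion_um2_s * dt_s)
```

With the values above, `apparent_diffusion_floor(LOCALIZATION_ERROR_UM, FRAME_INTERVAL_S)` is on the order of 0.1 to 0.2 µm²/sec, which illustrates why very slow chromatin-bound states would need a longer frame interval (or lower localization error) to be resolved from one another.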

A critical outcome of this high-throughput platform is the observation that htSMT can define networks of biochemical signaling pathways by screening large libraries of bioactive compounds for effects on protein motion. Protein dynamics in the cell are not governed by singular interactions between any two proteins but by biochemical networks within the cell that intersect with the protein under observation. Our work enables construction of a putative interaction network connecting nodes involving different proteins, the identities of which were deduced based on the impact of known inhibitors on the dynamics of steroid receptors (Figure 6F). The interaction map derived from unbiased htSMT screening recapitulates many known biological interaction partners of the ER, in a single experimental setup using protein motion as a sole readout. More experiments are necessary to determine which nodes represent direct physical interactions and which occur through intermediaries. To our knowledge, neither the effect of HSP inhibition in increasing ER-chromatin association nor the link between ER dynamics and inhibition of post-translational modifying enzymes such as the CDKs or mTOR has been described previously. We believe that identifying additional cellular interaction networks through htSMT will provide an important foundation for the broader understanding of biochemical regulatory mechanisms. For example, CDK4/6 inhibitors are co-administered with SERDs and SERMs to improve therapeutic outcomes64,65. CDK4/6 inhibition only minimally affects ER dynamics in SMT (Figure 3), supporting the view that the combination of CDK4/6 inhibition with ER antagonists functions through a non-redundant ER-independent mechanism66.

Most cell-based assays cannot easily be configured for kinetic analysis of treatment effects, as these typically involve fixed-endpoint, aggregate readouts. With such endpoint assays, biochemical feedback and compensatory mechanisms often confound interpretation of the direct effect of a change in treatment conditions. SMT, however, when implemented in a high-throughput system that exhibits consistency over long time intervals, permits kinetic analysis of treatment effects on protein dynamics in real time with high reliability. Such kinetic analyses help to define pathway cascades; in a first instance, they can be used to rapidly identify compounds that likely engage directly with a therapeutic target.

Mechanistically, an increase in fbound provoked by a change in cell conditions suggests that molecular interactions with the protein being analyzed have been stabilized (either made more probable or longer-lasting), while a decrease in fbound suggests the opposite: molecular interactions that have been made less robust. Unexpectedly, we observed that both SERDs and SERMs, which antagonize ER, cause an increase in nuclear bound fraction, likely via chromatin binding, in a way that mimics what is seen with traditional ER agonists. We showed that this mechanism is not common to all SHRs; indeed, inhibitors of AR and GR behaved more like the prototypical competitive antagonist. Inhibition of HSP90 also produces a marked increase in ER fbound, though these binding events are more transient (Figure 6E). Transcription factors are thought to find binding sites through free 3D diffusion, 1D sliding, and transient, non-specific DNA binding5,12,20,21,24. ER antagonists may function by promoting ER binding to non-specific decoy chromatin sites, thus reducing the amount of ER able to activate transcription at ER-responsive genes31–33. Previous work has also shown that SERDs, and to a lesser extent SERMs, induce an alternate conformation of ER, thereby inhibiting co-factor recruitment31. If co-factors like CBP and p300 stabilize ER-chromatin binding, then efficacious inhibitors might exhibit shorter binding times compared with agonist stimulation31. The SERMs 4OHT and GDC-0810 dramatically increase ER-chromatin binding frequency and only modestly increase binding times; fulvestrant and GDC-0927 strongly increase both ER-chromatin binding frequency and residence time. Therefore, htSMT suggests that agonists and antagonists of SHR signaling operate through a previously unappreciated, unified mechanism of chromatin immobilization, meriting further investigation.

Lastly, we note that fast-SMT, as implemented here, can define relationships between the structures of chemical inhibitors and their effects on a fundamental property of protein regulatory elements: their dynamics in living cells. In our hands, these measurements proved far more reliable in predicting the ability of an ER antagonist to block cell proliferation than an ER degradation assay, a conclusion that could be drawn only because of the scale of htSMT screening. Notably, saturable dose responses were observed for ER antagonists at much lower concentrations using htSMT than with ER degradation assays, suggesting that this method will be more sensitive for identifying novel therapeutic agents than corresponding traditional assay formats. Thus, SMT can identify promising compounds whose activity might not otherwise be measurable and provide insight into their mechanism(s) of action, all in the native environment of the cell. While ER is a transcription factor, the same principles may apply broadly to regulatory mechanisms of proteins with diverse function. In this context, we conclude that the application of technologies for measuring protein dynamics at scale will prove broadly applicable to the elucidation of biological mechanisms.

Author Contributions

D. A., H. P. B., D. T. M., & Y. T. conceived the research project; D. A., H. P. B., X. D., E. G., A. He., J. H., H. L., D. T. M., S. E. P., R. T., & Y. T. contributed to study design; H. C., K. F., E. G., R. Kr., H. L., D. T. M., S. E. P., A. S., & R. T. performed experiments; H. C., E. G., H. L., D. T. M., A. S., & R. T. generated reagents; L. A., D. A., A. Ha., A. He., H. L., D. T. M., C. S. & R. T. analyzed data; S. A., R. B., M. B., R. G., A. Ha., A. He., S. L. J., R. Ke., A. K., J. L., & L. M. designed and wrote software; R. B., X. D., R. Ke., K. L., B. M., & D. T. M. designed and built hardware; D. A., H. P. B., R. B., J. H., A. K., & R. Ke. supervised the work; L. A., D. A., A. He., H. L., D. T. M., S. E. P., & R. T. wrote the original draft; L. A., D. A., H. P. B., R. B., X. D., A. He., J. H., H. L., D. T. M., R. T., & Y. T. reviewed and revised the original draft.

Acknowledgements

The authors extend their deepest gratitude to all the employees and consultants of Eikon, past and present, especially Caitlyn Bonilla, Tiffany Cheng, Michael Hirte, David Hoffman, Fedor Ilkov, Yan Li, Anuja Lohia, Edith Martinez-Soto, Eugene Masterov, Mai Nguyen, and Gregory Snyder. Their tireless work enabled the experiments described here. We thank Rand Miller, Roger Perlmutter, Yan Li, and Robert Tjian for helpful discussions and critical feedback on the direction of our investigation and on the resulting manuscript. Eikon Therapeutics provided all funding.

Declaration of interests

The authors are employees and/or shareholders of Eikon Therapeutics.

Supplementary Figures

Overview of the htSMT workflow

a) Schematic representation of the end-to-end automated htSMT process, beginning with the physical preparation of the sample; image collection on a custom-assembled super-resolution microscope; detection of fluorescent spots and subsequent trajectory reconstruction; image processing and QC; extraction of relevant features by combining trajectories with cellular image data; and finally interpretation by a biologist.

b) Schematic view of the automated htSMT platform.

c) Example Hoechst images to be detected and removed during the data QC phase. Debris, incorrect focus, uneven illumination and empty FOVs, which can contaminate screening data, are removed using a deep-learning based classifier of the Hoechst image.

Characterization of htSMT system performance.

a) Distribution of localization error measured across multiple independent wells using Histone H2B-Halo cells. Median localization error of 39 nm.

b) Number of measurable nuclei in a 94 by 94 µm FOV. Each box plot is the distribution over wells in one 384-well plate.

c) Single-cell state distributions for Histone H2B, CaaX and free HaloTag cell lines.

d) Cumulative number of trajectories as a function of imaging time for each individual cell type for cells labeled with 10 pM JF549. Shaded error bars are 1 S.D.

Activation of steroid hormone receptors changes free diffusion and impacts downstream gene expression.

a) Cartoon illustration of SHR function. Under basal conditions, SHRs are sequestered in complex with HSP90 and other cofactors. Upon ligand binding, the receptor dissociates from the inactive complex, dimerizes, and binds to DNA.

b) mRNA transcript levels in log(FPKM) of AR, ESR1, PR, and NR3C1 for the engineered U2OS cell lines compared to the parental U2OS line as well as three reference breast cancer cell lines.

c) Distribution of diffusive states for Halo-AR, Halo-ER, Halo-GR, and Halo-PR in U2OS cells before and after stimulation with an activating ligand. Arrows indicate the mean Dfree. Shaded error bands represent 1 S.D.

d) Selectivity of individual SHRs to their cognate ligand compared with other steroids, as determined by Dfree. Error bars represent SEM.

e) Reference gene sets induced after 24 hrs of stimulation with 25 nM estradiol. The top five gene sets induced in cells ectopically expressing Halo-ER were not significantly induced in the parental U2OS line.

f) Bar plot showing the -log(q value) from the top 50 most significantly induced gene sets after estradiol stimulation. Gene sets characteristic of ESR1 or the estrogen response are colored green.

Experimental design for bioactive molecule screen.

a) Cumulative number of trajectories as a function of imaging time for DMSO-treated cells labeled with 20 pM JF549. Shaded error bars are 1 S.D.

b) Median of the jump length distribution, reported as a function of frame number and calculated in 10-frame increments comparing DMSO- and estradiol-treated wells. Curves are the mean, shaded error bands are 1 S.D.

c) Filtering criteria for bioactive compounds in the screen: compounds without SMILES data; compounds with a molecular weight less than 226 Da or more than 800 Da; heavy metals and compounds with more than 60% non-carbon atoms; and compounds with known fluorescent structures (e.g. fluorescein) were removed before screening. During and after screening additional compounds were removed if they were observed to be fluorescent or if they failed to meet data quality criteria.

d) Composition of the bioactive screening set based on the biological pathway targeted as provided by the manufacturer.

Screen of bioactive molecules produces robust data with good assay performance.

a) Repeatability of the change in fbound in screening results, assessed over two biological replicates. Compounds in magenta were identified as active if their magnitude of change in fbound when averaged across both replicates was greater than 0.05. Of the expected 30 positive control compounds, 26 significantly increased fbound in both replicates (gray outline). Three compounds increased fbound but were only present in one replicate after filtering. Positive controls including the agonist estradiol (1) and the antagonists fulvestrant (5), 4-OHT (6) and bazedoxifene (7) are exemplified. A linear regression was fit to the magenta compounds to determine slope and correlation.

b) Dose titration of estradiol for each plate in the bioactive molecules screen fit with a logistic regression to determine potency.

c) Z’-factor analysis for the bioactive screen. Each point represents the Z’-factor of a single 384-well plate, measuring the difference between DMSO and 25 nM estradiol treatment. Plates with very low Z’-factors were removed from further analysis.

d) EC50 values extracted from each curve fit in (b). Shaded region is a three-fold range in potency.

e) Distribution of change in fbound for control DMSO wells.

f) Change in fbound for unfiltered screening results plotted per assay plate with the negative control (DMSO, blue) and positive control (Estradiol, red) highlighted. Each tick mark on the x-axis is a separate assay plate. Each point is the mean per compound per assay plate, error bars are the SEM.

g) Boxplot showing the number of cell nuclei measured for each compound tested.

Whiskers are the 1st and 99th percentiles.

Contributions of different sources of variability of jump length in the htSMT bioactive compound screen against ER.

a) Change in fbound for 30 known ER-interaction molecules circled in (b), ordered by the magnitude of effect. Error bars are the SEM.

b) Variance in mean jump length after subsampling increasing numbers of jumps from the bioactive screen, resampling at different levels of aggregation (microscope, plate, well, etc.). Law of large numbers scaling would result in a line with slope −1, and deviations from linearity are the result of intrinsic sources of variance over microscopes, plates, etc.

c) Sources of variance attributed to differences between microscopes, plates, sample wells, fields of view (FOV) and cells. Solid bars are analyzed from all wells in the bioactive screen, half-tone bars are assessing DMSO wells only. The asterisk highlights the expected well-to-well variability introduced when assaying compounds.

Antagonists of ER (a), PR (b), AR (c) and GR(d) show distinct effects on target protein dynamics. Change in fbound after addition of 1 µM antagonist either in the absence or presence of the cognate agonist.

Figure 3 – supplemental table 1. A list of the compounds composing the bioactive screening library.

SERMs and SERDs decrease free diffusion and increase fbound rapidly after addition.

a) Normalized occupation of diffusive states (diffusion coefficient 0.2 – 100 µm2/sec) for 100 nM SERD or SERM-treated samples compared with DMSO. Histograms are normalized to integrate to 1. Shaded regions represent bin-wise standard deviations. Curves are the average of 3-4 biological replicates, with 8 well replicates per condition (>1200 cells total).

b) Results from a single-exponential association fit to the data in Figure 4C.

c) Change in diffusive state distribution as a function of time after compound addition for estradiol and fulvestrant. Each curve represents the mean of three biological replicates. Shaded regions are the bin-wise standard deviation.

d) Change in fbound of selective AR or GR antagonists from the bioactive screening set.

e) Track length survival curve of ER-Halo cells treated either with DMSO alone or with SERM/D. Track survival is plotted as the 1-CDF of the trajectory length distribution; faster decay means shorter binding times.

f) Table of results of the slow decay rate constant (kslow) from residence time SMT curve fits. Fits were performed on the aggregate of three biological replicates. Taken together with the fbound determined in 4D, an upper limit of the inferred kon can be calculated from the equation assuming all bound molecules will have k* = k. Cells marked with an asterisk are ones in which kslow could not reliably be distinguished from photobleaching.

GDC-0927 structural variants characterized by ER degradation or cell proliferation assays.

a) Western blot of the ER-expressing breast cancer cell lines MCF7 and T47d compared to ER expression in ER-null lines SK-BR-3 or U2OS. Samples were treated with fulvestrant for 24 hours prior to lysis. Fulvestrant treatment leads to the degradation of ER, even when fused to HaloTag.

b) Example of GDC-0927 induced degradation of ER in different cell lines measured by immunofluorescence against ER. Cells were exposed to compound for 24 hours prior to fixation. Image pixel intensities are equivalently scaled.

c) Example compound dose titrations showing change in mean nuclear intensity as a function of compound concentration.

d) Example compound dose titrations measuring cell proliferation, normalized to DMSO-treated cells, of cells treated with analogs of GDC-0927.

Deeper investigation of bioactive screening data identifies cutoff for active molecules.

a) Examples of compound effects on ER fbound from a dose titration experiment. 92 compounds from the primary screen are ranked based on the magnitude of their effect. Compounds colored in black repeatably showed dose-dependent changes; magenta compounds were inactive in a dose titration.

b) Example dose titration of compounds of increasing overall change in ER fbound.

c) Two replicates of the bioactive screen as in Figure 3. Active compounds are colored based on structural scaffold.

d) A quantification of the number of compounds associated with any given cluster. Singletons (cluster 21) represent the majority of active compounds.

Some pathway inhibitors modulating ER dynamics are specific to ER.

a) Change in fbound of ER following treatment with inhibitors of HSP90, the proteasome, mTOR, and CDK from the initial bioactive screen. CDK inhibitors were further stratified into CDK4/6 inhibitors, CDK9 inhibitors, or inhibitors without strong selectivity for a particular family (pan-CDK). Line represents the median for each target. * indicates sample with p < 0.05 as measured by t-test.

b) Change in fbound of ER for 97 bioactive molecules, colored by their pathway annotation. Compounds are plotted in their rank order based on their effect in ER. Error bars are the SEM of three biological replicates.

c) Change in fbound of AR for the same molecules from (b). Compounds are plotted in their rank order based on their effect in ER. Error bars are the SEM of three biological replicates.

d) Change in fbound of PR for the same molecules from (b). Compounds are plotted in their rank order based on their effect in ER. Error bars are the SEM of three biological replicates.

Dose titration plots of ER(S104A/S106A/S118A) with mTOR and CDK9 compounds. Each point represents the mean and SEM of three biological replicates.

Supplemental movie 1. Example SMT field of view as shown in Figure 1. A representative mixture of H2B-Halo, Halo-CaaX, and free Halo expressing cells. Video is played back at 10 frames per second, 10x slower than real time.

Supplemental movie 2. Example field of view of SMT from Halo-ER cells. Video is played back at 10 frames per second, 10x slower than real time.

Supplemental movie 3. Example field of view of SMT from Halo-ER cells treated with 25 nM estradiol for 1 hr prior to imaging. Video is played back at 10 frames per second, 10x slower than real time.

Methods

Cell Lines

U2OS (ATCC Cat. No. HTB-96), MCF7 (ATCC Cat. No. HTB-22), T47d (ATCC Cat. No. HTB-133) and SK-BR-3 (ATCC Cat. No. HTB-30) were grown in DMEM (Cat. No. 1056601, Gibco DMEM, high glucose, GlutaMAX Supplement, Thermo Fisher) supplemented with 10% Fetal Bovine Serum (Cat. No. 16000044, Thermo Fisher) and 1% pen-strep (Cat. No. 15140122, Thermo Fisher). Cells were maintained in a humidified 37 °C incubator at 5% CO2 and subcultivated approximately every two to three days.

HaloTag-expressing cell lines

For H2B, CaaX, ER, AR, and PR-HaloTag fusions, mammalian expression vectors containing the fusion gene under the control of a weak L30 promoter and containing a Neomycin resistance marker were transfected into U2OS cells at 70% confluence using FuGENE 6 (Cat. No. E2691, Promega). Transfected cells were selected with G418 (Cat. No. 10131027, Thermo Fisher) at 500 µg/mL, then clonally isolated. Clones expressing the desired fusion gene were determined first by staining with 100 nM JF549-HTL (Cat. No. GA1110, Promega) and 50 nM Hoechst 33342 and identifying clones with the expected distribution of JF549 signal. Between three and six clones were subsequently tested using SMT conditions for response to a control compound, and the most homogenous clones were subsequently expanded for further testing. Unless otherwise specified, all experiments are with a single, clonally isolated cell line. Because U2OS cells express GR endogenously, HaloTag was inserted right before the stop codon of endogenous NR3C1 via homology-directed repair using CRISPR/Cas9. The HaloTag knock-in was validated by imaging using HTL-JF646 staining and through DNA sequencing67.

Western Blot

Cells were grown under the conditions described above. 1.5 × 10⁶ cells were seeded per well in a 6-well plate in DMEM overnight, followed by compound treatment (DMSO or 100 nM fulvestrant) the following day for 24 hours. Cells were lysed in 200 μL 1X Cell Lysis Buffer (Cat. No. 9803, Cell Signaling). Protein lysate concentration was then determined using a BCA protein assay kit (Cat. No. 23225, Pierce™ BCA Protein Assay Kit) following the manufacturer’s instructions. Capillary western immunoassays were performed on a Jess system (ProteinSimple, USA) following the manufacturer’s instructions. Levels of ER (1:100, RM-9101) were normalized to the loading control β-tubulin (1:100, NC0244815 LI-COR 92642213, Thermo Fisher). The peaks were analyzed with the Compass software (ProteinSimple, USA).

RNA-seq

Cells were seeded into 12-well tissue-culture treated plates at densities of 250,000 cells (U2OS-WT), 200,000 cells (U2OS-ER), or 300,000 cells (MCF7, SKBR3, T47d) per well. 24 hours later, cells were treated with estradiol at a final concentration of 25 nM for 24 hours. To process cells for total RNA, cells were washed twice with ice-cold PBS, lysed with 350 µL Buffer RLT (Qiagen 79216), scraped off the plate (Fisher 08100241), frozen on dry ice and stored at −20 °C. Cell lysates were then thawed, homogenized using QIAshredder columns (Qiagen 79656), and processed through the Qiagen RNeasy Micro kit (Qiagen 74004) using the standard protocol and including the optional on-column DNase digestion step (Qiagen 79254). All samples had a RIN score of 10 by TapeStation (Agilent 5067-5576). RNA sequencing libraries were prepared from total RNA by Novogene (CA). In brief, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads and fragmented. First-strand synthesis was performed using random hexamer primers, second-strand synthesis was performed using dTTP, and libraries were prepared after end repair, A-tailing, adapter ligation, amplification, and purification. Libraries were sequenced on an Illumina NovaSeq with paired 150 cycle reads. For data analysis, paired-end reads were aligned to the hg38 reference genome using Hisat2 v2.0.5, featureCounts v1.5.0-p3 was used to count the number of reads mapped to each gene, and differential expression analysis was performed using DESeq2 (1.20.0).

Single Molecule Tracking Sample preparation

Cells were seeded on tissue culture-treated 384-well glass-bottom plates at 6000 cells per well. Seeded cells were then incubated at 37 °C and 5% CO2 overnight to allow adhesion. For all SMT experiments, cells were incubated with 5-100 pM of JF549-HTL (Cat. No. GA1110, Promega) and 50 nM Hoechst 33342 for an hour in complete medium. Using an EL406 plate washer, cells were then washed three times in DPBS and twice in imaging medium, which is FluoroBrite DMEM (Cat. No. A1896701, Thermo Fisher) supplemented with GlutaMAX (Cat. No. 35050079, Thermo Fisher) and the same serum and antibiotics as the growth medium. Where appropriate, compounds were serially diluted in an Echo Qualified 384-Well Low Dead Volume Source Microplate (0018544, Beckman Coulter) to generate dose-titration source material. Compounds were administered at a final 1:1000 dilution in cell culture medium. Each dose of a compound had at least 2 replicates per plate and 3 plate replicates; 20 DMSO control wells and 2 no-dye control wells were randomized across each plate. Unless otherwise specified, compounds were allowed to incubate for an hour at 37 °C prior to image acquisition.

Image Acquisition

Unless otherwise stated, all SMT image acquisition was performed on a custom-built HILO microscope based on a Nikon Ti2 with a motorized stage, stage-top environmental chamber (OKO labs), quad-band filter cube (Chroma), and a custom laser launch with 405 nm and 561 nm wavelengths, coupled to a Nikon TIRF illumination module by a fiber optic element (KineFlex HPV-P-3-S-405..640-0.7-APC-P2) delivering >10 mW and >600 mW of power, respectively, in a Gaussian beam with a FWHM of approximately 250 µm to the back focal plane of the objective. Angle of inclination and beam direction were adjusted by micrometer on the TIRF illuminator and empirically set to maximize and flatten the signal across the camera field of view. Fluorescence emission was passed through a high-speed filter wheel (Finger Lakes Instruments) and collected with a backlit CMOS camera (Prime 95B, Teledyne). Images were acquired with a 60X 1.27 NA water immersion objective (Nikon). The environmental chamber was set to 37 °C, 95% humidity, and 5% CO2. For each field of view, 200 SMT frames were collected at a frame rate of 100 Hz, with a 2 msec stroboscopic laser pulse. 10 frames of the Hoechst channel were collected at the same frame rate for downstream registration of trajectories to nuclei. Sample focus was maintained using the reflection-based Perfect Focus System to determine the position of the coverglass and apply an empirically determined offset to focus into the sample.

Experiment design and sample size

All assays were designed with high-throughput screening in mind. Unless otherwise stated, experiments were performed with three biological replicates, with at least three well replicates within each assay plate. In instances where a specific plate or a portion of a plate failed to meet assay quality criteria (e.g. no response from a positive control), those data points were omitted and, where possible, the assay was repeated. For htSMT experiments, samples were prepared and acquired such that a minimum of 20 cells were sampled per assay well, resulting in a minimum of 21,000 detections per well; this corresponds to a minimum of 60 cells and 63,000 detections per condition per assay plate.

htSMT Image Analysis

Image acquisition produced one JF549 movie and one Hoechst movie per field of view. The JF549 movie was used to track the motion of individual JF549 molecules, while the Hoechst movie was used for nuclear segmentation. Tracking was accomplished in three sequential steps (detection, subpixel localization, and linking) using a combination of existing methods. Briefly, spots were detected using a generalized log likelihood ratio detector to test every 11×11 pixel window using a Gaussian kernel with a radius of 1.25 pixels and with a log likelihood detection threshold 1468. After detection, the estimated position of each emitter was refined to subpixel resolution using Levenberg-Marquardt fitting6971 with an integrated 2D Gaussian spot model72 starting from an initial guess afforded by the radial symmetry method73. Detected spots were linked into trajectories using a custom modification of a hill-climbing algorithm with a maximum linking radius of 1.25 µm and allowing a maximum of 2 gap frames where a spot may go undetected but still be linked within the same trajectory10,74. The same detection, subpixel localization, and linking settings were used for all movies in this manuscript.
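To make the roles of the linking radius and gap-frame limit concrete, the following is a minimal sketch of a linker. It uses greedy nearest-neighbour matching as a stand-in, not the modified hill-climbing algorithm used here; the function name `link_spots`, the input layout (one list of (x, y) positions in µm per frame), and the choice to apply the same radius regardless of the number of gap frames are all assumptions made for illustration.

```python
import math

def link_spots(frames, max_radius=1.25, max_gap=2):
    """Greedy stand-in linker (illustrative only).

    frames: list with one entry per frame, each a list of (x, y) in µm.
    Returns a list of trajectories, each a list of (frame_index, x, y).
    """
    finished = []
    active = []  # trajectories that may still be extended
    for t, spots in enumerate(frames):
        # Collect every (distance, track, spot) pair within the linking radius.
        pairs = []
        for ti, track in enumerate(active):
            _, tx, ty = track[-1]
            for si, (x, y) in enumerate(spots):
                d = math.hypot(x - tx, y - ty)
                if d <= max_radius:
                    pairs.append((d, ti, si))
        # Accept the shortest links first; each track and spot is used once.
        pairs.sort()
        used_tracks, used_spots = set(), set()
        for d, ti, si in pairs:
            if ti in used_tracks or si in used_spots:
                continue
            active[ti].append((t, spots[si][0], spots[si][1]))
            used_tracks.add(ti)
            used_spots.add(si)
        # Spots that were not linked start new trajectories.
        for si, (x, y) in enumerate(spots):
            if si not in used_spots:
                active.append([(t, x, y)])
        # Retire tracks that would need more than max_gap missed frames.
        still_active = []
        for track in active:
            if t - track[-1][0] > max_gap:
                finished.append(track)
            else:
                still_active.append(track)
        active = still_active
    return finished + active
```

With `max_gap=2`, a molecule last seen in frame 0 can still be linked in frame 3 (two missed frames) but not in frame 4.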

For nuclear segmentation, all frames of the Hoechst movie were averaged to generate a mean projection. This mean projection was then segmented with a neural network trained on human-labeled nuclei, the output of which is a mask assigning a semantic category to each pixel in the image75. Each spot was assigned to at most one nucleus using its subpixel coordinates.
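The assignment of spots to nuclei can be sketched as a lookup of each subpixel coordinate in a label image. This sketch assumes the segmentation output has been converted to an integer mask in which 0 is background and each nucleus carries a distinct positive label, and that pixel (row, col) covers the half-open interval [row, row+1) × [col, col+1); the function name and both conventions are illustrative assumptions rather than details of the pipeline.

```python
import math

def assign_to_nucleus(spots, label_mask):
    """Return the nucleus label for each (y, x) spot, or None if outside nuclei.

    spots: list of (y, x) subpixel coordinates in pixel units.
    label_mask: 2D list of ints, 0 = background, k > 0 = nucleus k (assumed).
    """
    height = len(label_mask)
    width = len(label_mask[0])
    labels = []
    for y, x in spots:
        row, col = math.floor(y), math.floor(x)
        inside_image = 0 <= row < height and 0 <= col < width
        if inside_image and label_mask[row][col] > 0:
            labels.append(label_mask[row][col])
        else:
            labels.append(None)  # background or outside the field of view
    return labels
```

Because each pixel holds a single label, a spot can belong to at most one nucleus by construction.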

To recover dynamical information from trajectories, we used state arrays35, a Bayesian inference approach, with the “RBME” likelihood function and a grid of 100 diffusion coefficients from 0.01 to 100.0 µm² s⁻¹ and 31 localization error magnitudes from 0.02 to 0.08 µm. For each assay well, 10,000 trajectories were randomly sampled from the aggregated pool of nuclear trajectories, and this set of trajectories was used for state inference. After inference, localization error was marginalized out to yield a one-dimensional distribution over the diffusion coefficient for each field of view. For single-cell analysis, we performed SMT and nuclear segmentation on a mixture of U2OS cells bearing H2B-HaloTag, HaloTag-CaaX, or free HaloTag. We then evaluated the marginal likelihood of each of a set of 100 diffusion coefficients on the set of trajectories within each segmented nucleus30. These marginal likelihood functions were clustered with k-means (3 clusters), and the marginal likelihood functions for each cell were ordered by their cluster index to produce the heat map. To estimate the fraction bound (fbound), we integrated the state array posterior distribution below 0.1 µm² s⁻¹. To estimate the free diffusion coefficient (Dfree), we computed the mean of the posterior distribution above 0.1 µm² s⁻¹ using the following equation:
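A minimal sketch of these two summary statistics follows. It assumes that the marginal posterior is available as a list of occupancies over the diffusion coefficient grid, that the grid is logarithmically spaced, and that the mean is an occupancy-weighted arithmetic mean renormalized over the free states; none of these details is specified above, and the function names are hypothetical.

```python
import math

def log_spaced_grid(lo=0.01, hi=100.0, n=100):
    """Assumed grid: n diffusion coefficients spaced evenly in log10 (µm² s⁻¹)."""
    step = (math.log10(hi) - math.log10(lo)) / (n - 1)
    return [10 ** (math.log10(lo) + i * step) for i in range(n)]

def summarize_posterior(diff_coefs, posterior, cutoff=0.1):
    """Return (f_bound, d_free) from a posterior over diffusion coefficients.

    f_bound: posterior mass below the cutoff, as a fraction of the total.
    d_free: occupancy-weighted mean diffusion coefficient at or above the cutoff.
    """
    bound_mass = sum(p for d, p in zip(diff_coefs, posterior) if d < cutoff)
    free_mass = sum(p for d, p in zip(diff_coefs, posterior) if d >= cutoff)
    total = bound_mass + free_mass
    f_bound = bound_mass / total
    if free_mass > 0:
        weighted = sum(d * p for d, p in zip(diff_coefs, posterior) if d >= cutoff)
        d_free = weighted / free_mass
    else:
        d_free = float("nan")  # no free population to average over
    return f_bound, d_free
```

Renormalizing by the free mass keeps Dfree independent of fbound, so a compound that only shifts molecules between bound and free states leaves Dfree unchanged.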

htSMT data quality control

High content imaging approaches require image quality control to systematically remove aberrant measurements from the set. Primarily these are fields of view that are empty, that are out of focus, or that contain large pieces of debris in the well or occlusion of the objective. To detect and remove these fields of view, we trained a convolutional neural network classifier to score Hoechst image quality on a scale of −1 to 1 (low quality to high quality). Annotations from five independent annotators on a training set of ∼1000 Hoechst images were used as input for the model. After training, a threshold for filtering was empirically set so as to remove problem FOVs such as those exemplified in Figure 1 – figure supplement 1c. Screening wells were only considered if more than two fields of view passed QC, and screening plates were only considered if more than 8 control wells could be included for normalization. During screening, compounds with a standard deviation in fbound greater than 0.15 were omitted from analysis and rescreened where possible.
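The three filtering rules can be expressed as simple predicates. In this sketch the classifier score threshold is passed in as `score_threshold` because its value is not given above, and the use of the sample standard deviation is an assumption; the function names are illustrative.

```python
import statistics

def well_passes_qc(fov_scores, score_threshold):
    """A well is kept only if more than two FOVs score at or above the threshold."""
    n_passing = sum(1 for score in fov_scores if score >= score_threshold)
    return n_passing > 2

def plate_passes_qc(n_usable_control_wells):
    """A plate is kept only if more than 8 control wells remain for normalization."""
    return n_usable_control_wells > 8

def compound_is_too_variable(fbound_replicates, max_sd=0.15):
    """Flag compounds whose replicate f_bound values have SD above max_sd."""
    if len(fbound_replicates) < 2:
        return False  # SD is undefined for a single replicate
    return statistics.stdev(fbound_replicates) > max_sd
```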

htSMT data analysis

Tracking results from the automated processing pipeline were analyzed using KNIME or Spotfire (TIBCO). Individual fbound or Dfree measurements were associated with experimental metadata and aggregated by condition. Change in fbound was calculated as the difference between the fbound of each well and the median fbound of the DMSO wells on the same plate. Wells that had no cells in the field of view or in which the field of view was out of focus were omitted from further analysis. Compounds were assessed for assay interference using the median fluorescence intensity of the tracking channel and were omitted if this intensity was more than 3 standard deviations higher than the median intensity of the DMSO wells. Similarly, plates where the active and negative controls could not be clearly resolved, or whose performance deviated significantly from that of the rest of the screen, were removed from further analysis. Finally, compounds with a variance more than three standard deviations higher than the average compound variance (41 compounds; 0.08%) were removed from downstream analysis. The Z’-factor between the active controls on a plate and DMSO was calculated as previously described38. EC50 values were calculated in Prism (GraphPad) by first log-transforming the molecule concentrations and then fitting to a four-parameter logistic curve.
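The plate-level calculations can be sketched as follows. The Z’-factor is written in its standard form, 1 − 3(σ_active + σ_DMSO)/|μ_active − μ_DMSO|, and the dose-response model uses the common variable-slope parameterization on log10 concentration; the actual analysis was run in KNIME, Spotfire and Prism, so the function names and the use of sample standard deviations here are assumptions for illustration only.

```python
import statistics

def delta_fbound(well_fbound, dmso_fbounds):
    """Change in f_bound relative to the median of same-plate DMSO wells."""
    return well_fbound - statistics.median(dmso_fbounds)

def z_prime(active, negative):
    """Standard Z'-factor from active-control and negative-control wells."""
    spread = statistics.stdev(active) + statistics.stdev(negative)
    window = abs(statistics.mean(active) - statistics.mean(negative))
    return 1.0 - 3.0 * spread / window

def four_param_logistic(log_conc, bottom, top, log_ec50, hill):
    """Four-parameter logistic response as a function of log10 concentration."""
    return bottom + (top - bottom) / (1.0 + 10 ** ((log_ec50 - log_conc) * hill))
```

At `log_conc == log_ec50` the model returns the midpoint between `bottom` and `top`, which is the definition of the EC50 in this parameterization.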

Sources of variability analysis

The contribution of microscope-to-microscope, plate-to-plate, well-to-well, FOV-to-FOV and cell-to-cell variability to the 2D jump distribution was estimated using a subsampling approach. We consider a simplified model where the observed jump length distribution (Y) is a function of the intrinsic stochasticity in jump length due to diffusion (X) with additional biases introduced at the cell, FOV, well, plate or microscope level:

For simplicity the biases are assumed independent, although these biases may indeed have some dependence.

To estimate the variance of each bias term, we computed the variance in sample means for different resampling schemes. First, we sample N jumps from all data from one replicate of the bioactive screen and compute the average. The resulting sample mean is the average over all sources of variability (Bmicroscope, etc.). In a second step, we sample a microscope and then randomly sample jumps from all plates on that microscope. This averages over all sources of variability except Bmicroscope, so the variance of the resulting sample means can be expected to approach Var(Bmicroscope) for large N. We continue this same sampling scheme at the plate, well, FOV and cell levels, such that for large N it yields estimates of Var(Bplate), Var(Bwell), Var(BFOV), and Var(Bcell), respectively. 1000 rounds of sampling were performed for each resampling scheme, either including all wells or only those containing the vehicle DMSO.
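One resampling scheme can be sketched as below, under the assumption that the model is additive (Y = X plus one bias term per level) and that jumps have already been grouped by the level of interest, for example one list of jump lengths per microscope. The function name, the sampling with replacement, and the fixed seed are illustrative choices rather than details of the analysis.

```python
import random
import statistics

def variance_of_group_means(groups, n_jumps, n_rounds=1000, seed=0):
    """Variance of sample means when each round draws from one random group.

    groups: list of lists of jump lengths, one list per microscope
            (or plate, well, FOV, cell, depending on the scheme).
    n_jumps: number of jumps (N) drawn per round, with replacement.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_rounds):
        group = rng.choice(groups)               # pick one unit at this level
        sample = rng.choices(group, k=n_jumps)   # draw N jumps from that unit
        means.append(sum(sample) / n_jumps)
    return statistics.pvariance(means)
```

Passing a single pooled group reproduces the first scheme, where the sample mean averages over every source of variability and its variance shrinks toward zero as N grows.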

Clustering active molecules

Chemical structure-based clustering was performed on molecules identified as active (239 in total). Molecular frameworks were computed as described by Murcko et al. and as implemented in Pipeline Pilot76. Molecular frameworks were clustered using functional class fingerprints (FCFP_4) with a similarity threshold cut-off of 0.3 Tanimoto distance. A total of 21 clusters were obtained, with singletons being the major class (124 molecules). The next largest group was the flavone class, represented by 27 members, followed by two distinct groups within the steroidal class with 14 and 20 members, respectively. Another category is the stilbene class, with 7 members including tamoxifen. The remaining actives (47 molecules) were grouped into one 3-membered cluster and clusters of 2 members each.
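For readers unfamiliar with the metric, the Tanimoto distance between two fingerprints is one minus the ratio of shared on-bits to total on-bits. The sketch below represents each fingerprint as a Python set of on-bit indices and groups them with a simple leader-style pass; this grouping rule is a stand-in chosen for brevity and is not the Pipeline Pilot clustering procedure, and all names are hypothetical.

```python
def tanimoto_distance(bits_a, bits_b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of on-bit indices."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(bits_a & bits_b) / len(union)

def leader_cluster(fingerprints, max_distance=0.3):
    """Assign each fingerprint to the first cluster whose leader is close enough.

    Returns a list of clusters, each a list of indices into `fingerprints`.
    The first member of every cluster acts as its leader.
    """
    clusters = []
    for index, bits in enumerate(fingerprints):
        for members in clusters:
            leader = fingerprints[members[0]]
            if tanimoto_distance(bits, leader) <= max_distance:
                members.append(index)
                break
        else:
            clusters.append([index])  # no leader within range: new cluster
    return clusters
```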

Kinetic Experiments

Cells were seeded into a 384-well plate the day before, dyed, and washed as described above. One well, with 25 FOVs, was imaged as a baseline reading. Then, while imaging, compound was manually added to each well to a final concentration of 100 nM. Data were then collected for 20 wells. A pause was included between each FOV such that the entire imaging regime covered the assay window. Change in fbound was determined per well relative to t=0.

For assays extending to 4 hours, the plate was imaged twice with 8 FOVs per well, using different FOV locations for each readthrough to prevent photobleaching from impacting the data. All data presented represent three different biological replicates.

Residence Time Imaging

Sample preparation and execution of residence time imaging experiments were conducted in a similar manner to the single molecule tracking assay described above, with a few exceptions. Samples were dyed with 1-10 pM JF549 (Promega) and 50 nM Hoechst 33342 for an hour. 400 frames per field of view were collected with the camera integration time set to 250 msec and the laser sources reduced to 5 mW at the objective. During image acquisition, lasers were on continuously. Compound incubation ranged from 1 to 4 hours. At least 8 well replicates were collected per condition.

Residence Time analysis

Quantifying transcription factor binding times on DNA is an open problem with multiple proposed solutions21,77,78. Here we adopted an approach similar to Hansen et al.19. Image processing, including spot detection, localization, and track reconnection, was performed using the same methods described above. Because residence time imaging selectively tracks slow-diffusing molecules, individual jump reconnections were limited to a maximum displacement of 300 nm. Sets of trajectories for each field of view were binned into 1-CDF distributions as previously described and fit to a two-exponential decay model:

The kslow term comprises the rate of molecule unbinding (koff) as well as photobleaching (kbleach) and diffusion of chromatin out of the focal volume. Approaches often attempt to derive the unbinding rate by applying a correction, such as subtracting a bleaching rate measured either directly in the sample or in a separate control sample19,21, where Tcorrected = (kslow − kbleach)⁻¹. Because Tcorrected depends inversely on this difference, it is highly and nonlinearly sensitive to noise in kbleach. Because the 1-CDF distributions of compound-treated ER samples are so close in decay rate to the control Histone H2B, background subtraction yields infeasible koff values after correction. Instead, we report only the uncorrected kslow values, with the understanding that these represent an upper bound on the actual koff (equivalently, a lower bound on the residence time), but that within-experiment comparisons can be made between conditions.
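The sensitivity argument can be illustrated numerically. The sketch below assumes a two-exponential survival curve of the form A·exp(−kfast·t) + (1 − A)·exp(−kslow·t), and takes the corrected time as 1/(kslow − kbleach), which is our reading of the correction given that kslow is described as containing both koff and kbleach. The rate values in the loop are made up for illustration and are not measurements.

```python
import math

def survival(t, amplitude_fast, k_fast, k_slow):
    """Assumed two-exponential 1-CDF model, equal to 1 at t = 0."""
    fast = amplitude_fast * math.exp(-k_fast * t)
    slow = (1.0 - amplitude_fast) * math.exp(-k_slow * t)
    return fast + slow

def corrected_residence_time(k_slow, k_bleach):
    """Return 1 / (k_slow - k_bleach), or None when the correction is infeasible."""
    difference = k_slow - k_bleach
    if difference <= 0:
        return None  # bleaching estimate meets or exceeds the observed decay rate
    return 1.0 / difference

# Hypothetical rates (per second): small changes in k_bleach near k_slow
# produce large changes in the corrected residence time.
for k_bleach in (0.090, 0.095, 0.105):
    print(k_bleach, corrected_residence_time(0.100, k_bleach))
```

When kslow and kbleach are close, as for the ER and H2B samples described here, a small error in kbleach can double the corrected time or make it undefined, which is why only kslow is reported.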

Fluorescence Recovery After Photobleaching

Images were acquired on a custom-built HILO microscope as described above with a Spectra Light Engine RS-232. Stimulation was directed using a miniscanner coupled to a Coherent OBIS 561 nm 100 mW laser. All imaging was performed using a 60X 1.27 NA water immersion objective (Nikon). All experiments were performed at 37 °C. For FRAP experiments, cells were seeded into a 384-well plate the day before, labeled with 50 nM HTL-JF549, and washed as described above. Compound was added to a final concentration of 100 nM an hour before imaging. A prebleach image was then acquired by averaging 10 consecutive images. Next, 8-10 regions were bleached (2 background, 6-8 cells), and 2 regions in cells were left unbleached. Bleaching was performed at 10% power without scanning. For the next 30 seconds, an image was acquired every 200 ms, then every 1 second for 2 minutes. The background-subtracted average intensity in each region of interest was measured over time and normalized to the average fluorescence in the baseline images, then normalized to the unbleached regions to account for readout-induced photobleaching of fluorophores. Data from 18-24 cells were pooled per experiment for 3 biological replicates.
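The double normalization can be sketched as follows. The sketch assumes that each region is summarized as one mean intensity per time point, that the first `n_baseline` time points are prebleach, and that a single unbleached reference trace is used (for example the average of the two unbleached regions); the function name and these conventions are illustrative.

```python
def normalize_frap(roi, background, unbleached, n_baseline):
    """Return the FRAP recovery curve after background and bleaching correction.

    roi, background, unbleached: mean intensity per time point (same length).
    n_baseline: number of leading prebleach time points (assumed convention).
    """
    def relative_to_baseline(trace):
        # Subtract background, then scale so the prebleach average equals 1.
        corrected = [value - bg for value, bg in zip(trace, background)]
        baseline = sum(corrected[:n_baseline]) / n_baseline
        return [value / baseline for value in corrected]

    roi_rel = relative_to_baseline(roi)
    ref_rel = relative_to_baseline(unbleached)
    # Dividing by the unbleached reference removes acquisition photobleaching.
    return [r / u for r, u in zip(roi_rel, ref_rel)]
```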

Immunofluorescence

Cells were grown under conditions as described previously. Cells were seeded in glass-bottom 384-well plates coated with 0.05 mg/ml PDL (Cat. No. A3890401, Thermo Fisher) at 6000 cells per well for Halo-ER U2OS cells and 8000 cells per well for MCF7 and T47d cells. Cells were grown overnight, followed by compound treatment on the second day for 24 hours at 37 °C and 5% CO2. Compounds were serially diluted in an Echo® Qualified 384-Well Low Dead Volume Source Microplate (0018544, Beckman Coulter) to generate a 21-point dose response at 1:3 dilution starting from a concentration of 10 mM. Compounds were administered at a final 1:1000 dilution in cell culture medium. An 8- to 12-point dose response was selected based on the potency of each compound. Each concentration was replicated at least once per plate, with at least 2 plate replicates. Cells were fixed by addition of paraformaldehyde (Cat. No. 15710-S; Electron Microscopy Sciences) to a final concentration of 4% for 20 minutes. Cells were then permeabilized using blocking buffer containing 1% bovine serum albumin and 0.3% Triton X-100 in 1x PBS for an hour at room temperature. Immunofluorescent staining of ER was carried out using l1ER antibody (1:500, RM-9101) diluted in the same blocking buffer for 1 hour at room temperature. Extensive washing with PBS was performed prior to secondary antibody staining. Secondary antibody staining was carried out using Alexa Fluor 488-conjugated anti-rabbit IgG (1:1000, Cat. No. A32731, Thermo Fisher) for an hour. Nuclear staining was carried out using Hoechst 33342 solution at 1 mg/ml. Imaging of immunofluorescence was done using the ImageXpress Micro (Molecular Devices) at 10x magnification with 4 fields of view per well. Fluorescence intensity within the nucleus was quantified using CellProfiler79. All analysis and curve fitting were carried out using Prism with DMSO as a baseline. ER degradation experiments were performed in three biological replicates with the same source compounds.
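As a worked example of the dosing arithmetic, a 10 mM source diluted 1:1000 into medium gives a top final concentration of 10 µM, and each subsequent point is one third of the previous one. The helper below simply tabulates that series; its name and default arguments are illustrative and mirror the numbers stated above.

```python
def final_concentrations_nM(top_source_mM=10.0, n_points=21, fold=3.0,
                            final_dilution=1000.0):
    """Final in-well concentrations (nM) for a serial dilution series.

    top_source_mM: highest source-plate concentration in mM.
    fold: dilution factor between adjacent points (3 for a 1:3 series).
    final_dilution: dilution from source plate into culture medium.
    """
    top_final_nM = top_source_mM * 1e6 / final_dilution  # 1 mM = 1e6 nM
    return [top_final_nM / fold ** i for i in range(n_points)]
```

The 8- to 12-point responses used for fitting would then be contiguous subsets of this 21-point list, chosen according to compound potency.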

Cell proliferation

Cells were grown and seeded under conditions as described above. Cells were seeded in 384-well plates (Cat. No. 353963, Corning) at 1000 cells per well for Halo-ER U2OS, 1200 cells per well for SKBR3 and 1800 cells per well for MCF7 and T47d. Cells were grown overnight, then treated with compounds the following day. Compound concentration and administration were the same as described previously for the immunofluorescence assay. Plates were scanned in the IncuCyte live cell analysis system (Sartorius) at 24-hour intervals for a total of 5 days using phase contrast. Cell proliferation was quantified with the built-in analysis function using a whole-well confluency mask. All analysis and curve fitting were carried out using Prism with DMSO as a baseline. MCF7 and T47d cell proliferation experiments were performed in four biological replicates with the same source compounds.