Salient experiences are represented by unique transcriptional signatures in the mouse brain

  1. Diptendu Mukherjee
  2. Bogna Marta Ignatowska-Jankowska
  3. Eyal Itskovits
  4. Ben Jerry Gonzales
  5. Hagit Turm
  6. Liz Izakson
  7. Doron Haritan
  8. Noa Bleistein
  9. Chen Cohen
  10. Ido Amit
  11. Tal Shay
  12. Brad Grueter
  13. Alon Zaslaver
  14. Ami Citri (corresponding author)
  1. The Hebrew University, Israel
  2. Weizmann Institute of Science, Israel
  3. Ben-Gurion University of the Negev, Israel
  4. Vanderbilt University School of Medicine, United States
  5. Canadian Institute for Advanced Research, Canada
Research Article
Cite as: eLife 2018;7:e31220 doi: 10.7554/eLife.31220

Abstract

It is well established that inducible transcription is essential for the consolidation of salient experiences into long-term memory. However, whether inducible transcription relays information about the identity and affective attributes of the experience being encoded has not been explored. To this end, we analyzed transcription induced by a variety of rewarding and aversive experiences, across multiple brain regions. Our results describe the existence of robust transcriptional signatures uniquely representing distinct experiences, enabling near-perfect decoding of recent experiences. Furthermore, experiences with shared attributes display commonalities in their transcriptional signatures, exemplified in the representation of valence, habituation and reinforcement. This study introduces the concept of a neural transcriptional code, which represents the encoding of experiences in the mouse brain. This code is composed of distinct transcriptional signatures that correlate with attributes of the experiences that are being committed to long-term memory.

https://doi.org/10.7554/eLife.31220.001

eLife digest

Can we tell what important event a mouse – or even a person – has recently experienced? The current experience of an individual can be inferred from brain imaging experiments. However, along with changing brain activity, such an experience also switches on gene activity throughout the brain. This enables neurons to produce the proteins required to form a long-term memory of the experience.

Do distinct, memorable experiences trigger unique signatures of gene activity? To answer this question, Mukherjee, Ignatowska-Jankowska, Itskovits et al. exposed mice to a variety of experiences. Some were unpleasant and induced aversion; for example, the mouse may have felt nauseous or experienced brief pain and fear. Other experiences, such as when the mouse drank sugary water, received food or was injected with cocaine, were rewarding.

Each of the experiences led to the activation of unique combinations of genes in different regions of the brain. Analysing a subset of the activated genes in various brain regions led to the identification of unique and reliable gene expression signatures of experience. These signatures allowed the recent experience of mice to be decoded with nearly 100% accuracy. While these unique signatures can distinguish between recent experiences, experiences that share common features do trigger overlapping patterns of gene activation. For example, negative experiences – but not positive or neutral ones – activated similar patterns of genes in a brain region called the amygdala. In contrast, repeated rewarding experiences induced a distinct gene activity pattern that was most pronounced as increased activity in part of the brain called the frontal cortex.

These findings increase our understanding of how the brain represents information. The approach described in the paper provides a strategy to measure the changes in the brain that occur when information is encoded for long-term storage. This measure could also be useful during drug development, revealing how new drug compounds affect the brain, as well as providing an objective measure of the subjective experience of an individual. For example, substances that trigger similar patterns of gene activation to addictive drugs may themselves be addictive. On the other hand, substances that induce similar activity patterns to known medications could also have similar therapeutic properties.

https://doi.org/10.7554/eLife.31220.002

Introduction

Neuronal plasticity enables cognitive and behavioral flexibility underlying the development of adaptive behaviors (Alberini, 2009; Alberini and Kandel, 2015). This neuroplasticity, induced by salient experiences, has been shown to depend on the induction of temporally-defined waves of transcription (Alberini, 2009; Alberini and Kandel, 2015; McClung and Nestler, 2008; Flavell and Greenberg, 2008; West and Greenberg, 2011). The earliest of these waves consists of the expression of immediate-early genes (IEGs). IEGs have been conventionally treated as molecular markers for labeling neuronal populations that undergo plastic changes during the formation of long-term memory (Cruz et al., 2013; Minatohara et al., 2015). However, the literature indicates a much more significant contribution of IEGs to synaptic plasticity and memory formation (Lanahan and Worley, 1998; Okuno, 2011). It has been proposed that IEG transcription may represent the molecular signatures of long-term plastic changes underlying the formation of memory (Alberini, 2009). Thus, induced IEG transcription could represent an experience-specific neural code for long-term storage of information. The existence of a neural code embedded in transcription implies that it should be possible to decode the identity of recent experiences, and potentially derive information regarding the nature of the experience, from its transcriptional representation (Stanley, 2013).

To address the existence of a neural transcriptional code, we performed a detailed analysis of IEG transcription for 13 different experiences: cocaine (acute, repeated and challenge), volitional sucrose drinking (acute and repeated), reinstatement of feeding following food deprivation, lithium chloride administration (LiCl; acute and repeated), saline (acute injection without habituation, acute injection after habituation and repeated administration), acute administration of a mild foot shock, and exposure to a novel chamber with no foot shock. The experiences were selected to enable the identification of the transcriptional representations of affective attributes, such as salience and valence (Russell, 1980; Posner et al., 2005). As such, we chose to investigate experiences that drive robust positive or negative reinforcement. Repetition of rewarding and aversive experiences provided insight into the transcriptional representation of habituation to negative stimuli and positive reinforcement of rewarding experiences.

Experiences included in this study have been previously studied using electrophysiological measures, and plasticity has been observed within individual limbic and mesolimbic brain structures (Russo and Nestler, 2013). In contrast to classic electrophysiological measurements of plasticity, which focus on measurements with synapse specificity, transcriptional analysis enables parallel investigation of the representation of experience across multiple brain structures. Assuming that the encoding of complex reinforced experiences involves coordinated neural plasticity in multiple brain regions, we analyzed transcription across structures associated with the limbic and mesolimbic systems (Russo and Nestler, 2013; Haber and Knutson, 2010). The brain structures that were analyzed include limbic cortex (LCtx; including medial prefrontal cortex and anterior cingulate cortex), nucleus accumbens (NAc), dorsal striatum (DS), amygdala (Amy), lateral hypothalamus (LH), dorsal hippocampus (Hipp) and ventral tegmental area (VTA).

Our results demonstrate that the transcriptional representations of experience are robust, reliable and consistent, enabling the decoding of the recent experience of mice with high levels of accuracy from a minimal transcriptional signature. We identify transcriptional hallmarks of affective attributes of experience, prominently demonstrated in the encoding of valence. Moreover, we report opposing patterns of transcriptional modulation underlying the development of habituation to experiences of negative valence, in comparison to repeated rewarding experiences associated with positive reinforcement. We conclude with a discussion of the potential implications of a neural transcriptional code.

Results

Identification of transcriptional signatures of experience

We initiated our study with the investigation of gene expression programs induced during the development of behavioral sensitization to cocaine. Cocaine sensitization is one of the most widely applied paradigms for studying mechanisms of neural plasticity, due to the robustness of the behavioral model and the detailed insight acquired into the underlying mechanisms (McClung and Nestler, 2008; Robbins et al., 2008; Hyman et al., 2006; Nestler, 2002; Robison and Nestler, 2011; Lüscher, 2016; Piechota et al., 2010). Using this paradigm, we studied the transcriptional programs induced following acute or repeated exposure to cocaine, as well as re-exposure to cocaine after a period of abstinence from repeated drug exposures (‘cocaine challenge’) (Figure 1A,B) (Robison and Nestler, 2011). We analyzed the transcription induced at 0, 1, 2 and 4 hr following each of these cocaine experiences across six brain structures (LCtx, NAc, DS, Amy, LH, and Hipp; Figure 1—figure supplement 1). Transcription was analyzed with a comprehensive set of qPCR probes against putative IEGs (see Materials and methods and Supplementary file 1). Our results demonstrate that distinct cocaine experiences (acute, repeated, challenge) are characterized by robust induction of a handful of genes across the different brain structures studied, with peak induction at 1 hr following cocaine administration (Figure 1; Figure 1—figure supplement 2; transcriptional dynamics shown in Figure 1—figure supplement 3). In total, 29 genes were induced more than two-fold in at least one of the six brain regions (predominantly in LCtx, NAc and DS), across the three cocaine experiences.

Figure 1 (with 6 supplements)
Transcriptional signatures representing recent experience.

(A) Schematic of experimental paradigm for cocaine sensitization. Mice were exposed to cocaine (i.p., 20 mg/kg) or saline, either acutely, repeatedly or re-exposed following abstinence (challenge), with transcriptional dynamics studied at 0, 1, 2 or 4 hr. (B) Locomotor activity of mice following acute, repeated or challenge cocaine experiences (compared to saline). Sample size: acute and repeated saline n = 4; acute cocaine n = 30; repeated cocaine n = 22; challenge cocaine n = 19 mice. Results indicate mean ± s.e.m. (C) Expression matrix of IEG induction dynamics following cocaine experiences. Individual animals are represented in columns sorted according to time points of cocaine experiences [sample numbers per time point - LCtx: limbic cortex (n = 5–11), NAc: nucleus accumbens (n = 5–12), DS: dorsal striatum (n = 5–12), Amy: amygdala (n = 3–4), LH: lateral hypothalamus (n = 2–4), Hipp: hippocampus (n = 2–4)]. Fold induction is graded from blue (low) to red (high). Genes represented were induced at least 2-fold over control in any one of the brain regions studied. Genes were sorted according to peak induction in the DS. (D) Schematic of experimental paradigm for LiCl exposure. Mice were exposed to LiCl (i.p.) or saline, either acutely (150 or 250 mg/kg) or repeatedly (150 mg/kg). (E) Locomotor activity of mice following acute or repeated LiCl exposure (as in panel C). Sample size: n = 4–5. (F) Expression matrix of IEG induction dynamics following LiCl experiences (n = 4–5). Genes were sorted according to peak induction in the Amy. (G) Venn diagram indicating the identity of the most robustly induced genes common to cocaine and LiCl experiences (most appearances and lowest variance). 
(H) Confusion matrix representing the classification accuracy of decoding the recent experience (acute, chronic and challenge cocaine, acute and repeated LiCl and acute and repeated saline) of individual mice based on the induction of Arc, Egr2, Egr4, Fos and Fosb in the LCtx, NAc, DS, Amy and LH using a KNN classifier. Accuracy is scaled from blue to green, with bright green corresponding to 100% accuracy (n = 37 mice). Overall accuracy = 97.3%. (I) Results of a permutation test for verifying classification. A randomization test was performed, in which the classifier was run on 10⁵ random permutations of the association of individual mice to the appropriate experience, and the frequency of classification accuracies is plotted in grey, while the red dotted line represents the classification accuracy obtained for non-randomized data (97.3%).

https://doi.org/10.7554/eLife.31220.003

We were next interested in comparing the transcription programs induced by cocaine with those induced by an experience of opposite valence, within the same experimental context. For this purpose, we performed acute, as well as repeated, administration of the pharmacological compound LiCl, which induces malaise and decreases locomotion (Figure 1D,E) (Fortin et al., 2016). Similar to cocaine, LiCl drove robust induction of a small subset of IEGs (Figure 1F). In the case of LiCl experiences, 30 genes were induced above 2-fold, predominantly in the LCtx, Amy and LH (Figure 1, Figure 1—figure supplement 4; two doses of LiCl (150/250 mg/kg) induced transcriptional responses of similar magnitude - Figure 1—figure supplement 4). As a reference for the transcription induced by cocaine and LiCl experiences, we characterized the transcription induced by saline in control animals (before and after habituation, as well as following repeated exposure; Figure 1—figure supplements 5 and 6). Cocaine and LiCl experiences shared a common core set of 16 genes (Arc, Atf3, Cyr61, Dusp1, Egr2, Egr4, Elovl1, Enpp6, Fos, Fosb, JunB, Ier2, Ier5, Nr4a1, Ngfr and Npas4) of which we selected five genes for further investigation. Marker gene selection was performed by ranking genes according to the frequency of their induction (i.e. the number of appearances above two-fold induction from 30 possible appearances across six structures in five experiences), as well as ranking in inverse order of the average variance (S²) of the magnitude of induction. The five genes with combined highest ranking in frequency of appearance and lowest variance in expression were selected for further analysis (Arc, Egr2, Egr4, Fos and Fosb; Figure 1G).
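The marker-selection rule described above (rank genes by how often they exceed two-fold induction, rank them again by ascending variance, then combine the two ranks) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and fold-induction values are invented toy data, the variance is taken across all conditions of a gene, the combined score is assumed to be the plain sum of the two rank positions, and ties are broken by input order. None of these details is specified in the text.

```python
from statistics import variance

def rank_marker_genes(induction, threshold=2.0, n_select=5):
    """Pick marker genes by combined rank of induction frequency and variance.

    induction maps a gene name to its fold-induction values, one value per
    structure-by-experience condition.
    """
    genes = list(induction)
    # Number of conditions in which the gene is induced above the threshold.
    freq = {g: sum(v > threshold for v in induction[g]) for g in genes}
    # Sample variance (S^2) of the magnitude of induction.
    var = {g: variance(induction[g]) for g in genes}
    by_freq = sorted(genes, key=lambda g: -freq[g])  # most frequent first
    by_var = sorted(genes, key=lambda g: var[g])     # least variable first
    # Assumption: combined score = sum of both rank positions (lower is better).
    score = {g: by_freq.index(g) + by_var.index(g) for g in genes}
    return sorted(genes, key=lambda g: score[g])[:n_select]

# Invented toy values for three hypothetical genes across four conditions.
toy_induction = {
    "geneA": [3.0, 3.1, 2.9, 3.0],
    "geneB": [8.0, 1.0, 6.0, 1.5],
    "geneC": [2.5, 2.4, 2.6, 1.0],
}
top_two = rank_marker_genes(toy_induction, n_select=2)
```

In the toy data, geneA is induced in every condition with little spread, so it ranks first on both criteria, while geneB is strongly but inconsistently induced and ranks last.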

To test our hypothesis that experiences can be decoded from patterns of induced transcription, mice were classified based on the induction of five genes (Arc, Egr2, Egr4, Fos and Fosb) across five brain structures (LCtx, NAc, DS, Amy and LH), defining 25 gene-structure ‘features’. Classification performed according to these 25 features using the k-Nearest Neighbors algorithm (KNN) allowed precise allocation of individual animals based on the identity of the recent experience with 97.3% accuracy, such that only one mouse (out of 37) was incorrectly classified (Figure 1H).
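A minimal sketch of how such a classification could be scored is given below. It assumes Euclidean distance, k = 3 and leave-one-out evaluation, none of which is stated in the text, and it uses two invented features per mouse instead of the 25 gene-structure features; the labels `exp1` and `exp2` are placeholders.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(X, y, k=3):
    """Classify each sample using all remaining samples as the training set."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        correct += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return correct / len(X)

# Invented toy feature vectors: one row per mouse, one column per feature.
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
         [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
toy_y = ["exp1", "exp1", "exp1", "exp2", "exp2", "exp2"]
toy_accuracy = leave_one_out_accuracy(toy_X, toy_y, k=3)
```

With real data each row would hold the induction of every selected gene in every selected structure for one mouse, and accuracy is the fraction of mice assigned to their true experience (36 of 37 corresponds to the reported 97.3%).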

Taken together, these results suggest that induced transcriptional signatures, defined by the combinatorial expression of minimal subsets of IEGs across brain structures, can be derived from comprehensive gene expression programs induced following an experience. Moreover, these minimal subsets are sufficient to decode the recent salient experience of mice.

Distinct experiences are represented by unique transcriptional signatures

To further address the existence of a transcriptional code for experience, we expanded our study, including naturalistic volitional experiences of positive valence – sucrose consumption and reinstatement of feeding, as well as foot shock, an additional experience of negative valence. To provide a bird's-eye view of the transcriptional landscape, we represent the experience-specific transcriptional signatures induced by each of these experiences using radar plots (Figure 2). This representation enables immediate identification of the major transcriptional attributes of each of the experiences. Four genes (Arc, Egr2, Egr4 and Fos) are shown for simplicity of presentation; for individual mice, see Figure 2—figure supplement 1. For full data, see Supplementary file 2.

Figure 2 (with 5 supplements)
Salient experiences are represented by unique transcriptional signatures.

(A) Schematic of experimental paradigms. Experiences analyzed include saline (acute and repeated); foot shock (acute shock and no-shock controls exposed to the same environment); LiCl (acute and repeated); cocaine (acute, repeated and challenge following abstinence); sucrose (acute and repeated) and reinstatement of feeding (following 18 hr of deprivation). (B) Radar plots representing the transcriptional induction of Arc, Egr2, Egr4 and Fos across seven brain structures 1 hr after the different experiences [LCtx: limbic cortex (n = 4–14), NAc: nucleus accumbens (n = 4–14), DS: dorsal striatum (n = 4–14), Amy: amygdala (n = 4–9), LH: lateral hypothalamus (n = 3–9), Hipp: hippocampus (n = 4–9); VTA: ventral tegmental area (n = 2–8)]. Results are shown in log2 scale as mean ± s.e.m. of induction over baseline control.

https://doi.org/10.7554/eLife.31220.010

This presentation further highlights the unique nature of the transcriptional signatures characterizing each experience, and the dynamic changes in IEG induction following repeated experience. Furthermore, commonalities in the transcriptional representation of experiences with shared affective attributes are visually apparent in this presentation.
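The radar plots report induction on a log2 scale as mean ± s.e.m. over baseline. A small sketch of that summary is shown here, assuming the fold-induction values are log-transformed per mouse before averaging (the order of these operations is not stated in the text); the input values are invented.

```python
import math
from statistics import mean, stdev

def log2_induction_summary(fold_changes):
    """Return (mean, s.e.m.) of log2 fold induction across animals."""
    logs = [math.log2(fc) for fc in fold_changes]
    m = mean(logs)
    # s.e.m. = sample standard deviation / sqrt(n); undefined for n < 2.
    sem = stdev(logs) / math.sqrt(len(logs)) if len(logs) > 1 else float("nan")
    return m, sem

# Invented fold-induction values (treated / baseline) for three mice.
toy_mean, toy_sem = log2_induction_summary([1.0, 2.0, 4.0])
```

On this scale a value of 1 corresponds to a two-fold induction and 0 to no change, which keeps up- and down-regulation symmetric around the baseline.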

Transcriptional representation of positive and negative valence

To investigate the transcriptional representation of negative valence, we focused on the aversive experiences induced either pharmacologically by LiCl administration, or by acute administration of mild foot shock. It is worth noting that while LiCl and foot shock are both characterized by negative valence, they are otherwise distinct; LiCl causes visceral discomfort and reduced locomotion (Fortin et al., 2016), while foot shock induces acute pain and fear (Bali and Jaggi, 2015). Interestingly, exposure to the experimental context (an 18 × 20 cm perspex chamber with a metal grid floor) was by itself sufficient to induce IEG transcription across multiple structures in naïve mice (‘no shock’ control; Figure 2). Mice that received a foot shock within this context displayed a pattern of transcriptional induction that was otherwise indistinguishable from that of their ‘no shock’ controls (Figure 2—figure supplement 2, Supplementary file 3 - T4), with the sole distinction being a robust induction of transcription in the Amy (primarily of Egr2 and Egr4; Figure 2, Figure 2—figure supplement 2, Statistics Supplementary file 3 – T4, Row 4 – Columns B, C). This result demonstrates transcriptional coding of negative valence in the Amy, induced by the addition of a single variable (foot shock) to the experience of exposure to a novel environment. This observation was supported by the transcriptional representation of acute LiCl, which drove induction of Arc, Egr2 and Fos in the Amy (Figure 2, Figure 2—figure supplement 3, Statistics Supplementary file 3 – T3, Row 4 – Columns A, B, D).

In contrast to the experiences of negative valence, the rewarding experiences of cocaine, sucrose and feeding had a broader representation across brain structures, which was most obvious in the case of feeding, where significant gene induction was observed across all structures studied (Figure 2—figure supplements 4 and 5; Statistics Supplementary file 3 – T6). The representation of acute cocaine was primarily observed in striatal regions (DS and NAc) and small but significant changes were also observed in additional mesocorticolimbic structures (VTA, Amy, LCtx; Figure 2—figure supplement 3; Statistics Supplementary file 3 – T2), while the representation of acute sucrose was quite minimal, and was reinforced upon additional exposure, as discussed below (Figure 2—figure supplement 3; Statistics Supplementary file 3 – T5).

Opposing trajectories of the representation of repeated positive and negative experiences

Repetition of aversive or rewarding experiences drove opposing trajectories of IEG induction (Figure 2; Figure 2—figure supplement 3). Following repeated exposure to LiCl, we observed a significantly diminished transcriptional representation in the Amy, to levels similar to those observed following repeated saline experience [interaction of treatment (LiCl vs saline) and time (acute vs repeated); Egr2: F(1,18) = 8.47, p<0.01; Fos: F(1,20) = 17.2, p=0.001; Arc: F(1,20) = 8.72, p<0.01] (Statistics Supplementary file 3 – T3 – row 4). In contrast, repeated exposure to cocaine administration was associated with enhanced transcriptional induction in the LCtx, DS, and VTA (Statistics Supplementary file 3 – T2 – rows 1,3,7). This enhancement was characterized by the significant induction of Egr2 in the LCtx and DS and Fos in the LCtx, DS, VTA [interaction of treatment (cocaine vs saline) and time (acute vs repeated); Egr2: LCtx F(1,29) = 6.43, p<0.05; DS F(1,29) = 4.58, p<0.05; Fos: LCtx F(1,29) = 5.35, p<0.05; DS F(1,29) = 4.21, p<0.05; VTA F(1,13) = 14.3, p<0.01] (Statistics Supplementary file 3 – T2 – rows 1,3,7 columns B,D). However, in the NAc, the initially robust induction of Egr2 transcription following acute cocaine decreased after repeated administration (interaction of treatment and time, Egr2: F(1,28) = 39.7, p<0.0001) (Figure 2, Figure 2—figure supplement 3; Statistics Supplementary file 3 – T2 – Column B row 2).

Repeated exposure to sugar was also represented by significantly enhanced transcription, most prominently in the LCtx [interaction of sucrose (sucrose vs water) and time (acute vs repeated); Egr2: F(1,26) = 5.02, p<0.05, Fos: F(1,26) = 7.51, p=0.01; Arc: F(1,26) = 6.79, p<0.05] (Figure 2, Figure 2—figure supplement 3; Statistics Supplementary file 3 – T5 – row 1, columns A,B,D). Furthermore, reinstatement of feeding was also represented by significant induction of IEGs in the LCtx, specifically Egr2 and Fos (Egr2: F(2,28) = 13.1, p<0.0001; Fos: F(2,31) = 41.5, p<0.0001) (Figure 2, Figure 2—figure supplements 4 and 5; Statistics Supplementary file 3 – T6 – row 1, columns B,D). The experiences of repeated cocaine, repeated sucrose and reinstatement of feeding, though quite diverse in many affective and cognitive aspects, are all characterized by positive valence and therefore positive reinforcement. Our results suggest that a hallmark of increasing salience of positively reinforcing experiences may be increased transcriptional representation, specifically in the LCtx (Robinson and Berridge, 2008). This transcriptional representation of positively reinforcing experiences contrasts with the diminished transcriptional representation associated with habituation to anticipated and unavoidable aversive experiences.
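The treatment × time interaction statistics quoted in this section ask whether the effect of a treatment differs between acute and repeated exposure. As a rough illustration of how such an F statistic arises, the sketch below computes the interaction term of a two-way ANOVA for a balanced 2 × 2 design. It is a simplified stand-in: the degrees of freedom reported in the text suggest unequal group sizes, which this balanced formula does not handle, and the group labels and values are invented.

```python
from statistics import mean

def interaction_f(cells):
    """Interaction F statistic for a balanced two-way design.

    cells maps (treatment, time) to the list of measurements in that group.
    Returns (F, df_interaction, df_within).
    """
    treatments = sorted({t for t, _ in cells})
    times = sorted({s for _, s in cells})
    n = len(next(iter(cells.values())))
    assert all(len(v) == n for v in cells.values()), "balanced design required"
    cell_mean = {k: mean(v) for k, v in cells.items()}
    grand = mean(cell_mean.values())
    row = {t: mean(cell_mean[(t, s)] for s in times) for t in treatments}
    col = {s: mean(cell_mean[(t, s)] for t in treatments) for s in times}
    # Interaction: what is left of each cell mean after removing both main effects.
    ss_int = n * sum((cell_mean[(t, s)] - row[t] - col[s] + grand) ** 2
                     for t in treatments for s in times)
    # Within-group (error) sum of squares.
    ss_within = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)
    df_int = (len(treatments) - 1) * (len(times) - 1)
    df_within = len(cells) * (n - 1)
    return (ss_int / df_int) / (ss_within / df_within), df_int, df_within

# Invented toy data: induction rises only in the treated, acute group.
toy_cells = {
    ("control", "acute"): [1.0, 2.0, 3.0],
    ("control", "repeated"): [1.0, 2.0, 3.0],
    ("treated", "acute"): [5.0, 6.0, 7.0],
    ("treated", "repeated"): [1.0, 2.0, 3.0],
}
toy_f, toy_df_int, toy_df_within = interaction_f(toy_cells)
```

A large F here means the acute-versus-repeated change is not the same in treated and control animals, which is the pattern the habituation and reinforcement comparisons rely on.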

Decoding recent experiences of individual mice from minimal transcription

Finally, we tested our capacity to decode the recent experience of mice on the full complement of experiences studied in this project. The transcriptional induction of five genes (Arc, Egr2, Egr4, Fos and Fosb) across five structures (LCtx, NAc, DS, Amy, LH) forms 25 gene-structure ‘features’, which were used for the decoding with the KNN algorithm. We found that these 25 features supported the decoding of the recent experience of individual mice with 90.7% accuracy (Figure 3A). Random shuffling of the association of mice to experiences demonstrated the reliability of the classifier, and the potential for our results to generalize beyond the given dataset (p<1e−5; Figure 3B). These results suggest that obtaining a reliable transcriptional representation of a recent experience requires knowledge regarding both the transcriptional induction of several genes and the identity of structures within which they are induced. To further test this hypothesis, we repeated the decoding with reduced sets of features. We tested the capacity to decode recent experiences following averaging the data for each gene across the five tested structures (losing spatial information; Figure 3—figure supplement 1; classification accuracy 55%), as well as decoding by individual structures (the expression of five genes in a single structure; Figure 3C; classification accuracies 33–56%) or individual genes (the expression of a single gene across five structures; Figure 3D; classification accuracies 37–70%). Taken together, while we find that measurement of the expression of individual genes, such as Fos and Egr2, across the five brain structures can support classification (67%, 70% respectively), the prediction is significantly improved by the measurement of multiple features (Figure 3A).
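The label-shuffling control mentioned above can be sketched as follows. The sketch assumes a leave-one-out 1-nearest-neighbour classifier as a stand-in for the KNN decoder, adds one to numerator and denominator so the estimated p-value is never exactly zero, and runs far fewer shuffles than the 10⁵ used in the study; the data and labels are invented.

```python
import math
import random

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    hits = 0
    for i, x in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: math.dist(X[j], x))
        hits += y[nearest] == y[i]
    return hits / len(X)

def permutation_p_value(X, y, n_perm=1000, seed=0):
    """Compare the observed accuracy with accuracies on shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    labels = list(y)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break the mouse-to-experience association
        if loo_accuracy(X, labels) >= observed:
            at_least += 1
    # Add-one correction: the unshuffled labelling counts as one permutation.
    return observed, (at_least + 1) / (n_perm + 1)

# Invented toy data: two well-separated groups of four mice each.
toy_X = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.0], [0.0, 0.2],
         [5.0, 5.0], [5.1, 5.1], [5.2, 5.0], [5.0, 5.2]]
toy_y = ["exp1"] * 4 + ["exp2"] * 4
toy_observed, toy_p = permutation_p_value(toy_X, toy_y, n_perm=200)
```

If the features carry no information about experience, shuffled labels should decode about as well as the true ones; a small p-value indicates that the observed accuracy is unlikely to arise by chance.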

Figure 3 (with 2 supplements)
Decoding the recent experience of individual mice from minimal transcriptional signatures.

(A) Confusion matrix representing the classification accuracy (90.7%) of decoding the recent experience of individual mice based on 25 features. Efficiency is scaled from blue to green, with bright green corresponding to 100% efficiency (n = 54 mice). (B) Verification of classification validity. A randomization test was performed, in which the classifier was run on 10⁵ random permutations of the association of individual mice to the appropriate experience, and the frequency of classification accuracies is plotted in grey, while the red dotted line represents the classification accuracy obtained for non-randomized data (90.7%). (C, D) Confusion matrices representing the classification accuracy of decoding utilizing transcriptional measurements from individual brain structures (five genes in one structure; C) or individual genes (single genes across five structures; D). Dots represent the identity of the experience, color-coded according to A. X and Y axes denote the actual and predicted conditions. (E) A decision tree enabling the classification of mice according to experience by minimal gene expression (one of many possible trees which can equivalently segregate the data). Mice are classified based on features that enable maximal segregation at each internal node. The thresholds define the allocation of mice to the left branch of the tree at each bifurcation. Mice are color-coded according to experience.

https://doi.org/10.7554/eLife.31220.016

With the objective of identifying the individual features that provide maximal support for decoding, we performed Random K-Nearest Neighbor (R-KNN) feature selection (Figure 3—figure supplement 2A) (Li et al., 2011). We identified that a combination of eight features (expression of Egr2 and Fos in the LCtx, NAc and Amy, and expression of Egr2 and Fosb in the DS) provided the highest support, with a decoding efficiency of 93.6% (Figure 3—figure supplement 2B,C). An independent approach for feature selection (Breiman Random Forest [Breiman, 2001]) identified a largely overlapping set of features, with the top 10 features supporting a classification accuracy of 94.4% (p<1e−5; Figure 3—figure supplement 2E–H). An intuitive representation of the divergence of experiences based on particular features is provided by a decision tree (one of a number of possible trees), in which mice were assigned to appropriate branches according to the extent of induction of a particular gene in a given structure (Figure 3E).
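To make the decision-tree idea concrete, the sketch below finds the single feature and threshold that best separate a set of labelled mice, which is the choice made at each internal node of such a tree. It assumes Gini impurity as the segregation criterion and sends values at or below the threshold to the left branch; neither choice is stated in the text, and the feature names and values are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 when all labels agree, larger when they are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_names):
    """Return (weighted impurity, feature name, threshold) of the best split."""
    best = None
    n = len(y)
    for f, name in enumerate(feature_names):
        values = sorted({row[f] for row in X})
        # Candidate thresholds lie midway between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, name, thr)
    return best

# Invented toy data: the first feature separates the two experiences cleanly.
toy_features = ["geneA_region1", "geneB_region2"]
toy_X = [[1.0, 6.0], [2.0, 1.0], [4.0, 7.0], [5.0, 2.0]]
toy_y = ["exp1", "exp1", "exp2", "exp2"]
toy_split = best_split(toy_X, toy_y, toy_features)
```

Applying the same search recursively to the left and right groups yields a full tree; because several features can separate the data equally well, more than one tree can fit the same mice.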

Taken together, these results establish that a minimal set of transcriptional markers form representative signatures of recent experience, enabling precise decoding of recent salient experiences at the resolution of individual mice.

Discussion

The brain creates representations of the world, encoding salient information for long-term storage to support the development of adaptive behaviors. In real time, the representation of information has been shown to be correlated with neural activity in distinct brain structures (Bialek et al., 1991). Powerful demonstrations of the potential to decode sensory experiences and correlates of emotional state have been made in both rodents and humans from neural activation patterns using in-vivo electrophysiology, fMRI, and other imaging techniques (Horikawa et al., 2013; Santoro et al., 2017; Kragel et al., 2016; Lin et al., 2005; Reber et al., 2002). In this study we demonstrate that multiplexed IEG expression data from multiple regions of the mouse brain enables the decoding of recent salient experiences with high precision. We show that beyond mere ‘activity markers’ for labeling neurons activated during an experience, IEG expression provides a quantitative and scalable metric, representing a neural transcriptional code for recent experience. This neural transcriptional code is defined by the combinatorial expression of marker transcripts across brain regions. Interestingly, we find components of induced transcriptional signatures that are associated with affective attributes of the experiences that are being encoded. Moreover, these IEG expression patterns are modulated following repeated administration of a stimulus of positive or negative value, suggesting a role for inducible transcription in sustaining long-term plasticity underlying the development of adaptive behavior. As this code is comprised of molecular components, it also provides a rich resource for biological insight into the processes underlying the long-term encoding of experience-dependent plasticity.

Transcriptional markers have been successfully utilized for the classification of developmental stages (Matcovitch-Natan et al., 2016), diseases (Lamb, 2007; McKinney et al., 2010), and many other aspects of contemporary biomedical science (Collins and Varmus, 2015). Here we describe the utility of transcriptional markers for classification of salient experiences characterized by diverse affective properties. While the information embedded in the expression pattern of a single gene is not sufficient, a minimal subset of transcriptional markers enables the decoding of recent experience with high accuracy. Importantly, the principles we identify likely generalize to a broader set of experiences. Furthermore, it is likely that the markers we utilize in our study could be substituted by other marker genes, providing similar classification accuracy.

According to the Russell circumplex model (Russell, 1980; Posner et al., 2005), affect can be defined in two dimensions – valence and salience. Valence has been suggested to be encoded in the Amy, PFC, NAc and VTA (Namburi et al., 2016). Our results demonstrate that experiences of negative valence are represented by a distinct transcriptional induction in the Amy. In contrast, experiences of positive valence induce transcription in the LCtx, NAc, DS and VTA. Moreover, we report that upon repetition, the transcriptional representation within these structures is dynamically modulated, potentially underlying long-term adaptations following positive and negative reinforcement. Taken together, our results suggest that inducible transcription is a rich resource for the identification of brain regions that encode properties of an experience, providing biological insight into the molecular processes underlying experience-dependent plasticity. It should be noted that in this study we focused our analysis on structures associated with the limbic and mesolimbic systems. It is highly likely that transcriptional signatures across other brain areas (as well as for other experiences) would be related to different attributes of the experience, besides affect or valence.

To explain how changes in transcription could affect future behavior, we introduce the concept of ‘predictive transcriptional coding’. Predictive transcriptional coding frames inducible transcription not as a reporter of a recent event, but rather as encoding the valuation of the experience. This experience-dependent plasticity, mediated by transcription, sets the state of the network in the context of a particular experience, priming it for prospective network plasticity, and adjusting the response of the individual to the occurrence of a similar event in the future. This notion is conceptually similar to the ‘reward prediction error’ (Schultz, 2010), but is established on prolonged time scales. In this respect, transcription also serves as a ‘salience filter’ – defining whether an experience is significant enough to induce plasticity and worthy of encoding for long-term storage. Thus, the valuation of an experience that passes the ‘salience filter’ is encoded by the identity of the neural circuits recruited by the experience and the magnitude of transcription induced within them. A crucial question arising from this concept is: how is the threshold to commit to induction of transcription determined in neurons and neural networks? One possibility, worthy of future investigation, was proposed in a landmark treatise, in which the analogy of a ‘genomic action potential’ was drawn for mechanisms underlying inducible transcription (Clayton, 2000). According to this hypothesis, the threshold for commitment to transcription depends on the coincidence of glutamatergic and neuromodulatory inputs.

Our work provides a numerical definition of the imprint of recent experience, demonstrating a quantitative and predictive approach for the analysis of neural plasticity underlying adaptive behavior. Quantitative definitions of interoceptive states are expected to have implications for drug development, providing objective metrics for comprehensive characterization of the perception and valuation ascribed to an experience by individual subjects. For example, in the context of abuse liability, an objective quantitative interoceptive metric of the hedonic potential of a compound could increase standardization, reducing the reliance on variable behavioral outcomes.

While there is substantial investment being made in the development of methodologies for transcriptional profiling with deeper coverage and increasing spatial resolution, our study demonstrates that fundamental phenomena can be identified by applying simple methods with low spatial resolution and coverage. Future work, applying tools of higher resolution, could build on our observations to address additional questions – such as the spatial distribution of neuronal ensembles recruited by experience and the identity of cell types recruited by distinct experiences.

Approaches for non-invasive quantitative measurement of the encoding of experience can be envisioned, utilizing fluorescent markers of inducible transcription in combination with whole-brain imaging (Eguchi and Yamaguchi, 2009). New technologies are rapidly emerging for whole-brain analyses of transcription (Renier et al., 2016; Sylwestrak et al., 2016; Ye et al., 2016), as are strategies for comprehensive profiling of single neurons (Citri et al., 2011; Lacar et al., 2016). These technological developments, together with the novel concept we develop here, are expected to provide the foundation for a new area of neuroscience research. This discipline, of ‘Behavioral Transcriptomics’, will apply transcriptional analysis for investigation of intricate mechanisms of neural circuit plasticity underlying cognition. We propose that the approach of behavioral transcriptomics will provide a systems-level view of the encoding of experiences to long-term memory. One could speculate that different attributes of an experience may be mediated by activation of defined signaling pathways at different cellular locations, each inducing a component of the transcriptional program. If so, taken to its extreme, deciphering this transcriptional code will enable precise decoding of synapse-specific plasticity from quantitative analysis of inducible transcriptional markers.

Materials and methods

Animals

Male C57BL/6 mice aged 6–8 weeks (Harlan Laboratories, Jerusalem, Israel) served as subjects for the study. Mouse body mass ranged from 18 to 35 g; within each repetition of an experiment, the difference in body mass between animals in different experimental groups did not exceed four grams. Four to five mice were housed per cage in all experiments except for sucrose consumption experiments, for which animals were single-housed. Mice were maintained on a 12–12 hr light/dark cycle (0700 on/1900 off), in a temperature (20–22°C) and humidity (55 ± 10%) controlled facility. Mice received ad libitum access to water and food, with the exception of the experiment studying reinstatement of feeding, in which they were food deprived for 18 hr before reinstatement of feeding. Mice were randomly assigned to experimental groups and tested according to a Latin square design. All tests were conducted during the light phase of the circadian cycle. Each experiment was performed at least twice, by independent researchers in the group, and provided similar results. All animal protocols were approved by the Institutional Animal Care and Use Committees at the Hebrew University of Jerusalem and were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals. A table defining the number of mice (‘n’) contributing to each experiment is included as Supplementary file 2.

Behavioral assays

Mice were acclimated to the animal facility for 2–5 days, followed by 3–4 days of experimenter handling, before the start of an experiment. Maintenance of uniform conditions across experiments and extensive handling were essential for reducing experimental variability, enabling the identification of a robust transcriptional response specifically induced by the experience being tested, with minimal contamination from contextual background.

Behavioral sensitization to cocaine. Mice were subjected to three days of intraperitoneal (i.p.) saline injections (250 µl/injection), prior to exposure to cocaine (20 mg/kg freshly dissolved in physiological saline to 2 mg/ml and injected at a volume of 10 ml/kg; cocaine was obtained from the pharmacy at Hadassah Hospital, Jerusalem). The acute cocaine group received a single i.p. dose of cocaine, followed by analysis of locomotor behavior for 15 min in a video-monitored open-field arena. Animals were finally taken from their home cage and sacrificed at 1, 2 or 4 hr following the cocaine injection. The repeated cocaine group received five consecutive daily injections of cocaine and was studied (similarly to the acute cocaine group) following the fifth cocaine injection. The challenge cocaine group was treated as the repeated cocaine group, and then made abstinent from cocaine for 21–22 days, following which the mice were challenged with cocaine and re-exposed to the open-field arena. All responses were normalized to baseline controls (time 0), which were interleaved with their peer group, but were not treated on the day of the experiment. Additional reference groups included acute saline without habituation, which were habituated to the open-field arena for three days after a brief period of handling, and were sacrificed 1 hr following a single injection of saline. Responses in this group were normalized to controls (time 0), which were not exposed to any saline injections. The group of acute saline without habituation served as a reference for the habituation of the acute saline group, in which animals were treated identically to the acute cocaine group (i.e. three consecutive days of habituation to saline injections in the open-field arena), but received a saline injection on the day of the experiment. Following each i.p. injection, mice were placed in an open-field arena for 20 min, during which locomotion was assayed between minutes 2 and 17.

LiCl exposure. All mice were habituated to injections of saline and locomotor monitoring in an open-field arena for three days preceding onset of the experiment. Animals were subjected to either acute or repeated administration of LiCl (Sigma-Aldrich, St. Louis, MO, USA). In acute LiCl experiments, mice were administered either a single dose of LiCl (at 150 or 250 mg/kg) or saline. In the experiments testing repeated LiCl, mice received LiCl (150 mg/kg) for five consecutive days, and following a 48 hr break were re-exposed to LiCl or saline. Mice were divided into four groups: (a) received saline injections for five days and were not exposed to an injection on the last day (saline-0h); (b) received LiCl injections for five days and were not exposed to an injection on the last day (LiCl-0h); (c) received saline injections for five days and were subjected to a saline injection on the last day (repeated saline); (d) received LiCl injections for five days and were exposed to a LiCl injection on the last day (repeated LiCl). In all experiments, immediately following administration of LiCl or saline, mice were placed in video-monitored open-field arenas for 30 min.

Reinstatement of feeding. Mice were food deprived for 18 hr before the experiment and then re-exposed to food for 1, 2 or 4 hr before they were sacrificed. Control animals (0 hr) were sacrificed immediately after the 18 hr food restriction. An additional reference group was allowed to continuously feed.

Sucrose consumption. Mice were single-housed for at least seven days before the experiment and habituated to the addition of a second water bottle in the cage for three days before the onset of the experiment. Acute exposure to sucrose was tested by habituating mice to the bottle with 10% sucrose overnight (16 hr), and 48 hr later, re-exposing the mice to a bottle with sucrose or water (control) for 1 hr. Repeated exposure to sucrose was tested by exposing mice to sucrose for eight consecutive days, 2 hr each day (12:00-14:00), and after a 48 hr break, re-exposing them to sucrose or water (control) for 1 hr. Mice were sacrificed 1 hr following the exposure to sucrose. Sucrose and water intake were measured as a test for sucrose preference over water.

Foot shock. Following habituation to the experimental setup, the mice were placed in the experimental chamber (20 × 18 cm) for three minutes, during which time baseline freezing behavior was measured. At three minutes, each subject received three mild foot shocks (2 s, 0.7 mA) separated by 30 s intervals, and post-shock freezing behavior was assessed immediately thereafter for 30 s before return to the home cage. Freezing, defined as a lack of movement other than respiration, was measured using Ethovision software (Noldus, Wageningen, The Netherlands).

Locomotor activity measurement

Locomotor activity was assessed in sound- and light-attenuated open-field chambers. Mice were placed individually in a clear, dimly lit Plexiglas box (30 × 30 × 30 cm) immediately after injection of cocaine, LiCl or saline. Activity was monitored with an overhead video camera for 20 or 30 min (in cocaine sensitization and LiCl experiments respectively) using Ethovision software (Noldus, Wageningen, The Netherlands).

Dissections

Dissections were performed as previously described (Turm et al., 2014). Mice were deeply anesthetized with Isoflurane (Piramal Critical Care, Bethlehem, PA, USA) and euthanized by cervical dislocation, followed by rapid decapitation and harvesting of brains into ice-cold artificial cerebrospinal fluid (ACSF) solution (204 mM sucrose, 26 mM NaHCO3, 10 mM glucose, 2.5 mM KCl, 1 mM NaH2PO4, 4 mM MgSO4 and 1 mM CaCl2; all from Sigma-Aldrich, St. Louis, MO). Coronal slices (400 µm) were cut on a vibrating microtome 7000 smz2 (Camden Instruments, Loughborough, UK) in ice-cold ACSF. Brain regions [Limbic cortex (LCtx), Nucleus Accumbens (NAc), Dorsal Striatum (DS), Amygdala (Amy), Lateral Hypothalamus (LH), Hippocampus (Hipp) and Ventral Tegmental Area (VTA)] were dissected from relevant slices under a stereoscope (Olympus, Shinjuku, Tokyo, Japan). Samples of LCtx, NAc, DS, Amy, LH and Hipp were obtained from 2 × 400 µm thick sections, while VTA was obtained from 2 × 200 µm thick sections (Figure 1—figure supplement 1). All of the steps were performed in strictly cold conditions (~4°C) and care was taken to avoid warming of the tissue sections or the ACSF at all times. The tissue pieces were immediately submerged in Tri-Reagent (Sigma-Aldrich, St. Louis, MO) and stored at −80°C until processing for RNA extraction.

Marker selection, RNA extraction, qPCR and microfluidic qPCR

The strategy for marker selection consisted of three steps. The initial list of candidate IEGs was compiled from a whole-genome microarray analysis of transcriptional dynamics induced by cocaine experiences in the nucleus accumbens (Illumina MouseRef-8 v2 Expression BeadChip microarrays; data not shown), as well as a survey of literature and databases pertaining to IEG expression. qPCR primer pairs were developed for 212 genes and primer efficiency was tested, resulting in selection of 152 optimal primer pairs. Differential expression of the shortlisted IEGs was then tested on samples from multiple brain structures, dissected from mice following cocaine and LiCl experiences, utilizing microfluidic qPCR arrays. Genes that displayed at least 1.25-fold induction in any measurement were shortlisted, resulting in a list of 78 genes. The next round of feature selection involved ranking genes based on their frequency of induction and variance. For ranking based on frequency of induction, we counted the number of times each gene was induced above a threshold of two-fold induction across the different brain structures (LCtx, NAc, DS, Amy, LH and Hipp) in the cocaine (acute, repeated and challenge) and LiCl (acute and repeated) conditions (i.e. induction in six structures × five experiences = #/30). In addition, we ranked genes in inverse order of average variance (S²) of their induction across structures. The five genes that were induced most consistently (combined highest ranking in frequency and lowest in variance) were selected for further investigation. The ranking of these genes was as follows: Arc (# = 22/30, S² = 2.9), Egr2 (# = 21/30, S² = 2.8), Egr4 (# = 18/30, S² = 1.53), Fos (# = 14/30, S² = 0.43), Fosb (# = 11/30, S² = 0.6). Thus, criteria for marker selection were orthogonal to the tested hypothesis, supporting unbiased analysis.
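The ranking step can be illustrated with a short sketch. This is not the authors' code: the function name `rank_markers`, the toy values, the use of a single sample variance per gene, and the choice to combine the two rankings by summing rank positions are all assumptions made for illustration. The text states only that the genes with the combined highest frequency ranking and lowest variance ranking were selected.

```python
from statistics import variance

def rank_markers(fold_induction, threshold=2.0, top_n=5):
    """Rank genes by induction frequency (higher is better) and variance (lower is better).

    fold_induction maps gene -> list of fold-induction values, one per
    structure x experience measurement (30 per gene in the study).
    """
    freq = {g: sum(v > threshold for v in vals) for g, vals in fold_induction.items()}
    var = {g: variance(vals) for g, vals in fold_induction.items()}

    # Position 0 = most frequently induced / least variable.
    by_freq = sorted(freq, key=lambda g: -freq[g])
    by_var = sorted(var, key=lambda g: var[g])
    # Assumption: the combined score is the sum of the two rank positions.
    combined = {g: by_freq.index(g) + by_var.index(g) for g in fold_induction}

    selected = sorted(combined, key=lambda g: combined[g])[:top_n]
    return selected, freq, var

# Hypothetical toy values (not data from the study).
toy = {
    "GeneA": [2.5, 3.0, 2.8, 2.6],   # consistently induced
    "GeneB": [1.1, 6.0, 0.9, 1.0],   # rarely induced, highly variable
    "GeneC": [2.1, 2.2, 1.9, 2.3],   # often induced, low variance
}
selected, freq, var = rank_markers(toy, top_n=2)
```

Because this ranking uses only how often and how consistently a gene is induced, and not which experience induced it, it does not depend on differences between experiences.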

RNA extraction was performed strictly in cold RNase-free conditions. Tissue was homogenized using a 25G needle attached to a 1 ml syringe or using TissueLyser LT (Qiagen, Redwood City, CA, USA). The homogenate was centrifuged at high speed (15,000 × g for 10 min) and the supernatant was mixed with chloroform (Bio-Lab, Jerusalem, Israel) by vigorous shaking and centrifuged (15,000 × g for 15 min) to separate the RNA from other nucleic acids and proteins. Isopropanol (J. T. Baker, Center Valley, PA) and glycogen (Roche, Basel, Switzerland) were added to the aqueous layer and samples were placed either at −20°C for 24 hr or at −80°C for 1 hr (producing comparable results). The samples were centrifuged at high speed (15,000 × g for 15 min) to precipitate the RNA. The RNA was then washed in 75% ethanol (J. T. Baker, Center Valley, PA) by centrifugation (12,000 × g for 5 min), dried and dissolved in ultrapure RNase-free water (Biological Industries, Kibbutz Beit Haemek, Israel). RNA concentration was measured with a NanoDrop 2000c spectrophotometer (Thermo, Wilmington, DE) and random-primed cDNA was prepared from 100 to 300 ng of RNA, using a High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA), following manufacturer guidelines.

cDNA was processed for qPCR analysis using qPCR primer pairs (IDTDNA, Coralville, IA) and SYBR Green in a LightCycler 480 Real-Time PCR Instrument (LightCycler 480 SYBR Green I Master; Roche, Basel, Switzerland) according to manufacturer guidelines. Relative levels of gene expression (ΔCt) were obtained by normalizing gene expression to a housekeeping gene (GAPDH). Fold induction was calculated using the ΔΔCt method, normalizing experimental groups to the average of a relevant control group.

Microfluidic qPCR, querying 96 samples against 96 sets of qPCR probes, was performed utilizing Fluidigm Biomark Dynamic IFC (integrated fluidic circuit) Arrays (Fluidigm Corp, South San Francisco, CA). Briefly, samples were subjected to targeted preamplification to enrich for specific gene products, which were then assayed with dynamic array fluidic microchips. Sample preparation was performed according to previously published protocols (Turm et al., 2014). Targeted pre-amplification (STA) was achieved by mixing samples with a set of diluted primer pairs in TaqMan PreAmp Mastermix (Applied Biosystems; Foster City, CA, USA) followed by 10 min of denaturation at 95°C and 14 cycles of amplification (cycles of 95°C for 15 s and 60°C for 4 min). Primers were then eliminated by use of ExoI exonuclease (NEB; Ipswich, MA); samples were placed in a thermal cycler at 37°C for 30 min and then at 80°C for 15 min. Samples were then loaded onto a primed dynamic array for qPCR in a specialized thermal cycler [Fluidigm Biomark; Thermal mixing: 70°C for 40 min, 60°C for 30 s, 95°C denaturation for 60 s, followed by 40 cycles of PCR (96°C for 5 s, 60°C for 20 s)]. For data analysis, a reference set of genes was identified, whose expression remained constant across all experimental conditions (Dkk3, Tagln3, Gars, Scrn1, Rpl36al, Mcfd2, Psma7 and Hpcal4). In order to reduce the potential for introduction of experimental error by normalization to a single gene, a 'global-normalization' Ct value was created for each sample from the average Ct values of the genes within the reference set. Fold induction was calculated using the ΔΔCt method, normalizing each gene in a sample to the global-normalization value (ΔCt), followed by normalization of the experimental groups to the average of their relevant control group.
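The ΔΔCt calculation with global normalization can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names, the two-gene reference set and the Ct values are hypothetical, the control average is taken over ΔCt values, and fold induction is computed as 2^(−ΔΔCt), which assumes perfect amplification efficiency.

```python
from statistics import mean

def delta_ct(sample_ct, gene, reference_genes):
    """dCt: target Ct minus the mean Ct of the reference set ('global normalization')."""
    global_norm = mean(sample_ct[g] for g in reference_genes)
    return sample_ct[gene] - global_norm

def fold_induction(experimental, controls, gene, reference_genes):
    """Fold induction by the ddCt method, relative to the mean dCt of the control group."""
    control_dct = mean(delta_ct(s, gene, reference_genes) for s in controls)
    ddct = delta_ct(experimental, gene, reference_genes) - control_dct
    return 2.0 ** (-ddct)  # assumes 100% amplification efficiency

# Hypothetical Ct values (not data from the study); a real reference set would be larger.
refs = ["Dkk3", "Tagln3"]
controls = [
    {"Fos": 28.0, "Dkk3": 20.0, "Tagln3": 22.0},
    {"Fos": 28.0, "Dkk3": 20.0, "Tagln3": 22.0},
]
treated = {"Fos": 26.0, "Dkk3": 20.0, "Tagln3": 22.0}

fold = fold_induction(treated, controls, "Fos", refs)  # dCt 5 vs. 7, so ddCt = -2
```

In this toy case the target gene crosses threshold two cycles earlier than in controls, relative to the reference set, which corresponds to a four-fold induction.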

Data analysis

All data are presented as mean ± standard error of the mean (s.e.m.). Data were analyzed using one-way or two-way analysis of variance (ANOVA), as appropriate. Tukey or Dunnett tests were used for post hoc analyses of significant ANOVAs to correct for multiple comparisons. Differences were considered significant at the level of p<0.05. Statistical analysis was performed, and bar graphs and line graphs were created, with Prism 6.0 (GraphPad, San Diego, CA). Heat maps were created in MATLAB R2012a (MathWorks, Natick, MA). Radar plots were created in Origin 6.0 (OriginLab, Northampton, MA). Code was written in MATLAB R2015b (MathWorks, Natick, MA), and confusion matrices and randomization plots were created in Python using the Matplotlib library (http://matplotlib.org).

Computational analyses

The analysis was performed on data obtained from 54 mice, each of which underwent one of the experiences (acute, repeated or challenge cocaine, acute and repeated sucrose, reinstatement of feeding, acute and repeated LiCl, and foot shock and no-shock controls exposed to the same environment). Each mouse was represented by a vector of twenty-five features, corresponding to the induction of five genes (Arc, Egr2, Egr4, Fos and Fosb) across five structures [limbic cortex (LCtx), nucleus accumbens (NAc), dorsal striatum (DS), amygdala (Amy) and lateral hypothalamus (LH)]. Each gene-structure combination was defined as a ‘feature’.

Supervised classification

The classifier used was k-Nearest Neighbors (KNN), with k = 1 over the Euclidean space, unless otherwise stated. This approach was selected based on the observation that the transcriptional response of mice within an experience group formed unique clusters. We evaluated the performance of our classification by a leave-one-out method. In this approach, we iterated over each sample in our training set and classification was performed given the rest of the training set. Visualization of the accuracy of classification was performed using a confusion matrix, which conveys both mean precision and mean recall of each condition classified.
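As a minimal sketch of this procedure, the following implements a 1-nearest-neighbor classifier over Euclidean distance with leave-one-out evaluation and a confusion count. The function names and the toy two-feature vectors are hypothetical; the study used twenty-five features per mouse, and the authors' own implementation may differ.

```python
import math
from collections import defaultdict

def nearest_neighbor_label(query, samples):
    """Return the label of the sample closest to query in Euclidean distance (k = 1)."""
    return min(samples, key=lambda s: math.dist(query, s[0]))[1]

def leave_one_out(dataset):
    """dataset: list of (feature_vector, label). Returns (accuracy, confusion counts)."""
    confusion = defaultdict(int)  # (true_label, predicted_label) -> count
    correct = 0
    for i, (vec, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]  # hold out sample i
        predicted = nearest_neighbor_label(vec, rest)
        confusion[(label, predicted)] += 1
        correct += predicted == label
    return correct / len(dataset), dict(confusion)

# Hypothetical 2-feature vectors (not data from the study).
toy = [
    ([3.0, 1.0], "cocaine"), ([3.2, 0.9], "cocaine"), ([2.9, 1.1], "cocaine"),
    ([1.0, 3.0], "LiCl"), ([0.9, 3.1], "LiCl"), ([1.1, 2.8], "LiCl"),
]
accuracy, confusion = leave_one_out(toy)
```

Off-diagonal entries of `confusion` (true label differing from predicted label) are the misclassifications that a confusion matrix makes visible.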

Feature selection

Feature selection was performed using the Random k-Nearest Neighbors (RKNN) (Li et al., 2011) or Breiman Random Forest (RF) (Breiman, 2001) algorithms. For RKNN, the contribution of each feature to the classification of individual experiences was termed its support. We chose a large number (n = 1e6) of random subsets of the twenty-five available features, of varying sizes (between one and twenty-five). For each such subset we trained a classifier. Each feature f appeared in a set of KNN classifiers, C(f), of size M, where M is the multiplicity of f. In turn, each classifier c ∈ C(f) is an evaluator of its m features. We defined the support of a feature f as the mean accuracy of all the classifiers in C(f). Namely:

$$\mathrm{support}(f) = \frac{1}{M} \sum_{c \in C(f)} \mathrm{accuracy}(c)$$

To further examine the effect of feature set sizes on classification performance we evaluated the classification accuracy of different subset sizes in the following manner: for each case, we chose the n features which were ranked the highest in their support, and evaluated the KNN classifier trained with those features only.
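The support computation can be sketched as follows. This is an illustrative approximation rather than the authors' implementation: the function names, the toy data, the fixed random seed and the small number of subsets (200 rather than 1e6) are assumptions, and accuracy is estimated here by leave-one-out 1-NN classification on the selected features.

```python
import math
import random

def loo_accuracy(dataset, features):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the given feature indices."""
    def project(vec):
        return [vec[j] for j in features]
    correct = 0
    for i, (vec, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        nearest = min(rest, key=lambda s: math.dist(project(vec), project(s[0])))
        correct += nearest[1] == label
    return correct / len(dataset)

def feature_support(dataset, n_features, n_subsets=200, seed=0):
    """support(f): mean accuracy over all classifiers whose random subset contains f."""
    rng = random.Random(seed)
    totals = [0.0] * n_features
    counts = [0] * n_features  # M, the multiplicity of each feature
    for _ in range(n_subsets):
        size = rng.randint(1, n_features)
        subset = rng.sample(range(n_features), size)
        acc = loo_accuracy(dataset, subset)
        for f in subset:
            totals[f] += acc
            counts[f] += 1
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]

# Hypothetical data: feature 0 separates the classes, feature 1 is large-amplitude noise.
toy = [
    ([0.0, 5.0], "A"), ([0.1, -4.0], "A"), ([0.2, 3.0], "A"),
    ([1.0, 4.5], "B"), ([1.1, -3.5], "B"), ([1.2, 2.5], "B"),
]
support = feature_support(toy, n_features=2)
```

Ranking features by `support` and retraining on the top n features gives the accuracy-versus-subset-size comparison described above.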

For classification using Random Forest (RF), we used the Breiman random forest algorithm (Breiman, 2001), according to which a large number (n = 1e5) of decision trees were built, where each tree used a varying number of features (between 1 and 25). Bifurcations were chosen according to modified Gini gain. For each feature, we averaged the decrease in Gini (mean decrease Gini, MDG; Han et al., 2016) over the ensemble of decision trees. The selected features were then evaluated using a regularized (pruned) decision tree with a maximal depth of 4, using k-fold cross-validation with k = 10, with the constraint of a minimum categorization of 3 animals per group. The decision tree was constructed using the CART decision tree construction algorithm (Breiman et al., 1984) (Figure 3—figure supplement 2E–H).
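To make the splitting criterion concrete, the sketch below computes the decrease in Gini impurity for a single threshold split on one feature. It uses the standard Gini impurity rather than the modified Gini gain mentioned above, whose exact form is not specified here; the function names, labels and values are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_gain(values, labels, threshold):
    """Decrease in Gini impurity obtained by splitting one feature at threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted

# Hypothetical fold-induction values of a single feature (not data from the study).
labels = ["reward", "reward", "aversive", "aversive"]
informative = [3.1, 2.8, 1.0, 1.2]     # separates the two classes at 2.0
uninformative = [3.1, 1.0, 2.8, 1.2]   # mixes the classes on both sides

gain_good = gini_gain(informative, labels, threshold=2.0)
gain_poor = gini_gain(uninformative, labels, threshold=2.0)
```

Averaging such decreases for a feature over every split that uses it, across all trees in the ensemble, yields its mean decrease in Gini.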

Decision tree

To provide an example of a descriptive classifier, we created a decision tree using the CART algorithm with Information Gain (Ben-David and Shalev-Shwartz, 2014). No constraints were applied while building this tree.

Randomization

Considering the limited size of our dataset, we wanted to ensure that the classifier was not overfitted to our training set S. For this purpose, we produced a large number N (N = 1e5) of permuted versions of our training set (S1, ..., SN), and created KNN or decision tree classifiers in the same way as for the original data. The permutation was performed by shuffling the association of individual mice with experiences. For each such permuted training set we trained a classifier and evaluated its classification accuracy (leave-one-out, see previous description). We calculated the empirical p value (p<1e−5 for both conditions) for the classification accuracy on our original training set in the following manner:

$$p_{\mathrm{val}} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\left[\mathrm{acc}(S_i) > \mathrm{acc}(S)\right]$$
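The randomization test can be sketched as follows, assuming a 1-NN classifier with leave-one-out evaluation. The function names, the toy data, the fixed seed and the reduced number of permutations (1000 rather than 1e5) are assumptions for illustration. The strict inequality follows the formula above.

```python
import math
import random

def loo_accuracy(dataset):
    """Leave-one-out accuracy of a 1-NN classifier over Euclidean distance."""
    correct = 0
    for i, (vec, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        nearest = min(rest, key=lambda s: math.dist(vec, s[0]))
        correct += nearest[1] == label
    return correct / len(dataset)

def permutation_p_value(dataset, n_permutations=1000, seed=0):
    """Fraction of label-shuffled datasets whose accuracy exceeds the observed accuracy."""
    rng = random.Random(seed)
    observed = loo_accuracy(dataset)
    vectors = [vec for vec, _ in dataset]
    labels = [label for _, label in dataset]
    exceed = 0
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # break the mouse-to-experience association
        if loo_accuracy(list(zip(vectors, shuffled))) > observed:
            exceed += 1
    return observed, exceed / n_permutations

# Hypothetical 2-feature vectors (not data from the study).
toy = [
    ([3.0, 1.0], "cocaine"), ([3.2, 0.9], "cocaine"), ([2.9, 1.1], "cocaine"),
    ([1.0, 3.0], "LiCl"), ([0.9, 3.1], "LiCl"), ([1.1, 2.8], "LiCl"),
]
observed, p_value = permutation_p_value(toy)
```

Note that the resolution of such an estimate is 1/N, so an estimate of zero supports only the bound p < 1/N.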

Data and code availability

The data sets generated during the current study, as well as the code used for analysis have all been uploaded as supplementary material (supplementary file 14, source code 111).

References

  4. 4
    Understanding Machine Learning: From Theory to Algorithms
    1. S Ben-David
    2. S Shalev-Shwartz
    (2014)
    Understanding Machine Learning: From Theory to Algorithms.
  6. 6
    Classification and Regression Trees (The Wadsworth Statistics/Probability Series)
    1. L Breiman
    2. JH Friedman
    3. RA Olshen
    4. CJ Stone
    (1984)
    Classification and Regression Trees (The Wadsworth Statistics/Probability Series), 19.
  16. 16
    Variable selection using mean decrease accuracy and mean decrease gini based on random forest
    1. H Han
    2. X Guo
    3. H Yu
    (2016)
    7th IEEE Int. Conf. Softw.Eng. Serv. Sci. pp. 219–224.
  35. 35
    Neural correlates of successful encoding identified using functional magnetic resonance imaging
    1. PJ Reber
    2. RM Siwiec
    3. DR Gitelman
    4. TB Parrish
    5. MM Mesulam
    6. KA Paller
    (2002)
    Journal of Neuroscience 22:9541–9548.
  40. 40
    A circumplex model of affect
    1. JA Russell
    (1980)
    Journal of Personality and Social Psychology 39:1161–1178.
    https://doi.org/10.1037/h0077714
  46. 46
    Comprehensive analysis of transcription dynamics from brain samples following behavioral experience
    1. H Turm
    2. D Mukherjee
    3. D Haritan
    4. M Tahor
    5. A Citri
    (2014)
    Journal of Visualized Experiments, 10.3791/51642, 25225819.

Decision letter

  1. Sacha B Nelson
    Reviewing Editor; Brandeis University, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Salient Experiences are Represented by Unique Transcriptional Signatures in the Brain" for consideration by eLife. Your article has been favorably evaluated by Aviv Regev (Senior Editor) and three reviewers, one of whom, Sacha Nelson, is a member of our Board of Reviewing Editors. The following individual involved in review of your submission has agreed to reveal their identity: Pavel Osten (Reviewer #2).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

The authors use real-time PCR from dissected brain regions to assess the immediate early gene (IEG) transcriptional responses to a variety of rewarding and aversive experiences including drugs of abuse, feeding, foot shock and gastric distress. Results show specific transcriptional signatures for each experience, enabling decoding of the experience. Moreover, transcriptional codes appeared to primarily reflect the valence of the event, with experiences of neutral or negative valence showing opposing patterns of transcriptional activity. This is an interesting and novel approach akin to efforts to decode experience from brain imaging and neural recordings.

Essential revisions:

1) Increasing the rigor of Feature Selection: There is a fundamental issue with the number of features considered in the analyses and how they were selected from what was measured. Initially, 152 IEGs are measured in 7 brain areas (=1064 features). Then, the number of IEGs is reduced to 78 (>1.25-fold induction). From these, 5 are selected (Arc, Egr2, Egr4, Fos and Fosb), but this step is not well described or justified. Potentially, some of the reduction could be attributed to a priori assumptions based on the literature, but these should be orthogonal to the tested hypotheses (see below) and need to be justified.

In addition, some analyses only consider a smaller number of brain areas, without clear justification. All of these choices can be extremely impactful and may strongly affect the results and ultimately the conclusions drawn from the data. Most importantly, criteria for these choices need to be orthogonal to the tested hypothesis (they cannot depend on differences in IEGs between experiences). If feature selection is non-independent, subsequent analyses will be biased and produce invalid results. The authors need to describe what criteria were used for feature selection (regions and IEGs) and whether these criteria were orthogonal to comparisons among individual experiences. For example, the formal feature selection procedure conducted for results reported in Figure 3 is non-independent, resulting in biased (invalid) classification accuracies. The problem is that the same data were used to select features (estimate "support") and also to evaluate classifier performance, which is circular. Feature selection must be based on nested cross-validation within the training data only. For instance, the algorithm should start with all 1064 features (or 78 x 7 = 546 features) and use nested cross-validation to select a feature set that is then used to determine classifier accuracy in the left-out test data.

2) Controlling for multiple comparisons – this is required for all tests.

3) Statistical testing is needed for many results presented in the manuscript. No statistical tests are reported for any of the results summarized in the sentence "The representation of rewarding experiences are characterized by robust transcriptional induction in the LCtx, NAc, DS, and VTA, while the representations of aversive experiences are dominated by transcriptional induction in the Amy". There are also no tests for whether transcriptional signatures of different experiences with the same or different valence are positively or negatively correlated. All descriptive statements should be backed by appropriate statistical tests within the manuscript.

4) The authors contrast the idea of a "transcriptional code" with the idea of IEG expression as simply "molecular markers for labelling neuronal populations that undergo plastic changes." The issue of the degree to which differences in which genes are transcribed vs. where in the brain they are transcribed is not satisfyingly analyzed. Encoding that matches different experiences to different transcripts would be a transcriptional code in the sense that many might assume from use of the term. On the other hand, encoding that matches different experiences to different brain regions would be quite akin to the "molecular marker" model the authors wish to reject. In between these two extremes, and probably closer to the data is the view that different experiences activate different brain regions, but that different brain regions also have different preferred mixtures of IEGs to activate. Gene-structure pairs are treated as features, but the relative degree to which this is a "spatial" code across brain regions vs. a genetic code across genes is unclear. The authors should attempt to separate these two contributing factors to more precisely specify what kind of "neural code embedded in transcription" they are talking about. It looks from Figure 2B, for example, like there is a strong "shape similarity" across experiences within a region. This would seem to imply that relative activation of the 4 genes tested is more a function of the region than of the experience (e.g. LH and VTA have relatively little activation of Arc). On the other hand, overall magnitude is more related to the interaction between the experience and the structure (cocaine for VTA and DS; foot shock for hipp).

5) The current findings suggest that transcriptional signatures for individual experiences are primarily driven by salience and valence. However, this could simply be a consequence of the (reward-related) brain regions considered here. It is possible that transcriptional signatures in other brain areas are not related to valence, and this should be discussed.

6) Title: it is convention for eLife papers to include some reference to the preparation used. This could be achieved by including the word "rodent" or at least "mammalian" before the word "brain" in the title.

https://doi.org/10.7554/eLife.31220.036

Author response

Essential revisions:

1) Increasing the rigor of Feature Selection: There is a fundamental issue with the number of features considered in the analyses and how they were selected from what was measured. Initially, 152 IEGs are measured in 7 brain areas (=1064 features). Then, the number of IEGs is reduced to 78 (>1.25-fold induction). From these, 5 are selected (Arc, Egr2, Egr4, Fos and Fosb), but this step is not well described or justified. Potentially, some of the reduction could be attributed to a priori assumptions based on the literature, but these should be orthogonal to the tested hypotheses (see below) and need to be justified.

In addition, some analyses only consider a smaller number of brain areas, without clear justification. All of these choices can be extremely impactful and may strongly affect the results and ultimately the conclusions drawn from the data. Most importantly, criteria for these choices need to be orthogonal to the tested hypothesis (they cannot depend on differences in IEGs between experiences). If feature selection is non-independent, subsequent analyses will be biased and produce invalid results. The authors need to describe what criteria were used for feature selection (regions and IEGs) and whether these criteria were orthogonal to comparisons among individual experiences. For example, the formal feature selection procedure conducted for results reported in Figure 3 is non-independent, resulting in biased (invalid) classification accuracies. The problem is that the same data were used to select features (estimate "support") and also to evaluate classifier performance, which is circular. Feature selection must be based on nested cross-validation within the training data only. For instance, the algorithm should start with all 1064 features (or 78 x 7 = 546 features) and use nested cross-validation to select a feature set that is then used to determine classifier accuracy in the left-out test data.

We thank the reviewers for these comments, which are important and indeed require explicit clarification. Three separate issues arise, which we address in turn: (1) The basis for selection of the 5 genes used for analyses throughout the paper. (2) The basis for selection of brain structures. (3) Feature selection vs. classifier performance in Figure 3, ensuring that there was no circular logic.

1) Clarification for the selection of 5 genes:

We have now clarified and elaborated on the process of marker gene selection in the text (see below). In order to assist this clarification, we have also reordered the figures, and now include the analysis of LiCl in Figure 1D-F, and add a comparison of genes induced by cocaine and LiCl, which assists in clarifying the process of feature selection. Furthermore, the text describing the process of selection of 5 genes for analyses is elaborated both in the Results as well as in the Materials and methods sections. Importantly, the criteria for selection of the 5 marker genes were orthogonal to, and independent of, the tested hypothesis, supporting unbiased analysis.

In the body of the Results section, we define the process as follows: “We analyzed the transcription induced at 0, 1, 2, 4 hours following each of these cocaine experiences across 6 brain structures (LCtx, NAc, DS, Amy, LH, and Hipp; Figure 1—figure supplement 1).[…] The 5 genes with combined highest ranking in frequency of appearance and lowest variance in expression were selected for further analysis (Arc, Egr2, Egr4, Fos and Fosb; Figure 1G).”

Within the Materials and methods section, we further elaborate on the process of marker selection, as follows: “The strategy for marker selection consisted of three steps.[…] Thus, criteria for marker selection were orthogonal to the tested hypothesis, supporting unbiased analysis.”

2) Selection of 5 brain structures:

We chose to focus on 5 brain structures for analysis (LCtx, NAc, DS, Amy and LH). The primary reason for utilizing these structures for analysis was that these were the structures for which we obtained samples across all experiences. Furthermore, hippocampal samples demonstrated very little transcriptional induction, and samples from the VTA were the most difficult to reliably obtain, due to the small size and irregular shape of this structure. In the original manuscript we performed a classification of cocaine experiences based on 3 structures (LCtx, NAc and DS; originally Figure 1E). This analysis has been removed from the revised version of the manuscript, and replaced with a decoding of the recent experience of mice following cocaine and LiCl experiences, using the data from all 25 features (5 structures * 5 genes; Figure 1H).

3) Feature Selection vs. classifier performance:

Regarding the feature selection described in Figure 3 – ‘support’ was calculated with the primary objective of identifying the features contributing most to the classification, but the classifier does not depend on feature selection. We have now clarified this in the revised manuscript, in which we present the decoding based on all 25 features (with no selection) in the main figure (Figure 3A). The analysis of support and the identification of the most indicative features have been moved to the supplementary information (Figure 3—figure supplement 2). In addition, we have utilized an independent approach to feature selection (Breiman Random Forest), which provided a largely overlapping set of the most informative features (both approaches identify the same top 12 features, and overlap in 6 of the top 8 features). It is important to stress that feature selection is not central to the thesis of this paper. The KNN classifier does not require feature selection to enable powerful classification of the experiences, and it is our conviction, as we have written in the Discussion, that the marker genes we have utilized could largely be substituted by a different set, without a significant loss in classification efficiency: “Furthermore, it is likely that markers we utilize in our study could be substituted by other marker genes, providing similar classification accuracy.”

This text now reads: “Finally, we tested our capacity to decode the recent experience of mice on the full complement of experiences studied in this project. […] An independent approach for feature selection (Breiman Random Forest25) identified a largely overlapping set of features, with the top 10 features supporting a classification accuracy of 94.4% (p<1e-5; Figure 3—figure supplement 2E-H).”
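
For illustration only, decoding without any feature selection can be sketched as below. This is a minimal sketch and not the analysis code used in the study (which was written in MATLAB): the synthetic data, the choice of k=3, the Euclidean distance, the leave-one-out scheme and the label-shuffling null are all assumptions made for the example, and the function names are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(x, y, k=3):
    """Hold out each sample once and classify it from all the others."""
    hits = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_x, train_y, x[i], k) == y[i]
    return hits / len(x)


def shuffled_label_p_value(x, y, observed, n_shuffles=200, seed=1):
    """Fraction of label shuffles that decode at least as well as observed."""
    rng = random.Random(seed)
    at_least = 0
    for _ in range(n_shuffles):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if leave_one_out_accuracy(x, shuffled) >= observed:
            at_least += 1
    return (at_least + 1) / (n_shuffles + 1)


def make_synthetic_mice(n_per_class=8, n_features=25, seed=0):
    """Made-up data: 25 features per mouse, one shifted mean per experience."""
    rng = random.Random(seed)
    centers = {"saline": 0.0, "cocaine": 3.0, "LiCl": -3.0}
    x, y = [], []
    for label, center in centers.items():
        for _ in range(n_per_class):
            x.append([rng.gauss(center, 1.0) for _ in range(n_features)])
            y.append(label)
    return x, y


x, y = make_synthetic_mice()
accuracy = leave_one_out_accuracy(x, y, k=3)
p_value = shuffled_label_p_value(x, y, accuracy)
print(f"leave-one-out accuracy: {accuracy:.2f}, shuffle p-value: {p_value:.4f}")
```

Because every feature is used for every held-out mouse, no step of this procedure can leak information from the test sample into a selection step.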

Finally, the reviewers propose that we perform feature selection based on nested cross-validation within the training data using 1064 or 546 features. This would be wonderful to implement, but is not possible with our data, as we have not acquired the full dataset of 152 genes for all the experiences studied. We acquired the 78-gene dataset for a subset of the mice that underwent cocaine, LiCl and saline experiences, while the majority of mice and experiences were analyzed with the 5-gene set defined based on the results of the analysis of LiCl and cocaine experiences. Therefore, the process we developed for selecting genes for analysis was the least biased process we could perform. This process, of selecting features based on high fold induction and low variance, is orthogonal to the hypothesis that recent experiences can be decoded from a transcriptional signature and enables unbiased analysis. Furthermore, the results of our study do not depend, to any significant extent, on feature selection.
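
To make the reviewers' proposal concrete, the following is a hedged sketch of nested cross-validation in which features are ranked using training data only. It was not run on the study's data: the 30 synthetic features, the variance-ratio ranking score, the candidate set sizes and all function names are assumptions chosen for the example.

```python
import math
import random
from collections import Counter
from statistics import mean, pvariance


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def rank_features(x, y):
    """Rank features by between-class over within-class variance."""
    labels = sorted(set(y))
    scores = []
    for j in range(len(x[0])):
        groups = [[row[j] for row, lab in zip(x, y) if lab == label]
                  for label in labels]
        between = pvariance([mean(g) for g in groups])
        within = mean(pvariance(g) for g in groups) + 1e-9
        scores.append(between / within)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def select_and_predict(train_x, train_y, query, m, k=3):
    """Pick the top m features on the training set, then classify the query."""
    keep = rank_features(train_x, train_y)[:m]
    reduced = [[row[j] for j in keep] for row in train_x]
    return knn_predict(reduced, train_y, [query[j] for j in keep], k)


def loo_accuracy(x, y, m, k=3):
    """Leave-one-out accuracy with selection repeated inside every fold."""
    hits = 0
    for i in range(len(x)):
        hits += select_and_predict(x[:i] + x[i + 1:], y[:i] + y[i + 1:],
                                   x[i], m, k) == y[i]
    return hits / len(x)


def nested_loo_accuracy(x, y, sizes=(2, 5, 10), k=3):
    """Outer loop scores the classifier; inner loop picks the feature count."""
    hits = 0
    for i in range(len(x)):
        train_x, train_y = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        # The held-out sample i is never seen while choosing m or the features.
        best_m = max(sizes, key=lambda m: loo_accuracy(train_x, train_y, m, k))
        hits += select_and_predict(train_x, train_y, x[i], best_m, k) == y[i]
    return hits / len(x)


# Made-up data: 30 features, of which only the first 5 carry class information.
rng = random.Random(0)
x, y = [], []
for label, shift in {"saline": 0.0, "cocaine": 4.0, "LiCl": -4.0}.items():
    for _ in range(6):
        x.append([rng.gauss(shift if j < 5 else 0.0, 1.0) for j in range(30)])
        y.append(label)

nested_accuracy = nested_loo_accuracy(x, y)
print(f"nested leave-one-out accuracy: {nested_accuracy:.2f}")
```

The point of the design is that both the feature ranking and the choice of how many features to keep are repeated inside each training fold, so the accuracy reported by the outer loop is not inflated by circular selection.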

2) Controlling for multiple comparisons – this is required for all tests.

Indeed, we have controlled for multiple comparisons, as defined in the Materials and methods section: “Data analysis. All data are presented as mean ± standard error (s.e.m.). […] Codes were written in MATLAB R2015b (MathWorks, Natick, MA) and confusion matrices and randomization plots were created in Python using the Matplotlib library (http://matplotlib.org).”

3) Statistical testing is needed for many results presented in the manuscript. No statistical tests are reported for any of the results summarized in the sentence "The representation of rewarding experiences are characterized by robust transcriptional induction in the LCtx, NAc, DS, and VTA, while the representations of aversive experiences are dominated by transcriptional induction in the Amy". There are also no tests for whether transcriptional signatures of different experiences with the same or different valence are positively or negatively correlated. All descriptive statements should be backed by appropriate statistical tests within the manuscript.

We have performed a thorough statistical analysis of our data, and include a supplementary table describing this analysis (Supplementary file 3 – Statistics), to which we refer throughout the manuscript. We have also included a much more detailed description of statistical significance throughout the Results section.

4) The authors contrast the idea of a "transcriptional code" with the idea of IEG expression as simply "molecular markers for labelling neuronal populations that undergo plastic changes." The issue of the degree to which differences in which genes are transcribed vs. where in the brain they are transcribed is not satisfyingly analyzed. Encoding that matches different experiences to different transcripts would be a transcriptional code in the sense that many might assume from use of the term. On the other hand, encoding that matches different experiences to different brain regions would be quite akin to the "molecular marker" model the authors wish to reject. In between these two extremes, and probably closer to the data is the view that different experiences activate different brain regions, but that different brain regions also have different preferred mixtures of IEGs to activate. Gene-structure pairs are treated as features, but the relative degree to which this is a "spatial" code across brain regions vs. a genetic code across genes is unclear. The authors should attempt to separate these two contributing factors to more precisely specify what kind of "neural code embedded in transcription" they are talking about. It looks from Figure 2B, for example, like there is a strong "shape similarity" across experiences within a region. This would seem to imply that relative activation of the 4 genes tested is more a function of the region than of the experience (e.g. LH and VTA have relatively little activation of Arc). On the other hand, overall magnitude is more related to the interaction between the experience and the structure (cocaine for VTA and DS; foot shock for hipp).

We thank the reviewers for this insightful comment. We have performed additional analysis, addressing the decoding capacity when considering each structure independently (5 genes in each individual structure), genes alone (losing the reference to structures by averaging across structures) or individual genes (considering each gene separately). This analysis reveals that while individual genes (specifically Egr2 and Fos) are strong classifiers on their own (correctly classifying 67-70% of the mice), the predictive power improves substantially when using multiple features, comprising both genes and structures.

As defined in the text: “To further test this hypothesis, we ran a number of permutations. […] Taken together, while we find that measurement of the expression of individual genes, such as Fos and Egr2, across the 5 brain structures can support classification (67%, 70% respectively), the prediction is significantly improved by the measurement of multiple features (Figure 3A).”
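
The different feature sets compared in this analysis can be sketched as views of one gene-by-structure table per mouse. This is an illustrative sketch only: the nested-dictionary layout, the function names and the fold-induction values are made up for the example, while the gene and structure names are those used in the manuscript.

```python
from statistics import mean

GENES = ["Arc", "Egr2", "Egr4", "Fos", "Fosb"]
STRUCTURES = ["LCtx", "NAc", "DS", "Amy", "LH"]


def full_vector(induction):
    """All 25 gene-structure features, ordered structure by structure."""
    return [induction[s][g] for s in STRUCTURES for g in GENES]


def structure_view(induction, structure):
    """The 5 genes measured within a single structure."""
    return [induction[structure][g] for g in GENES]


def genes_only_view(induction):
    """One value per gene, averaged across structures (location is lost)."""
    return [mean(induction[s][g] for s in STRUCTURES) for g in GENES]


def single_gene_view(induction, gene):
    """One gene followed across the 5 structures."""
    return [induction[s][gene] for s in STRUCTURES]


# Made-up fold-induction values for one hypothetical mouse.
example = {
    s: {g: 1.0 + 0.5 * i + 0.1 * j for j, g in enumerate(GENES)}
    for i, s in enumerate(STRUCTURES)
}

print(len(full_vector(example)))           # number of gene-structure features
print(structure_view(example, "NAc"))      # 5 genes in the NAc
print(genes_only_view(example))            # 5 genes, averaged over structures
print(single_gene_view(example, "Fos"))    # Fos across the 5 structures
```

Each view can be passed to the same classifier, so that differences in decoding accuracy reflect the information carried by genes, by structures, or by their combination.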

5) The current findings suggest that transcriptional signatures for individual experiences are primarily driven by salience and valence. However, this could simply be a consequence of the (reward-related) brain regions considered here. It is possible that transcriptional signatures in other brain areas are not related to valence, and this should be discussed.

We accept this comment, and have included a sentence to this effect in the Discussion. This sentence reads: “It should be noted that in this study we focused our analysis on structures associated with limbic and mesolimbic system. It is highly likely that transcriptional signatures across other brain areas (as well as for other experiences) would be related to different attributes of the experience, besides affect or valence.”

6) Title: it is convention for eLife papers to include some reference to the preparation used. This could be achieved by including the word "rodent" or at least "mammalian" before the word "brain" in the title.

We have included the word “mouse” in the title.

https://doi.org/10.7554/eLife.31220.037

Article and author information

Author details

  1. Diptendu Mukherjee

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
  2. Bogna Marta Ignatowska-Jankowska

    The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-8427-1128
  3. Eyal Itskovits

    1. Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    2. School of Computer Science and Engineering, The Hebrew University, Jerusalem, Israel
    Contribution
    Conceptualization, Software, Formal analysis, Methodology
    Competing interests
    No competing interests declared
  4. Ben Jerry Gonzales

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  5. Hagit Turm

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Validation, Investigation, Methodology, Project administration
    Competing interests
    No competing interests declared
  6. Liz Izakson

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Validation, Investigation, Methodology
    Competing interests
    No competing interests declared
  7. Doron Haritan

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Validation, Investigation, Methodology
    Competing interests
    No competing interests declared
  8. Noa Bleistein

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Validation, Investigation, Methodology
    Competing interests
    No competing interests declared
  9. Chen Cohen

    Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Validation, Investigation, Methodology
    Competing interests
    No competing interests declared
  10. Ido Amit

    Department of Immunology, Weizmann Institute of Science, Rehovot, Israel
    Contribution
    Resources, Methodology
    Competing interests
    No competing interests declared
  11. Tal Shay

    Department of Life Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel
    Contribution
    Resources, Methodology
    Competing interests
    No competing interests declared
  12. Brad Grueter

    Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, United States
    Contribution
    Resources, Investigation, Methodology
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4224-3866
  13. Alon Zaslaver

    Department of Genetics, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    Contribution
    Software, Formal analysis, Supervision
    Competing interests
    No competing interests declared
  14. Ami Citri

    1. Department of Biological Chemistry, Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel
    2. The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel
    3. Child and Brain Development Program, Canadian Institute for Advanced Research, Toronto, Canada
    Contribution
    Conceptualization, Resources, Formal analysis, Supervision, Funding acquisition, Visualization, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    ami.citri@mail.huji.ac.il
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-9914-0278

Funding

Shimon Peres Postdoctoral Award (Postdoctoral stipend)

  • Bogna Marta Ignatowska-Jankowska

ELSC Postdoctoral Award (Postdoctoral stipend)

  • Bogna Marta Ignatowska-Jankowska

Lady Davis Fellowship Trust, Hebrew University of Jerusalem (Postdoctoral stipend)

  • Bogna Marta Ignatowska-Jankowska

Israel Science Foundation (Personal Grant 393/12 & I-CORE 1796/12)

  • Ami Citri

German-Israeli Foundation for Scientific Research and Development (Young Investigator Award 2299-2291.1./2011)

  • Ami Citri

Brain and Behavior Research Foundation (Young Investigator Award #18795)

  • Ami Citri

Canadian Institute for Advanced Research (Research Support)

  • Ami Citri

Binational United-States Israel Research Foundation (Research Grant #2011266)

  • Ami Citri

Milton Rosenbaum Research Foundation (Research Grant)

  • Ami Citri

National Institute for Psychobiology in Israel, Hebrew University of Jerusalem (Research Grant 109-15-16)

  • Ami Citri

Israel Science Foundation (2341/1)

  • Ami Citri

H2020 Marie Skłodowska-Curie Actions (PCIG13-GA-2013–61820)

  • Ami Citri

Stewart Resnick Foundation

  • Ami Citri

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

This work was funded by grants to AC from the Israel Science Foundation (393/12 and 2341/15), ISF Center of Research Excellence on ‘Chromatin and RNA in Gene Regulation’ (1796/12), EU Marie Curie (PCIG13-GA-2013–618201), the Brain and Behavior Foundation (NARSAD 18795), the German-Israel Foundation (2299–2291.1/2011), the Binational Israel-USA Foundation (2011266), the Milton Rosenbaum Endowment Fund for Research in Psychiatry, the Canadian Institute for Advanced Research, and contributions from Mr. Jaime Cohen (Mexico City) and the Stewart Resnick Foundation (Los Angeles). BIJ was funded by the Shimon Peres Fellowship from the Edmond and Lily Safra Center for Brain Sciences and the Lady Davis Fellowship Trust. Rob Malenka’s generosity in enabling preliminary studies to be performed in his laboratory is highly appreciated. We thank Hermona Soreq, Inbal Goshen, Mickey London, Sagiv Shifman, Zhiping Pang and members of the Citri lab for constructive criticism of the manuscript.

Ethics

Animal experimentation: This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. All of the animals were handled according to approved institutional animal care and use committee (IACUC) protocols (#NS-13-13895-3; NS-15-14668-3; NS-14-14088-3; NS-15-14312-3; NS-15-14348-3) of the Hebrew University of Jerusalem. The protocol was approved by the Committee on the Ethics of Animal Experiments of the Hebrew University. Every effort was made to minimize suffering.

Reviewing Editor

  1. Sacha B Nelson, Reviewing Editor, Brandeis University, United States

Publication history

  1. Received: September 4, 2017
  2. Accepted: February 5, 2018
  3. Accepted Manuscript published: February 7, 2018 (version 1)
  4. Accepted Manuscript updated: February 22, 2018 (version 2)
  5. Version of Record published: March 21, 2018 (version 3)

Copyright

© 2018, Mukherjee et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 3,844
    Page views
  • 706
    Downloads
  • 2
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

