Figures and data

Experimental setup:
(A) We recorded single-unit activity from 21 participants, who were presented with 152 sentences comprising various contrasts along different linguistic dimensions (subject type, grammatical number, etc.). (B) Sentence stimuli were presented to participants both visually, in rapid serial visual presentation (RSVP), and auditorily, via computer speakers. The experiment was composed of six blocks, three per modality. SOA: Stimulus Onset Asynchrony; ISI: Inter-Stimulus Interval. (C) As part of their clinical evaluation, the patients were implanted with Behnke-Fried depth electrodes [31]: each depth electrode carries macro recording contacts along its shaft, and 8 microwires protrude from its end, from which single-cell activity can be extracted. (D) An example brain scan from one of the patients, showing the localization of one of the depth electrodes. (E) A summary of the total number of recorded units (left), microwires (middle), and macro recording sites (right). (F) The recording sites, aggregated across the brains of all patients, cover the main regions of the language network, including ventral temporal cortex, primary and secondary auditory regions, and inferior frontal regions. Blue and red points correspond to macro and micro contacts, respectively.

From single cells to large neuronal populations:
(A) An illustration of the spiking activity of a neuron from the left fusiform gyrus (FSG) of one of the patients during the processing of all sentence stimuli. (B) The corresponding broadband-gamma activity (BGA) extracted from the same microwire as the neuron. (C) BGA extracted from the nearest macro contact. (D-F) Same for an example single neuron and the corresponding micro- and macro-BGA from the superior temporal gyrus (STG) of another patient. In all panels A-F, trials are sorted by the length of the sentence (2-5 words). Sentence onset is marked by a vertical dashed line. The estimated time-resolved firing rate and the median BGA are shown at the bottom of each panel. Within each panel, the top and bottom sub-panels show activity during the visual and auditory blocks, with red and blue backgrounds, respectively. In the fusiform gyrus, at all three resolutions (single-cell, micro- and macro-population activity), neural responses are highly selective to visual compared to auditory stimuli. In contrast, neural activity in the STG is highly selective to auditory compared to visual stimuli, also at the single-cell level. (G) Mean Pearson correlation between firing rate and high-gamma activity for FSG (top) and STG (bottom) units during visual and auditory blocks, respectively. Blue bars correspond to the correlation between spiking activity and high-gamma activity extracted from the same microwire. Orange bars correspond to the correlation between spiking activity and high-gamma activity extracted from the nearest macro contact. (H) Left: Brain-Score distributions for all recorded single units from the fusiform gyrus (top) and STG (bottom), computed separately for the visual (red) and auditory (blue) blocks. Middle: same for microwires. Right: same for macro contacts.
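The correlation in panel G can be illustrated with a minimal sketch. It assumes that spike times are first binned into a firing-rate trace and that the BGA trace is sampled on the same bins; the function names, the 50 ms bin width, and the synthetic data are illustrative assumptions, not the procedure used for the figure.

```python
import numpy as np

def firing_rate(spike_times, t_start, t_stop, bin_width=0.05):
    """Bin spike times (in seconds) into a firing-rate trace (spikes/s).

    The bin width is an assumed value chosen for illustration only.
    """
    n_bins = int(round((t_stop - t_start) / bin_width))
    edges = t_start + bin_width * np.arange(n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts / bin_width

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length traces."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

# Synthetic example: a BGA trace that partly follows the firing rate.
rng = np.random.default_rng(0)
spikes = np.sort(rng.uniform(0.0, 10.0, size=400))
rate = firing_rate(spikes, 0.0, 10.0)
bga = 0.1 * rate + rng.normal(scale=1.0, size=rate.size)
r_micro = pearson_r(rate, bga)
```

Averaging such coefficients across units, separately for the same-microwire and nearest-macro-contact signals, would give bars comparable to those described in panel G.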

Selective encoding of orthographic features in the fusiform cortex:
To study the information encoded in the firing patterns of single neurons, we fitted firing rates to diagnostic encoding models, which contained features from only the most dominant feature group in the visual- or auditory-block model. (A-D) An in-depth analysis of an example single neuron from the fusiform gyrus. (A) The most dominant feature group in the visual-block model was orthography. Feature importance was measured as the difference in model performance with vs. without each feature group. Bars correspond to the mean significant feature importance within 0-600 ms (FDR corrected). (B) Given the results of the visual-block model, the diagnostic model contained features only from the orthography group: letter identities and word length. The letter features that achieved the highest importance scores were ‘k’, ‘v’, ‘w’, ‘y’ and ‘z’. Note that all these letters contain a V-shape feature in various orientations. (C) To study the effect of letter position on spiking activity, we trained a second diagnostic model, which contained features that couple letter identity (26 letters) and letter position (3 levels, depending on whether the letter appeared at the beginning, middle, or end of the word). The resulting weights of the model features are plotted for each of the three positions, with font size corresponding to the size of the weight. Red and blue represent positive and negative values, respectively. (D) An illustration of the spiking activity of this neuron, grouped based on whether the word contains (top panel) or does not contain (bottom) a letter with a V-shape feature. Word onset is marked with a vertical dashed line. The bottom panel shows the mean firing rate for each group. (E-H) Same analyses for another example neuron, which shows high selectivity to a single letter, with an increased firing rate in response to words that contain the letter ‘w’, but only if it appears in the middle or end position.
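The definition of feature importance in panel A (model performance with vs. without a feature group) can be sketched as follows. This is a simplified illustration under stated assumptions: ridge regression as the encoding model, held-out Pearson correlation as the performance measure, a single train/test split, and synthetic data. The time-resolved analysis and FDR correction mentioned in the caption are not reproduced, and all names are hypothetical.

```python
import numpy as np

def heldout_r(X_train, y_train, X_test, y_test, alpha=1.0):
    """Fit ridge regression on the training set; return the test-set correlation."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - mu_x
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(Xc.shape[1]),
                        Xc.T @ (y_train - mu_y))
    pred = (X_test - mu_x) @ w + mu_y
    return float(np.corrcoef(pred, y_test)[0, 1])

def group_importance(X, y, groups, n_train, alpha=1.0):
    """Importance of a group = performance of the full model minus
    performance of the model refitted without that group's columns."""
    full = heldout_r(X[:n_train], y[:n_train], X[n_train:], y[n_train:], alpha)
    importance = {}
    for name, cols in groups.items():
        keep = [j for j in range(X.shape[1]) if j not in set(cols)]
        reduced = heldout_r(X[:n_train][:, keep], y[:n_train],
                            X[n_train:][:, keep], y[n_train:], alpha)
        importance[name] = full - reduced
    return importance

# Synthetic firing rate driven only by the (hypothetical) orthography columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
groups = {"orthography": [0, 1, 2], "semantics": [3, 4, 5]}
y = X[:, :3] @ np.array([2.0, -1.0, 1.5]) + 0.5 * rng.normal(size=400)
importance = group_importance(X, y, groups, n_train=300)
```

In this toy case, removing the orthography columns collapses the held-out correlation, whereas removing the semantics columns leaves it essentially unchanged, which is the pattern a dominant feature group would produce.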

Neural Encoding of Word Length:
(A) Principal Component Analysis (PCA) of spiking activity from the 5 fusiform neurons with the highest Brain-Score. The neural response of each neuron was represented along 5 consecutive 100 ms time bins between 0.1 and 0.6 seconds after word onset. The population response to different words is projected onto the two main PCs. Words are colored by their length. (B) Raster plots from two example neurons from the FSG with significant feature importance of word length. Different raster panels correspond to different word lengths (y-label colors). The bottom panel shows the mean firing rate across trials, split by word length (same color code as the y-labels), which increases as a function of word length. (C) Same as the previous panel, but for a neuron from the FSG whose activity is not sensitive to word length.
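The projection in panel A can be sketched under the layout described in the caption: each word is represented by a vector of 5 neurons x 5 time bins = 25 values, and the word-by-feature matrix is projected onto its first two principal components. The sketch below uses PCA via singular value decomposition on synthetic responses; the data, the gain values, and the function name are assumptions made for illustration.

```python
import numpy as np

def pca_project(responses, n_components=2):
    """Project rows of `responses` (words x features) onto the leading PCs.

    Returns the projected coordinates and the fraction of variance
    explained by each retained component.
    """
    centered = responses - responses.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return centered @ vt[:n_components].T, explained[:n_components]

# Synthetic population response that scales with word length.
rng = np.random.default_rng(0)
n_words, n_neurons, n_bins = 100, 5, 5
word_length = rng.integers(2, 9, size=n_words)
gain = rng.uniform(0.5, 1.5, size=n_neurons * n_bins)
responses = (word_length[:, None] * gain[None, :]
             + rng.normal(size=(n_words, n_neurons * n_bins)))

projection, explained = pca_project(responses)
```

Coloring the rows of `projection` by `word_length` would give a scatter of the kind described in the caption; in this synthetic case the first component tracks word length by construction.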

Selective encoding of phonological information during auditory processing:
(A-D) An in-depth analysis of an example single neuron from the left Heschl's gyrus. (A) Feature-group importance for the five groups in the temporal receptive field (TRF) model. The feature group with the largest effect on spiking activity was that of phonological features. (B) Time-resolved feature importance for all phones. The phone with the largest effect on spiking activity was /SH/. (C) Phone-in-position feature importance, showing the effect of the phones /SH/ and /JH/ on spiking activity. (D) Raster plots for several example phones, illustrating the high selectivity of this cell to the phone /SH/. Vertical dashed black lines represent phoneme onset. The horizontal line marks the period with a significant difference among conditions (temporal cluster-based permutation test, p-value < 0.05). (E-H) Same analyses for another example cell, which shows high selectivity to a single vocalic diphthong.
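The temporal cluster-based permutation test mentioned in panel D can be illustrated with a simplified two-condition sketch. It assumes a Welch t-statistic at each time point, a fixed cluster-forming threshold, and the summed |t| of the largest cluster as the test statistic; for brevity, adjacent positive and negative effects are merged into a single cluster. The threshold, the number of permutations, and the synthetic data are assumptions and may differ from the settings used for the figure.

```python
import numpy as np

def t_values(a, b):
    """Welch t-statistic at each time point; a and b are (trials x time)."""
    va = a.var(axis=0, ddof=1) / a.shape[0]
    vb = b.var(axis=0, ddof=1) / b.shape[0]
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(va + vb)

def cluster_masses(t, threshold):
    """Sum |t| within each contiguous run of supra-threshold time points."""
    masses, current = [], 0.0
    for value in np.abs(t):
        if value > threshold:
            current += value
        elif current > 0:
            masses.append(current)
            current = 0.0
    if current > 0:
        masses.append(current)
    return masses

def cluster_permutation_p(a, b, threshold=2.0, n_perm=500, seed=0):
    """p-value of the largest observed cluster against a label-shuffled null."""
    rng = np.random.default_rng(seed)
    observed = max(cluster_masses(t_values(a, b), threshold), default=0.0)
    pooled, n_a = np.vstack([a, b]), a.shape[0]
    null = np.empty(n_perm)
    for i in range(n_perm):
        order = rng.permutation(pooled.shape[0])
        t_perm = t_values(pooled[order[:n_a]], pooled[order[n_a:]])
        null[i] = max(cluster_masses(t_perm, threshold), default=0.0)
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

# Synthetic firing rates: condition A responds in time bins 20-29 only.
rng = np.random.default_rng(1)
cond_a = rng.normal(size=(30, 60))
cond_a[:, 20:30] += 1.5
cond_b = rng.normal(size=(30, 60))
p_value = cluster_permutation_p(cond_a, cond_b)
```

Because the null distribution is built from the maximum cluster mass per permutation, the resulting p-value is corrected for multiple comparisons across time points.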

Modality-specific and Amodal Activity across the language network:
To quantify the modality specificity of all neural responses, for each neuron and for each recording electrode, we trained two encoding models, one for each modality: a visual-block model and an auditory-block model. The models contained features from various linguistic aspects. The visual-block model contained orthographic, lexical, semantic and syntactic features, as well as positional features, such as sentence onset or word position. The auditory-block model contained the same features, but orthographic features were replaced with phonological ones. We evaluated the visual- and auditory-block models by testing their performance in predicting unseen neural data, in a cross-validation (CV) procedure. We define the Brain-Score as the mean performance of the models across all CV splits. (A) Brain-Score differences between the auditory and visual blocks. Red and blue shades correspond to visual and auditory specificity, respectively. Green corresponds to amodal regions. (B) A scatter plot illustrating Brain-Score for auditory- vs. visual-block models. Each dot corresponds to a single cell (squares), a microwire (circles) or a macro contact (X marks). Colors correspond to selected regions of interest. Dots in the lower/upper triangle correspond to recording sites with a preference for auditory/visual stimuli. Dots on the diagonal correspond to amodal recording sites. (C) Single-cell activity of several amodal cells from the MTG and right IFG. (D) Spiking activity of an MTG amodal cell in response to two-word sentences, either questions (dashed lines) or declaratives (continuous lines).
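The Brain-Score definition above (mean model performance across CV splits) can be sketched as follows. The caption does not specify the regression method or the performance measure, so this sketch assumes ordinary least squares, contiguous folds, and the Pearson correlation between predicted and observed activity; all names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept (an assumed model choice)."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    w, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ w

def brain_score(X, y, n_splits=5):
    """Mean held-out correlation across contiguous cross-validation folds."""
    indices = np.arange(len(y))
    scores = []
    for test_idx in np.array_split(indices, n_splits):
        train_idx = np.setdiff1d(indices, test_idx)
        pred = fit_predict(X[train_idx], y[train_idx], X[test_idx])
        scores.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(scores))

# Synthetic unit that follows the visual-block features but not the auditory ones.
rng = np.random.default_rng(0)
X_visual = rng.normal(size=(300, 4))
X_auditory = rng.normal(size=(300, 4))
y_visual = X_visual @ np.array([1.0, -0.5, 0.8, 0.3]) + 0.5 * rng.normal(size=300)
y_auditory = rng.normal(size=300)

score_visual = brain_score(X_visual, y_visual)
score_auditory = brain_score(X_auditory, y_auditory)
modality_difference = score_auditory - score_visual  # negative: visual preference
```

A site with similar scores in both blocks would fall near the diagonal of the scatter plot in panel B, whereas the synthetic unit here would fall on the visual side.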

List of Main Contrasts in the Experiment.

Feature Groups
Feature types used in each of the feature groups in the encoding TRF models.

Auditory stimuli:
An example of the waveform and spectrogram of one of the auditory stimuli in the experiment. The segmentation into single phonemes and words is shown in the bottom two rows, based on the Montreal Forced Aligner [79]. Light blue and yellow markings on the spectrogram indicate pitch and intensity, respectively.

Modality Selectivity of fusiform neurons:
Spiking activity from 5 neurons from the fusiform gyrus. All neurons show a strong preference for the visual modality, remaining at baseline activity in response to auditory stimuli.

Modality Selectivity of STG neurons:
Spiking activity from 17 example neurons from the superior temporal gyrus. All neurons show a strong preference for the auditory modality, remaining at baseline activity in response to visual stimuli.

Modality Selectivity of STG and HSG neurons:
Spiking activity from 5 more example neurons from the superior temporal gyrus and 11 from Heschl's gyrus (HSG). All neurons show a strong preference for the auditory modality, remaining at baseline activity in response to visual stimuli.

Micro and Macro-BGA Encoding of Orthographic Information during Visual Processing:
(A) Micro broadband high-gamma activity (BGA) during the processing of words that contain forked letters (top panel) and those that do not (bottom). Micro BGA was extracted from the same microwire from which spiking activity was recorded in Figure 3A-D. Median BGA is shown at the bottom, separated for the two conditions. Time periods with significant separation (cluster-based permutation; p-value < 0.05) are marked with a green strip. (B) Same for macro BGA. (C+D) Micro and macro BGA for the contrast shown in Figure 3E-H. Significant separation is observed for micro BGA, and to a lesser extent for macro BGA.

Micro and Macro-BGA Encoding of Phonological Information during Auditory Processing:
(A) Micro broadband high-gamma activity (BGA) during the processing of words that contain hushing sibilants (top panels) and those that do not (bottom). Micro BGA was extracted from the same microwire from which spiking activity was recorded in Figure 5A-D. Median BGA is shown at the bottom, separated for the two conditions. Time periods with significant separation (cluster-based permutation; p-value < 0.05) are marked with a green strip. (B) Same for macro BGA. (C+D) Micro and macro BGA for the contrast shown in Figure 5E-H. Significant separation is observed for micro BGA but not for macro BGA.

Electrode Localization

Electrode Localization

Electrode Localization

Electrode Localization

Spike Sorting:
For each unit, the results of the spike sorting are summarized in: (1) the profile of the cluster (top row, including in log scale), (2) the inter-spike-interval distribution (middle-left panel), (3) the cumulative spike count (center), (4) the distribution of the spike amplitude (middle-right panel), and (5) amplitude of all spikes across the entire experiment (bottom row).

Neural Encoding of Word Length (micro BGA):
(A) Principal Component Analysis (PCA) of micro BGA from the 5 fusiform neurons with the highest Brain-Score. The neural response of each neuron was represented along 5 consecutive 100 ms time bins between 0.1 and 0.6 seconds after word onset. The population response to different words is projected onto the two main PCs (39.7% and 9.9% of the variance). Words are colored by their length.

Feature Importances for Amodal Neurons:
For each amodal unit in Figure 6, the corresponding scatter plot contrasts feature-group importances (FI) between the two blocks (visual and auditory). Each point in a scatter plot corresponds to a feature group (see legend for color code). Points on the diagonal represent FIs that are similar across the two blocks.