Replay of rule-learning related neural patterns in the prefrontal cortex during sleep
Adrien Peyrache1, Mehdi Khamassi1,2, Karim Benchenane1, Sidney I. Wiener1*, and Francesco P. Battaglia1,3
1 Laboratoire de Physiologie de la Perception et de l’Action, Collège de France, CNRS, 11 place Marcelin Berthelot, 75231 Paris CEDEX 05, France
2 Institut des Systèmes Intelligents et de Robotique, Université Pierre et Marie Curie – Paris 6, CNRS FRE 2507, 4 Place Jussieu, 75252 Paris Cedex, France
3 Center for Neuroscience, Swammerdam Institute for Life Sciences, Faculty of Science, Universiteit van Amsterdam, P.O. Box 94084, Kruislaan 320, 1090GB Amsterdam, The Netherlands
Slow-wave sleep (SWS) is important for memory consolidation. During sleep, neural patterns reflecting previously acquired information are replayed. One possibility is that such replay exchanges information between hippocampus and neocortex, supporting consolidation. We recorded neuron ensembles in the rat medial prefrontal cortex (mPFC) to study memory trace reactivation during SWS following learning and execution of cross-modal strategy shifts. In general, reactivation of learning-related patterns occurred in distinct, highly synchronized transient bouts, mostly simultaneous with hippocampal sharp wave/ripple complexes (SPWRs), when hippocampal ensemble reactivation and cortico-hippocampal interaction are enhanced. mPFC neural patterns appearing during response selection were replayed prominently, coincident with hippocampal SPWRs during sleep, following learning of a new rule. This was learning-dependent, because it was not observed before rule acquisition. Thus, learning, or the resulting stable reward, influenced which patterns were most strongly encoded, and subsequently reactivated, in the hippocampal/prefrontal network.
The acquisition of labile new memories can trigger processes spanning from molecular1 to system-wide levels, gradually transforming and stabilizing memory traces. The system consolidation theory views the interaction between hippocampus and neocortex as instrumental for this2-4. While the hippocampus is vital in the initial acquisition and early storage of memories, the cerebral cortex, among other structures, plays crucial roles later on5. The exchange between a fast-learning module (the hippocampus) and a slower one (the neocortex) would take place mainly after memory acquisition, allowing one-shot acquisition of new items without losses of older memories because of interference2,3. A further role of slow consolidation following acquisition would be to re-organize memories into more semanticized, de-contextualized representations6-8.
A role for slow-wave sleep (SWS) in such exchange3,5,9-11 would be the replay of neural patterns reflecting previously acquired information12-18. Such sleep replay would then instill a change in the neural substrate of memory traces, and ultimately favor memory consolidation. During sleep, the hippocampus and the neocortex engage in a dialogue which involves and affects the dynamical states of both19-23. Hippocampal sharp waves/ripple complexes (SPWR) are likely vectors for hippocampal-neocortical information exchange24: SPWRs25 are brief (~50-150 ms), large bursts of hippocampal activity, mostly observed during SWS or immobility and correspond to increased hippocampal memory reactivation14. During SWS, neocortical activity displays periods of large, synchronous oscillations (0.1-4 Hz) of membrane potentials and neural firing26, and these are correlated with SPWRs19-21. Slow oscillations were recently found to coordinate episodes of visual cortical and hippocampal reactivation16, but the precise temporal relationship between cortical and hippocampal replay remains unknown.
The prefrontal cortex (PFC) is often implicated in long-term memory consolidation27, in particular for hippocampally-dependent spatial and contextual information. Indeed, the PFC shows detailed, time-compressed replay following initial acquisition of memory-related sequences of neural ensemble activation in rats17 and increased coordination with the hippocampus during retrieval of sleep-consolidated memories in humans28. The PFC is one of the neocortical areas most closely associated with the hippocampus, both anatomically and physiologically, because it has a unique afferent pathway from the hippocampus29 endowed with synaptic plasticity30. Some functional imaging and immediate early gene expression data support the idea that, during consolidation, the contribution of hippocampal activity decreases over time, with an opposite, increasing trend observed for the PFC27,28,31. However, the concerted function of PFC and hippocampus is also necessary for memory maintenance during task performance32,33.
While the behavioral electrophysiology literature provides numerous examples of memory replay12,14-18, the animals in these studies were over-trained and little learning actually took place, or no specific analysis of the evolution of replay with task performance was attempted13. The goal here is to investigate memory reactivation and hippocampal-neocortical interactions while new task-relevant information is actually being acquired, in order to better characterize the link between learning and memory replay processes. Moreover, this study focussed on sub-second resolution of the time course of memory replay, in order to study precise correlations between replay events and large-scale synchronization phenomena during SWS, such as SPWRs and slow oscillations. Indeed, learning-related changes in neural activity over brief time scales have been described in both the prefrontal cortex34,35 and hippocampus, but the effects of these changes on subsequent sleep activity have not yet been studied. We recorded neural activity in the PFC and the hippocampus in rats during a cross-modal rule shift task (known to implicate the medial PFC36), which allowed us to introduce novel elements in the form of new rules, while leaving the perceptual aspects of the task unchanged.
mPFC ensemble patterns during a rule shift task
Multiple tetrodes recorded ensembles of medial PFC (mPFC; see Methods, Supplementary Fig. 2 online) neurons, together with mPFC and hippocampal local field potentials (LFP) in rats. The animals performed a task on a Y maze (Supplementary Fig. 1), in which they had to select the rewarded arm using one of four possible rules (left arm, right arm, illuminated arm, non-illuminated arm; at each trial one of the two target arms was illuminated at random). This period will be referred to as the AWAKE epoch. Neural activity was also monitored during rest periods immediately before and after the AWAKE epoch (PRE and POST epochs). As soon as the rat achieved criterion performance (see Methods) according to the current rule, the rule was changed without any additional cue, and the rat again had to infer the new rule from the pattern of rewarded and non-rewarded arms. Because no pre-training was performed prior to the electrophysiological recordings, during the experiments the rats encountered novel rules, to which they had never been exposed before.
1692 cells were recorded in the mPFC (Supplementary Fig. 2) from four rats, during a total of 63 recording sessions (Rat 15: 16; Rat 18: 11; Rat 19: 12; Rat 20: 24). Only sessions with a minimum of 10 cells and at least four minutes of SWS in each rest epoch were analyzed.
Cells in the mPFC had diverse behavioral correlates, corresponding to one or more task phases, and in some neurons responses dynamically adapted as the rat acquired the current task rule (Battaglia et al, SfN abstract 2006). We used principal component (PC) analysis to extract the neural patterns characteristic of the AWAKE epoch (see Methods, Supplementary Information, Supplementary Fig. 3). High rank principal components, associated with larger eigenvalues, or encoding strengths, will be referred to as signal components, while lower rank, or non-signal, components mostly reflect noise.
Signal components identified neuronal assemblies with reliable and consistent responses in the task. For example, they assigned same-signed weights to cells with similar behavioral correlates, and opposite-signed weights to cells with complementary correlates (Fig. 1A; Supplementary Fig. 4). Based on the eigenvalues associated with the PCs, and on a threshold value computed on the basis of the null hypothesis of random, uncorrelated spike trains, we could typically discriminate 1-6 signal components (and occasionally more) in each session (Fig. 1B). The patterns of activity detected by PCs are correlated with behavior: in this example session, the first PC (PC1) showed a positive peak activation right after trial onset (Fig. 1C), PC2 peaked later in the trial, and PC3 even later on. Moreover, PC1 and PC2 increased their score and PC3 decreased its score as the rat, across trials, abandoned the strategy of always going to the right arm (Fig. 1C, trials indicated with a green background in the right panel), and instead chose, with high probability, to alternate between the two target arms. Thus, PC1, 2, and 3 extracted patterns of activity that correlated both with trial phase and, at a larger time scale, with the strategy that the rat followed in a block of trials.
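The extraction of signal components described above can be illustrated with a minimal sketch. It assumes binned spike counts as input and uses the Marchenko-Pastur upper bound as the threshold for the null hypothesis of uncorrelated spike trains; the text does not name the threshold, so this choice, the function name `signal_components`, and the input layout are assumptions made for illustration only.

```python
import numpy as np

def signal_components(counts):
    """Hypothetical helper: PCA of binned spike counts from one epoch.

    counts: (n_cells, n_bins) array; every cell is assumed to have
    non-zero variance. Returns eigenvalues (descending), eigenvectors
    (one per column) and the number of eigenvalues above the threshold.
    """
    n_cells, n_bins = counts.shape
    # z-score each cell so that z @ z.T / n_bins is the correlation matrix
    z = counts - counts.mean(axis=1, keepdims=True)
    z = z / counts.std(axis=1, keepdims=True)
    corr = z @ z.T / n_bins
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumed null threshold: Marchenko-Pastur upper edge for
    # uncorrelated, unit-variance data
    lam_max = (1.0 + np.sqrt(n_cells / n_bins)) ** 2
    n_signal = int(np.sum(eigvals > lam_max))
    return eigvals, eigvecs, n_signal
```

In this sketch the eigenvalue plays the role of the encoding strength, and the entries of each eigenvector are the per-cell weights referred to in the text.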
FIGURE 1 near here
Transient, synchronized replay of AWAKE patterns
In order to assess the nature and extent of the interaction between prefrontal cortex and hippocampus in memory replay, we characterized the detailed time course of replay during rest episodes. For this, we computed the instantaneous reactivation strength (see Methods, Supplementary Information) of the signal components computed from the AWAKE epoch. At each moment (with a resolution of 100 ms, unless specified otherwise), reactivation strength assesses the similarity between reference AWAKE signal components and the rest period neural activity.
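As an illustration of such a similarity measure, the sketch below computes one plausible form of instantaneous reactivation strength: a quadratic form of the z-scored rest activity with the outer product of a PC weight vector, excluding single-cell (diagonal) terms. The exact definition is given in the Methods and Supplementary Information; the formula, function name, and array layout used here are assumptions.

```python
import numpy as np

def reactivation_strength(weights, z_rest):
    """Hypothetical sketch of per-bin reactivation strength.

    weights: (n_cells,) weight vector of one template component.
    z_rest:  (n_cells, n_bins) z-scored binned spike counts at rest.
    Returns R[t] = sum over i != j of w_i * w_j * z_i[t] * z_j[t].
    """
    weights = np.asarray(weights, dtype=float)
    z_rest = np.asarray(z_rest, dtype=float)
    proj = weights @ z_rest                  # projection onto the template
    diag = (weights ** 2) @ (z_rest ** 2)    # i == j terms to be removed
    return proj ** 2 - diag
```

Removing the diagonal terms means that a single cell firing strongly cannot by itself produce a peak; only co-activation of cells sharing large same-signed weights does.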
FIGURE 2 near here
During POST SWS, signal components reappeared more frequently and strongly than in PRE (e.g., Fig. 2A), confirming that experience-related patterns are reactivated in mPFC17 in ensuing sleep. No such effects were observed in the rest periods that were not classified as SWS, as shown in Supplementary Fig. 5; thus further analyses were restricted to SWS. PRE and POST SWS did not differ in terms of average duration of the sleep episode, average population firing rates, rates of occurrence of delta waves and SPWRs, or local field potential power in the delta and spindle ranges (Supplementary Fig. 6). The average reactivation strength was greater during POST SWS than PRE SWS for signal components (Fig. 2B-C; p < 0.005, all comparisons; N = 10, 40, 273 for the three signal groups observed here, sorted according to their encoding strength; N = 811 for non-signal components). Reactivation strength correlated positively with encoding strength (r2 = 0.61, p<1e-30, Pearson correlation test, N = 323; Fig. 2B; Supplementary Fig. 7). Thus, the patterns most active in the mPFC during AWAKE are preferentially reactivated during the following SWS, similarly to previous observations in the hippocampus18. During PRE SWS, this relationship was significantly weaker with respect to slope (p<1e-20; N = 323) and correlation (p<1e-5; N = 323, Supplementary Fig. 7). These observations were not likely to result from faulty spike sorting, as they persisted when cell pairs discriminated from the same tetrode were ignored. Moreover, cell pairs from the same tetrodes, when considered alone, showed no replay effect, so that virtually all contributions to the replay results come from the correlations between neurons recorded with different tetrodes (Supplementary Fig. 8).
Strikingly, replay occurred in distinct events of strong signal reactivation in POST SWS (Fig. 2A), denoting synchronous transient activation of the cell assemblies identified by the signal components. Histograms of the reactivation strengths for POST SWS were heavy-tailed (Fig. 2D right), with the tail constituting the main difference with PRE SWS (Fig. 2D left). The bulk of the distribution was similar in PRE and POST. This is reflected in the significant difference between POST and PRE in skewness of the signal reactivation strength histograms (p<0.05; t-test, N = 10-40), most markedly for patterns with higher encoding strengths (Fig. 2E). The peaks in reactivation strength correspond to the transient, coordinated activation of the cells that are assigned a large weight in the corresponding principal component (Supplementary Fig. 9). Those cells are the ones that have the greatest contribution to the total reactivation strength. Different PCs recruit different, rarely overlapping sets of high-weight cells. Interestingly, during sleep, reactivation strengths for simultaneously recorded principal components tended not to peak at the same times; rather, concomitant activation of different principal component-related patterns was less than expected by chance, as can be inferred from the zero lag trough in their cross-correlograms (Supplementary Figs 9 and 10). Shuffled controls show that this cannot be explained by global fluctuations in the population firing rate alone (Supplementary Fig. 11).
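The skewness comparison above quantifies how heavy the upper tail of the reactivation strength distribution is. A minimal sketch follows, assuming the standard moment-based sample skewness; the estimator actually used is not specified in this section, and the function name is hypothetical.

```python
import numpy as np

def sample_skewness(x):
    """Moment-based skewness m3 / m2**1.5 (assumed estimator).

    x: 1-D sequence of reactivation strengths from one sleep epoch,
    assumed to have non-zero variance.
    """
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    m2 = np.mean(dev ** 2)   # second central moment
    m3 = np.mean(dev ** 3)   # third central moment
    return m3 / m2 ** 1.5
```

A distribution whose bulk is unchanged but which gains a few very large values, as described for POST relative to PRE, yields a larger positive skewness.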
To assess the prevalence of the strongest transient cell assembly activations, we computed the cumulative contribution to the epoch-wide reactivation of events with reactivation strengths up to certain values for POST SWS, and subtracted the same measure for PRE SWS. This cumulative contribution (Fig. 2F) increased steadily over two orders of magnitude, and 40-50% of the net reactivation (difference between POST and PRE) came from events with reactivation strengths beyond the 99th percentile. Thus, rare events of elevated network synchronization or network spikes37,38, while spanning only a small period of time, account for a substantial proportion of the total observed reactivation.
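The cumulative contribution measure can be sketched as follows. The normalization by the number of time bins, so that the curve ends at the difference in mean reactivation strength between POST and PRE, is an assumption, as are the function names.

```python
import numpy as np

def cumulative_contribution(r, thresholds):
    """Per-bin average of reactivation strength restricted to bins
    whose strength does not exceed each threshold."""
    r = np.asarray(r, dtype=float)
    return np.array([r[r <= th].sum() for th in thresholds]) / r.size

def net_cumulative_contribution(r_pre, r_post, thresholds):
    """POST minus PRE cumulative contribution (assumed normalization)."""
    return (cumulative_contribution(r_post, thresholds)
            - cumulative_contribution(r_pre, thresholds))
```

With this normalization, the share of net reactivation carried by the strongest events is one minus the ratio between the curve at a high threshold and its final value.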
Preferential mPFC replay during hippocampal SPWRs
The standard systems consolidation theory holds that, following acquisition, experience-related information flows from the hippocampus towards the neocortex24. Conversely, neocortical influences may contribute to selecting the pattern reactivated in the hippocampus22. We tested the relationship between mPFC cell assembly reactivation and hippocampal SPWRs, the most prominent pattern of hippocampal activation during SWS (which, like reactivation strength peaks, occur irregularly). Indeed, cortical assembly reactivation events occurred in concert with hippocampal SPWRs. Examining data from entire POST sessions (Fig. 3) reveals that virtually all reactivation peaks occur concomitantly with a SPWR event (and also with an increase in synchronous activity of those cells with large positive weights in this signal component). In the example in Fig. 4, the ensemble spike trains corresponding to a reactivation peak are shown (red ticks in Fig. 4E): at the time of the peak, virtually all cells with large positive weights in this signal component were active (and negative weight cells reduced their activity). In this example, the two largest peaks (Fig 4A) coincided with SPWR events (red asterisks in Fig 4B), and one preceded a delta wave (Fig. 4C).
FIGURES 3, 4 near here
The average reactivation strength in POST SWS was considerably greater for bins coinciding with SPWRs than for non-SPWR bins (all comparisons p<0.005; N = 8, 37, 225; including only sessions with reliable discrimination of ripple signals; Fig. 5A). The effect was stronger for components with higher encoding strengths (Pearson's correlation test, p<1e-20, Supplementary Fig. 7). The SPWR-triggered average (Fig. 5B) showed that during POST (but not PRE) SWS, reactivation strength for signal components increased by ~70% at the time of the sharp waves with respect to baseline (p<1e-10, N=270). Reactivation in mPFC declined to baseline values within 1 s before and after the peak of the SPWR events. No such effect was found for non-signal components. A similar analysis at higher time resolution (Fig. 5C) showed that reactivation peaked ~40 ms after SPWR occurrences, which is compatible with the transmission delay measured for prefrontal responses to hippocampal stimulation39 (the second peak in the event-triggered histogram is likely due to the frequent occurrence of sharp wave “doublets”). On the other hand, overall ensemble mPFC activity (of all recorded neurons, including those not involved in signal components) showed a qualitatively different, sharply asymmetric profile with respect to SPWR occurrences (Fig. 5D): on average, prefrontal population activity transiently increased with the SPWRs, and maintained sustained activity thereafter20,21 (with no difference between PRE and POST). This sustained post-SPWR activity contrasts with the faster decay of signal reactivation, arguing against an explanation of the latter solely in terms of general population activity fluctuations. Furthermore, autocorrelograms of both reactivation strength and SPWR occurrences (Fig. 5E) decay with very similar time constants (respectively 150 and 160 ms for exponential fits), suggesting that the clustering in time of SPWRs is reflected by a similar grouping of reactivation events.
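An event-triggered average of the kind used here can be sketched as below. The sketch assumes that reactivation strength is sampled in regular time bins and that SPWR times have already been converted to bin indices; the function name and the handling of events near the edges of the recording are assumptions.

```python
import numpy as np

def event_triggered_average(signal, event_bins, half_window):
    """Average of `signal` in a window of +/- half_window bins around
    each event. Events whose window would extend past either end of
    the recording are dropped (assumed edge handling); at least one
    event must remain.
    """
    signal = np.asarray(signal, dtype=float)
    events = np.asarray(event_bins, dtype=int)
    lags = np.arange(-half_window, half_window + 1)
    inside = (events >= half_window) & (events < signal.size - half_window)
    events = events[inside]
    segments = signal[events[:, None] + lags[None, :]]  # (n_events, n_lags)
    return lags, segments.mean(axis=0)
```

A peak at a positive lag, as in the ~40 ms delay reported above, indicates that the signal tends to rise after the triggering event.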
FIGURE 5 near here
Relation of slow oscillations with SPWRs and mPFC replay
A hallmark of cortical activity during SWS is slow oscillations26, which trigger and orchestrate LFP waves in the delta (2-4 Hz) and spindle (10-20 Hz) ranges. Reactivation episodes in the hippocampus and neocortex coincide with the slow oscillation phase with high neural activity16 (UP state) and are correlated with hippocampal SPWRs, but little is known about the precise temporal relation between cortical oscillatory phenomena, hippocampal activity and neocortical reactivation. Fig. 2A shows the relation between mPFC reactivation and SWS oscillations: episodes of strong replay were significantly concentrated into periods of elevated prefrontal LFP oscillatory activity in the delta (2-4 Hz) and spindle (10-20 Hz) ranges (p < 1e-5 for all, t-test; Supplementary Fig. 5). Thus, we tested the correlation between reactivation strength and LFP markers of slow oscillations. First, we considered delta waves, large positivities of the depth cortical LFP, associated with states of reduced cortical activity (DOWN states), and with the K-complex phase characterized by absence of spindles40. During POST SWS, reactivation strength for signal components showed a significant (p < 0.001, t-test) increase ~400 ms prior to the peak of the delta wave (Fig. 6A top). This was experience-dependent and possibly memory related, since the modulation was smaller for PRE SWS and null for non-signal components. The timing of hippocampal SPWRs relative to delta peaks closely resembled that of mPFC reactivation (Fig. 6A middle). In contrast, mPFC ensemble activity showed a different profile (Fig. 6A bottom) with a minimum immediately prior to the peak of the delta wave, but symmetric peaks before and after (with a return to baseline in 500-1000 ms). The second peak was not associated with an increase in reactivation.
To further investigate this relationship, the same analysis was performed with times of putative DOWN to UP state transitions (Fig. 6B) (putative DOWN states were defined as a decrease of neural activity in windows of at least 80 ms). The relations of reactivation and SPWR occurrence with these transitions were comparable to the delta wave results.
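The detection of putative DOWN to UP transitions can be illustrated with the sketch below. Only the 80 ms minimum duration comes from the text; the 10 ms bin size, the interpretation of "decrease of neural activity" as a population spike count at or below `max_count`, and the function name are assumptions.

```python
import numpy as np

def down_to_up_transitions(pop_counts, bin_ms=10.0, min_down_ms=80.0,
                           max_count=0):
    """Return the index of the first bin after each putative DOWN state.

    pop_counts: 1-D population spike counts per time bin.
    A DOWN state is a run of bins with count <= max_count (assumed
    criterion) lasting at least min_down_ms. Runs that reach the end
    of the recording are dropped because no transition is observed.
    """
    quiet = np.asarray(pop_counts) <= max_count
    # Pad with False so that runs touching the edges get a start and an end
    padded = np.concatenate(([False], quiet, [False]))
    change = np.diff(padded.astype(int))
    starts = np.flatnonzero(change == 1)    # first quiet bin of each run
    ends = np.flatnonzero(change == -1)     # first bin after each run
    long_enough = (ends - starts) * bin_ms >= min_down_ms
    keep = long_enough & (ends < quiet.size)
    return ends[keep]
```

The returned bin indices can then serve as trigger events for averaging reactivation strength or SPWR occurrence around the transitions.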
Because spindles (bouts of 10-20 Hz oscillations) appear at the onset of UP states41, we examined their correlation with reactivation strength. In general, signal reactivation tended to occur before spindle episodes. Reactivation event-triggered averages centered on spindle troughs are asymmetric, with an increase in reactivation in the ~1 s preceding spindles compared to the period thereafter (p < 0.001, t-test; Fig. 6C top). As is the case for delta waves, the increased pre-spindle reactivation over a broad time scale echoes the increased probability for hippocampal sharp waves preceding spindles (Fig. 6C middle). In contrast, the population activity modulation showed a symmetrical time course peaked at the time of spindle events (Fig. 6C bottom).
The respective cross-correlograms and Event-Time Averages of reactivation relative to these three cortical events were strongly correlated (Fig. 6D). Thus, coupling between reactivation and sharp waves primarily structured the relationship between the reactivation time course and cortical slow oscillations.
FIGURE 6 near here
Salient behavioral events, rule learning increase replay
The PC analysis that characterized the time course of experience-related pattern reactivation during sleep may conversely be employed to find out which aspects of the neural assembly coactivations during task performance are replayed during sleep. For this, PCs were computed from the ensemble neural activity during sleep PRE and POST, separately for SPWR and inter-SPWR time bins. Those templates were matched to the activity during the AWAKE period. For a significant number of components extracted from SPWR bins (Fig. 7A-B; Supplementary Fig. 12), co-activations became stronger as the rat started a run of correct trials, signaling rule acquisition. This difference was not significant for PCs computed from the inter-SPWR intervals in POST, or from PRE. This was not simply due to the elapsed time during the session, as there was no such difference between the two halves of those sessions where no rule learning occurred.
Furthermore, the PCs computed from POST SPWR appeared primarily when the rat was on the central platform of the Y maze, that is, the point where it was required to select the behavioral response (Fig. 7A,C). A significant effect of learning on the spatial distribution of PCs from POST SPWR appeared only in the part of the maze going from the central platform to the end of the target arm (Fig. 7C). Moreover, a factor analysis of these spatial distributions revealed that the two most important factors are concentrated on the platform and on the target arm respectively (Fig. 7D; see also Supplementary Information).
Rule acquisition was not accompanied by a change in the principal measures of rat behavior, including trial duration, trajectories (which followed the same stereotyped paths before and after rule acquisition), and running speed at each point of the trajectory (Supplementary Fig. 12). Moreover, the greater contribution to reactivated patterns during SPWRs from trials occurring right after rule acquisition is not likely to be due to changes in the general sensory experience other than reward. To test this hypothesis, we compared trials occurring before and after spontaneous strategy shifts by the rat that did not lead to acquisition of the rewarded strategy, on days when no learning took place. Rats made this sort of shift while seeking the correct rule by trial and error. In these cases, no difference was observed in contribution to reactivated patterns during SPWRs (Fig. 7B; Supplementary Fig. 13; Supplementary Information).
FIGURE 7 near here