Dynamic perceptual feature selectivity in primary somatosensory cortex upon reversal learning

ABSTRACT Neurons in primary sensory cortex encode a variety of stimulus features upon perceptual learning. However, it is unclear whether the acquired stimulus selectivity remains stable

when the same input is perceived in a different context. Here, we monitor the activity of individual neurons in the mouse primary somatosensory cortex during reward-based texture

discrimination. We track their stimulus selectivity before and after changing reward contingencies, which allows us to identify various classes of neurons. We find neurons that stably

represent a texture or the upcoming behavioral choice, but the majority are dynamic. Among those, a subpopulation of neurons regains texture selectivity contingent on the associated reward

value. These value-sensitive neurons forecast the onset of learning by displaying a distinct and transient increase in activity, depending on past behavioral experience. Thus, stimulus

selectivity of excitatory neurons during perceptual learning is dynamic and largely relies on behavioral contingencies, even in primary sensory cortex.

INTRODUCTION The mammalian

cortex encodes a myriad of sensory signal characteristics which are represented by neuronal assemblies, each with a preference for specific stimulus parameters1,2. It is believed that these

assemblies are organized in a hierarchical fashion. First-order sensory areas encode lower-order stimulus features, such as texture coarseness3,4,5, object orientation and direction6, and

sound frequency7, whereas more complex features and contextual aspects of a stimulus are encoded by higher-order cortices8,9,10,11. Nonetheless, the coding in primary sensory cortices can

exhibit higher levels of complexity, expressing non-sensory-related signals such as attention12, anticipation13, and behavioral choice11,12,13,14,15. Reward-based perceptual learning

initially shapes the stimulus selectivity and response properties of primary sensory neurons, which may contribute to a reliable detection of particular features, and thereby improve

perception13,16,17,18. However, it is unclear as to whether the stimulus preference of those neurons remains stable when the reward contingencies are changed. To study this, we monitor the

shaping of stimulus selectivity for primary somatosensory cortical (S1) layer 2/3 (L2/3) neurons in mice that learn to discriminate between a rewarded and non-rewarded texture. We then

reassess their selectivity upon reversal learning, which reveals a substantial subset of neurons that dynamically represents textures. Many lose or gain selectivity. Yet another class, which

we term value-sensitive neurons, first lose and then regain texture selectivity contingent on the associated reward. The ramping up of this selectivity forecasts the onset of learning.

RESULTS TEXTURE SELECTIVITY OF L2/3 NEURONS INCREASES WITH LEARNING We trained mice on a head-fixed ‘Go/No-go’ texture discrimination task, similar to previous designs5 (Fig. 1a). Thirsty

animals were prompted to lick a spout during the 2-s presentation of a piece of P120 sandpaper (125-μm grit size; rewarded texture), in order to trigger the delivery of a

water reward at the end of the presentation period (scored as a ‘hit’ trial; Fig. 1a and Supplementary Fig. 1). A failure to lick was scored as a ‘miss’ trial. The animals needed to withhold

from licking upon presentation of a P280 sandpaper (52-μm grit size; non-rewarded texture) to avoid a 200-ms white noise and a 5-s timeout period (scored as a ‘correct reject’ trial). A

failure to withhold from licking was scored as a ‘false alarm’ trial (Fig. 1a and Supplementary Fig. 1). Mice learned to discriminate between the two stimuli (Fig. 1b). They typically

started at chance level (naïve) and reached an average performance level of 82% within 3–7 days (expert mice) (Fig. 1b)13,14,15,19. To verify that the task was whisker-dependent and involved

the cortex, we trimmed the whiskers ipsilateral to the texture, or suppressed contralateral cortical activity using a local injection of the γ-aminobutyric acid receptor (GABAR) agonist

muscimol in separate sets of expert mice (see Methods). Both treatments reduced the performance to chance level (Fig. 1c, d). This indicates that to solve this task mice fully rely on

somatosensory input and do not use additional sensory information, and that the task involves signal processing through S1. In order to monitor the activity of S1 neurons during texture

discrimination learning, we co-expressed the genetically encoded calcium sensor GCaMP6s and the cell filler mRuby2, predominantly in excitatory L2/3 neurons using adeno-associated viral

vectors (Fig. 1e, Supplementary Fig. 2)20. Single-cell calcium signals were recorded using two-photon laser scanning microscopy (2PLSM; Fig. 1e, f). Fast-volumetric imaging was performed to

allow for the correction of axial motion artifacts (Fig. 1e, Methods section)21. Similar to previous studies5,14, a fraction of the neurons displayed a differential response to the textures

(Fig. 1f and Supplementary Fig. 3). In order to determine the texture selectivity of individual neurons during learning we compared the calcium signal amplitudes evoked by the two different

sandpapers using a receiver-operating characteristic (ROC) curve analysis. This provided a discrimination index for each neuron (DI; Fig. 1g, Methods section)22. On average, the fraction of

selective neurons increased with learning (Fig. 1h). Interestingly, we observed that in expert mice, a larger fraction of the recorded population was selective for the P120 (rewarded) as

compared with the P280 (non-rewarded) texture (Fig. 1i) and that this difference built up with learning (Fig. 1j). What could be the cause of the increase in selectivity bias during

learning? One explanation holds that the neuronal responses strictly correlate with the different behaviors the animals exhibit during Go and No-go trials, which emerges with learning

(Supplementary Fig. 1). In that case, the neuronal activity could be linked to the motor-output that is associated with licking, and not exclusively to the presented texture. Alternatively,

L2/3 neurons could encode higher-order features that are associated with the textures (such as the reward value or the behavioral choice). To explore these possibilities, we first conducted

experiments that allowed us to categorize neurons based on their activity in relation to the animal’s licking and whisking behavior, and then we reassessed their selectivity after inverting

the reward-contingencies. NEURONAL ACTIVITY REPRESENTS SENSORY INPUT We first investigated the possibility that the P120-selective neurons were merely reporting licking, by comparing for all

hit trials, the delay between the onset of the calcium signal and the time of texture presentation or the time of the 1st lick. For the majority of neurons, the rise in the calcium signal

occurred immediately after the texture presentation and preceded the 1st lick with a larger jitter (Fig. 2a–c). This suggests that the activity of the P120-selective neurons was evoked by

the texture and not by licking. However, this analysis did not exclude the possibility that selectivity had been influenced by an increasingly stereotyped behavioral sequence during

learning, including whisking. To dissociate sensory-evoked neuronal activity from activity that was primarily related to whisking or licking we exposed mice to the various task-related

stimuli before the training had started. The stimuli were presented separately and without a temporal structure (Fig. 3a). We also monitored the animal’s whisking and licking behavior.

Together, this allowed us to categorize neurons based on their activity in relation to the sound cue, texture presentation, as well as whisking and licking behavior. We found that a large

fraction of neurons (36.7% of the total population) exhibited touch-related activity during texture presentation while few neurons were sensitive to the auditory cue (0.8%; Fig. 3b). Within

the pool of touch-sensitive neurons there was no bias in texture selectivity (Fig. 3c). This suggests that the imaged population was not a priori preferring any of the two textures, which is

in line with previous work4. Then we determined whether neurons showed whisking- or licking-related responses. We trained a random forest machine-learning model using the inferred firing

rates from the calcium signal to assess for each neuron if its activity could predict whisking and/or licking rates. The model was trained using a range of positive and negative time lags of

the neuronal activity relative to behavior, in order to account for possible pre-motor related activity (i.e. preceding the behavior) and/or sensory-related activity (i.e. following the

behavior). For each neuron we calculated the prediction power (PP), which reflected the correlation between the animals’ actual whisking and licking behavior, and the behavior that was

predicted by its activity (Fig. 3d). We plotted the PP distributions for whisking and licking rates as inferred from the GCaMP6s signal. This was compared to a control distribution that was

inferred from the mRuby2 signal to assess the noise in PP measurement (Fig. 3e, f). Neurons with a PP over a threshold criterion of five standard deviations above the mean of the control

distribution were considered to be predictive of whisking and/or licking. We found that 9.4% of the neurons were partially predicting the animal’s whisking rate whereas only 2% predicted

licking rates (Fig. 3e–g). We then compared the resulting categories with the selectivity that the neurons displayed in the subsequent texture discrimination task. Most of the neurons that

were found to be selective after training had formerly been categorized as undefined or reporting touch (88%; Fig. 3g). Altogether, these data strongly suggest that the stimulus-selective

neurons did not exclusively signal whisking or licking behavior during the task. Moreover, only 11% of the P120-selective neurons were predicting the animal’s whisking rate and 0% the

licking rate. Thus, the biased increase in P120 selectivity during texture discrimination learning could not be explained by mere changes in the animal’s whisking or licking behavior.
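
Two of the criteria used above lend themselves to a short illustration: the ROC-based discrimination index (DI) and the threshold on prediction power (PP). The following Python sketch is not the authors' code. It assumes that the DI is the area under the ROC curve rescaled to the range −1 to 1, with positive values arbitrarily assigned to P120-preferring neurons, and that a neuron counts as predictive when its PP exceeds the mean of the mRuby2 control distribution by five standard deviations, as stated in the text. All function and variable names are hypothetical, and no significance test for the DI is included.

```python
import numpy as np


def roc_auc(responses_a, responses_b):
    """Area under the ROC curve for separating condition A from condition B.

    Computed as the probability that a randomly drawn response from A is
    larger than a randomly drawn response from B (ties count as one half).
    """
    a = np.asarray(responses_a, dtype=float)
    b = np.asarray(responses_b, dtype=float)
    diff = a[:, None] - b[None, :]  # all pairwise A-minus-B differences
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size


def discrimination_index(resp_p120, resp_p280):
    """Rescale the AUC to [-1, 1] (assumed convention: DI = 2 * AUC - 1).

    Positive values indicate larger responses to P120, negative values
    larger responses to P280 (sign convention assumed for illustration).
    """
    return 2.0 * roc_auc(resp_p120, resp_p280) - 1.0


def predictive_mask(pp, pp_control, n_sd=5.0):
    """Flag neurons whose PP exceeds mean + n_sd * SD of the control PPs."""
    control = np.asarray(pp_control, dtype=float)
    threshold = control.mean() + n_sd * control.std()
    return np.asarray(pp, dtype=float) > threshold


# Hypothetical single-trial response amplitudes for one neuron.
di = discrimination_index([3.0, 4.0, 5.0], [0.0, 1.0, 2.0])

# Hypothetical PP values (GCaMP6s) and control PP values (mRuby2).
mask = predictive_mask([0.05, 0.3], [0.0, 0.02, -0.02, 0.01, -0.01])
```

The same `roc_auc` helper could be applied to lick versus no-lick trials instead of P120 versus P280 trials, which corresponds to the choice index introduced later in the text.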

TEXTURE SELECTIVITY IS DYNAMIC UPON REVERSAL LEARNING Studies using comparable paradigms have reported that S1 neurons exhibit selectivity not only for the tactile stimulus but also for the

behavioral choice5,14,15. In order to test this, we uncoupled the behavioral choice from the respective textures by inverting the reward contingencies. This allowed us to assess which

neurons were persistently selective for a given texture, and which were dynamic. To this end, training of expert mice continued on the same textures, but now the detection of the P280

texture was rewarded and the P120 texture was not (Fig. 4a). Upon reversal the performance initially dropped to chance level (the post-reversal naïve phase; Fig. 4b) before it reached the

expert criterion again within 2–4 days (the post-reversal expert phase). In the post-reversal naïve phase, the neuronal population’s average DI remained of the same sign as compared with the

pre-reversal expert phase. However, we observed an inversion of the DI’s sign in the post-reversal expert phase (Fig. 4c), indicating that many neurons had changed their texture selectivity

during reversal learning. By comparing the DI of each neuron over expert sessions before and after reversal we could define a variety of neuronal classes, including those that remained

selective for the same texture (4%; e.g. neuron 1 in Fig. 4d), those that reversed their selectivity to the other texture (and thus invariably reported textures contingent on the associated

reward, 8%; e.g. neuron 2 in Fig. 4d), and those that had lost (19%) or gained (18%) selectivity altogether (Fig. 5a–c). Overall, the population regained a selectivity bias for the rewarded

texture (Fig. 5c, d). The changes in selectivity could be the result of network plasticity. To assess this, we calculated the level of co-fluctuation in spontaneous activity within the

groups that had lost or gained selectivity, which may reflect the level of mutual connectivity23,24,25. Upon reversal, the level of co-fluctuation increased for gained neurons and decreased

for lost neurons (Fig. 5e). This may indicate that reversal learning promotes the rewiring of local synaptic circuits. We also checked whether the various classes correlated with the

animal’s whisking or licking behavior. We found no difference in the average calcium signal for any of the classes above when comparing trials for which the animal displayed high whisking or

licking rates with low-rate trials (Fig. 5f, g and Supplementary Fig. 4a, b). This result is in line with the decoding model (Fig. 3) and indicates that the dynamics in selectivity observed

after reversal learning cannot be attributed to alterations in whisking and licking. Altogether, the reversal learning experiment shows that texture selectivity of L2/3 neurons in S1 is

largely dynamic, with a fraction of neurons reversing their texture selectivity congruent with the reward contingency. This suggests that although for some neurons selectivity is determined

solely by the texture attributes of the stimuli, for many others it is shaped by higher-order features that are associated with the stimuli. SELECTIVITY REVERSAL IS ASSOCIATED WITH CHOICE OR

REWARD What determines the selectivity dynamics in the class of neurons that followed the textures’ reward contingencies? We envisioned two possibilities. Neurons could persistently report

the upcoming choice5,14,15, independent of reversal learning. Alternatively, neurons could gradually update their texture selectivity during reversal learning, congruous with the associated

reward. The latter neurons would therefore signal the texture value rather than upcoming behavioral choice, as seen in other brain areas8,9,26,27. To address this, we tracked the responses

of the reversibly selective neurons according to the trial outcome (hits, misses, FAs, and CRs) throughout the reversal learning process. We distinguished three learning phases: pre-reversal

expert, post-reversal naïve, and post-reversal expert. Upon reversal of the reward contingencies, some neurons showed persistently larger responses during hit and FA trials as compared with

miss and CR trials (e.g. Neuron 1 in Fig. 6a; Supplementary Fig. 5). Other cells exhibited larger responses in hit and miss trials during the pre-reversal expert phase, then showed larger

responses in FA and CR trials during the naïve post-reversal phase, and finally regained response strength in hit and miss trials during the expert post-reversal phase (e.g. Neuron 2 in Fig.

6a; Supplementary Fig. 5). Thus, whereas the former neuron stably preferred a texture congruent with the final action-selection (i.e. choice) throughout all phases, the latter neuron

updated its selectivity during re-learning, possibly based on the reward-outcome that was associated with the texture (i.e. value). The difference between those two neurons became most

striking during the post-reversal naïve phase in which the animals typically abandoned their previous behavioral strategy and made inconsistent choices. This allowed us to parse out from the

class of reversibly selective neurons those whose selectivity was conforming to the animal’s upcoming choice to lick or not to lick (i.e. choice neurons) or conforming to the texture’s

associated reward value (i.e. value neurons). To quantitatively parse the different types of selectivity, we calculated a choice index (CI) for each neuron. Similar to the DI, this was based

on a ROC curve analysis, but now comparing the response amplitudes between lick and no-lick trials (Fig. 6b). This analysis confirmed the existence of the two subclasses (Fig. 6c), one for

which the CI remained stable throughout the naïve phase after reversal (choice neurons), and one for which the CI was altered (value neurons). For both classes, the calcium signals did not

correlate with the whisking and licking rates (Fig. 6d and Supplementary Fig. 4c, d). In addition, only a few neurons in both classes had previously been categorized as being predictive for

whisking + licking, similar to the other classes of neurons (Fig. 6e). This confirms that the selectivity dynamics (or lack thereof) in choice and value neurons could not be attributed to

alterations in whisking or licking. Altogether, this shows that reversibly selective neurons could be sub-divided into two classes: neurons that signaled the stimulus congruent with the

animal’s upcoming choice and neurons that reported the contextual stimulus value (Fig. 7a). To illustrate the differences between these classes, we provide examples of the temporal evolution

of the DI and CI throughout reversal learning for a choice neuron and a value neuron from the same animal (Fig. 7b). In line with our previous analysis (Fig. 6c), the DI of both neurons

showed a relatively similar temporal profile, with an initial drop after reversal and a gradual inversion during re-learning. On the other hand, the CI of the choice neuron remained positive

throughout the reversal learning phases, whereas the CI of the value neuron did not. Notably, for the value neuron the inversion of the DI seemingly occurred tens of trials before the

animal’s performance started to increase, whereas for the choice neuron the inversion coincided sharply with the increase in performance. VALUE NEURONS DISPLAY ERROR HISTORY ACTIVITY DURING

LEARNING Based on the preceding observations, we hypothesized that the gradual reacquisition of texture preference by the value neurons carries a signal that predicts the upcoming

improvement in the animal’s texture discrimination performance. Such a signal might consist of distinct response amplitudes during certain trials, which could depend on whether the animal

had previously made correct or incorrect choices26,28,29,30. Previous work suggests that a correct trial that follows an incorrect trial is considered more instructive for the animal than

two consecutive correct trials8,9,26. To test this, we focused our analysis on those consecutive trials in which mice were actively licking upon texture presentation (i.e. hits and FAs),

hence ensuring that they were engaged in the task. We compared the mean response amplitudes of hit trials that were preceded by a FA trial ($R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{FA}}} \right)}$)

to those that were preceded by a hit trial ($R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{hit}}} \right)}$) (Fig. 8a). All trials across

mice were aligned to the point at which the reversal learning had reached the expert criterion (Fig. 8b, Supplementary Fig. 6, and Supplementary Table 1). Averaged hit and FA rates over a

200-trial rolling window separated from one another at ~140 trials before the expert criterion. This point indicated the moment at which mice started to improve their performance, which we

defined as the learning onset (Fig. 8c, black arrowhead). For non-selective neurons as well as choice, texture, gained, and lost selectivity neurons, we did not observe any difference

between the $R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{hit}}} \right)}$ amplitudes and the $R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{FA}}} \right)}$ amplitudes (Fig.

8d). In contrast, for the contextual value neurons, the average $R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{FA}}} \right)}$ response amplitudes became larger than the

$R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{hit}}} \right)}$ amplitudes, at ~260 trials before the expert criterion, and ~120 trials before learning onset (Fig. 8c, d, red

arrowhead). The two types of responses became similar again when mice performed above the expert criterion. During this interval, we did not observe a change in the sampling strategy of the

texture, confirming that the difference in responses is not associated with changes in licking and/or whisking rates (Supplementary Fig. 7). We used the normalized difference between

$R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{FA}}} \right)}$ and $R_{{\mathrm{hit}}\left( {{\mathrm{post}}\,{\mathrm{hit}}} \right)}$ responses as an error history index (Fig.

8e), and observed that a large fraction of the value neurons exhibited a transient increase in the error history as compared to the other neuronal classes that we had identified. Such an

anticipation of the learning onset could not be deduced from the DI evolution (Supplementary Fig. 8). Altogether, these results indicate that the change in texture preference of value

neurons carries a signal that is indicative of the upcoming improvement in discrimination, i.e. learning. DISCUSSION Previous studies indicate that reward-based perceptual learning increases

the reliability and selectivity of neuronal responses in primary sensory cortices. As a consequence, the neuronal population that represents the relevant sensory stimuli stabilizes, which

may improve perception13,16,17,18,31. We extend on this work by tracking the stimulus feature selectivity of neurons in mouse S1, first during learning of a Go/No-go texture discrimination

task (Fig. 1), and subsequently upon reversal learning of the task (Figs. 4 and 5). We found that during learning, the population of neurons displaying selectivity for the rewarded texture

became increasingly larger than for the non-rewarded texture (Fig. 1h–j). This finding agrees with previous studies describing that response selectivity is shaped by the behavioral choice of

the animal5,11,14,15. Using the reversal learning paradigm we then showed that whereas a small population of neurons can stably encode a texture, a large fraction loses, gains, or first

loses and then regains selectivity for a texture when the reward contingencies are reversed (Figs. 4 and 5). This implies that a simple alteration of reward contingencies can disorganize a

pre-established selectivity map in S1, which is then extensively reshaped with relearning. The reshaping of this map could be the result of plasticity mechanisms that also underlie the

experience-dependent tuning of neuronal response properties in primary sensory cortex20. In this case, Hebbian plasticity may drive the phenomenon, with the result that similarly tuned

neurons become more strongly interconnected23,24,25. This is supported by our finding that both the neurons that gain selectivity and those that lose selectivity show higher co-fluctuation in

spontaneous activity during the time they are selective (Fig. 5e). The reshaping of the selectivity map during reversal learning is remarkable, since the lower-order sensory features that

are embodied in the textures had not changed. Thus, in principle, the capacity of the S1 neuronal population to discriminate those lower-order sensory features did not need to be modified in

order for the mouse to resolve the altered reward contingencies. Nonetheless, the finding is congruent with the idea that learning continuously optimizes sensory representations in cortex,

and that this strongly depends on the stimulus context2,13,14. In our study, the reward contingency could represent an important aspect of the context that modulates sensory representations.

Indeed, the selectivity dynamics in the neuronal population upon reversal learning suggests that neurons in S1 do not solely represent lower-order sensory features. Instead, they seem to

selectively report the association between a lower-order stimulus feature and a paired higher-order feature, such as the reward. In Go/No-go tasks the reward is tightly coupled to the

animal’s choice for licking. Thus, the selectivity for the texture-reward coupling could merely represent the encoding of the upcoming behavioral choice. The reversal learning paradigm

allowed us to assess the stability of the neuronal responses for this coupling, e.g. whether the initial P120-selective neurons stably respond to the animal’s choice, even during the

post-reversal naïve phase, or whether they lose selectivity shortly after reversal and then rebuild it with relearning (Figs. 6 and 7). We found that more than half of the P120-selective

neurons belonged to the latter class. Thus, their sensory responses were transiently uncoupled from the animal’s choice, and primarily depended on whether the presented texture was

associated with the upcoming reward (or not), i.e. the value of the texture. In future experiments it will be interesting to test whether repeated reversal learning continues to renew the

selective population, or whether the population reverts back to the original response configuration. Previous studies indicate that the value of a sensory stimulus is encoded by higher-order

areas such as the posterior parietal, orbitofrontal, and retrosplenial cortices8,9,26,27. Our data show that value-encoding is also an attribute of a population of neurons in S1. The

instructive cues for this selectivity could be manifold. For example, they could be provided by direct feedback from the aforementioned higher-order cortical areas, or they could be derived

from sub-cortical areas that are implicated in attention and behavioral updating during learning32,33. Modulatory reinforcement signals that are associated with behavioral outcome could also

play a major role33,34,35. Indeed, reward-related response modulation has been observed in S1 (ref. 28), and was found to promote cortical plasticity processes related to visual response tuning in

primary visual cortex16. We found that the value neurons gradually regained their preference for the rewarded texture with relearning, which would be congruent with the idea that

reward-related plasticity mechanisms contribute to shaping perceptual representations in cortex. At this point it is not clear if the value neurons constitute a specific subpopulation of

L2/3 neurons. Since we used an AAV expression cassette with a generic promoter, the population of value neurons could theoretically contain interneurons. L2/3 of S1 contains various types of

interneurons of which vasoactive intestinal peptide (VIP)-positive interneurons have been shown to be implicated in shaping neuronal responses35,36,37,38 and cortical plasticity39,40. It is

tempting to speculate that the reward-related response modulation that we observed is conveyed by VIP interneurons33,41. We also found that value neurons transiently displayed enhanced

response amplitudes dependent on the animal’s behavioral error history (Fig. 8). During the naïve reversal phase these neurons showed higher responses in hit trials if the hit trial was

preceded by a false alarm trial. This phenomenon was prominent during the transition from the naïve to expert reversal phase and forecasted the increase in behavioral performance. We

speculate that the omission of reward-associated signals during a false alarm trial directs the animal’s attention towards the newly rewarded texture. Elevated attentional signals have been

shown to modulate sensory-driven responses in visual cortex42. Thus, the attentional signals may be read out by the value neurons, which in turn reshape the texture selectivity of

surrounding neurons. Together, this may enhance sensory perception. METHODS ANIMALS C57BL/6J male mice (Janvier Labs) aged 6 weeks were group housed on a 12-h light cycle (lights on at 8:00

a.m.) with littermates until surgery. Two weeks after surgery, mice were kept under standardized conditions at the animal facility of the University of Geneva, with an inverted light–dark cycle

7–8 days before the first training session. The behavioral experiments were performed during the dark phase. All procedures were conducted in accordance with the guidelines of the Federal

Food Safety and Veterinary Office of Switzerland and in agreement with the veterinary office of the Canton of Geneva (licence numbers GE/28/14, GE/61/17, and GE/74/18).

SURGERY AND INTRINSIC OPTICAL IMAGING Stereotaxic injections of adeno-associated viral (AAV) vectors

were carried out on 6-week-old male C57BL/6 mice. A mix of O2 and 4% isoflurane at 0.4 L min−1 was used to induce anesthesia followed by an intraperitoneal injection of MMF solution,

consisting of 0.2 mg kg−1 medetomidine (Dormitor, Orion Pharma), 5 mg kg−1 midazolam (Dormicum, Roche), and 0.05 mg kg−1 fentanyl (Fentanyl, Sinetica) diluted in sterile 0.9% NaCl.

AAV1-hSyn1-mRuby2-GSG-P2A-GCaMP6s (Penn Vector Core; 100 nl)20 was delivered to L2/3 of the right barrel cortex in S1 at the approximate location of the C2 barrel-related column (1.4 mm

posterior, 3.5 mm lateral from bregma, 300 µm below the pia). For long-term in vivo calcium imaging, a 3-mm diameter cranial window was implanted, as described previously43. Two weeks after

surgery, the C2 barrel column was mapped again using intrinsic optical imaging to confirm the location of mRuby2/GCaMP6s expression. To do this, a mix of O2 and 4% isoflurane at 0.4 L min−1

was used to induce anesthesia followed by an intraperitoneal injection of MM solution consisting of 0.2 mg kg−1 medetomidine and 5 mg kg−1 midazolam diluted in sterile 0.9% NaCl. The C2

whisker was inserted into a capillary connected to a piezo actuator. Intrinsic signal was collected during repeated whisker stimulation (1 s at 8 Hz). A 100-W halogen light source connected

to a light guide system with a 700-nm interference filter was used to illuminate the cortical surface through the cranial window. Reflectance images 300 µm below the surface were acquired

using a ×2.7 objective and the Imager 3001F (Optical Imaging, Mountainside, NJ) equipped with a 256 × 256 pixels array charge-coupled device (CCD) camera (using VDaq software). The built-in

Imager 3001F analysis program (Winmix software) was used to visualize the responses and produce an intrinsic signal image by dividing the stimulus signal by the pre-stimulus baseline signal.
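
The ratio image described above can be illustrated with a minimal sketch. It is not the Winmix implementation, whose internals are not described here. It assumes that frames are averaged within the pre-stimulus and stimulus periods before the pixel-wise division, and that baseline pixels are nonzero; the function name and array layout (frames × height × width) are hypothetical.

```python
import numpy as np


def intrinsic_signal_image(baseline_frames, stimulus_frames):
    """Pixel-wise ratio of stimulus reflectance to pre-stimulus reflectance.

    Both inputs are arrays of shape (n_frames, height, width). Frames are
    averaged over time first (an assumption), then divided pixel by pixel.
    Values below 1 mark pixels whose reflectance dropped during stimulation.
    """
    baseline = np.asarray(baseline_frames, dtype=float).mean(axis=0)
    stimulus = np.asarray(stimulus_frames, dtype=float).mean(axis=0)
    return stimulus / baseline  # assumes no zero-valued baseline pixels


# Hypothetical 2 x 2 pixel example: one pixel darkens by 1% during stimulation.
baseline = np.full((4, 2, 2), 100.0)
stimulus = baseline.copy()
stimulus[:, 0, 0] = 99.0
ratio_image = intrinsic_signal_image(baseline, stimulus)
```

In practice such a ratio image would be averaged over the repeated whisker stimulations before being overlaid on the vasculature image.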

An image of the vasculature was then acquired using a 546-nm interference filter, and superimposed on the intrinsic signal image. This reference image was used later to select an

appropriate field of view (FOV) using 2PLSM. After this procedure, a metal post was implanted laterally to the window using dental acrylic to restrict head movement during behavior and

imaging. HABITUATION AND WATER DEPRIVATION Mice were handled and accustomed to being head-restrained on the training setup for 10–15 min over 4–5 days. Water deprivation started 3–5 days before

the first training session and was discontinued at the end of the training. Weight was monitored daily during this period and the amount of water given was adjusted to prevent the mice from losing

more than 15% of their original weight. Altogether, mice received a minimum of 1 ml of water per day corresponding to the amount they drank during the training as rewards plus the amount

that the experimenter provided outside of the training sessions. TEXTURE DISCRIMINATION TASK Mice were trained to discriminate between two commercial-grade sandpapers (P120 and P280) in a

Go/No-go paradigm as described previously5. The control of the devices and the recording of behavioral parameters were performed using a data acquisition interface (PCI 6503, National

Instruments) and custom-written LabWindows/CVI software (National Instruments). Licks were detected electrically. Mice remained on a metallic plate that maintained an electrical potential

difference with the licking spout. The electrical circuit was closed when mice touched the spout with their tongue, producing a 1.2-µA current that was detected by the acquisition interface.

Whisking activity was measured with an optical barrier that detected the changes in intensity when whiskers swept through. To achieve this, an 850-nm LED beam was used as light source

(HIR204C, Everlight Electronics) and an 860-nm phototransistor (PT 202C, Everlight Electronics) was used to detect intensity variations through a 1-mm hole placed 60 mm away from the light

source, at a sampling frequency of 10 kHz. Whisking activity was quantified as the frequency at which individual whiskers crossed the light beam placed ~1 mm in front of and centered on the

presented texture. The licking and whisking rates were calculated as the average number of events over a sliding window of 100 ms and normalized per second. Sandpapers were attached to a

four-arm wheel (2 × 2 of the same sandpapers) mounted on a stepper motor (T-NM17A04, Zaber) and a motorized linear stage (T-LSM100A, Zaber) to move textures in and out of reach of the

whiskers. At the start of each trial, the wheel spun for a random amount of time while in the rear position of the linear stage (approximately between 0.5 and 1 s) and stopped between two

texture positions. To present a texture, the linear stage first moved to the front position and then the stepper motor rapidly slid the sandpaper into the whiskers’ reach at ~15 mm from the

snout with an angle of 70° relative to the rostro-caudal axis. In the first phase of the training, the coarser P120 sandpaper was the rewarded texture (i.e. the target stimulus for which the

mouse was incited to lick the spout in order to receive a water reward) and the P280 sandpaper was the non-rewarded texture (i.e. the non-target stimulus for which the mouse was incited to

refrain from licking the spout). Initially, mice were trained to trigger a 4–6-µl sucrose water reward (100 mg/ml) by licking the spout during the presentation of the P120 texture

(rewarded). Then, they were gradually familiarized with the P280 texture presentation (non-rewarded), from 0 to 30% of the trials, within two sessions (one session per day, 150–300 trials

per session). Imaging started when P120 and P280 textures were pseudo-randomly presented with 50% probability for each trial type with a maximum of four consecutive presentations of the same

stimulus. A trial consisted of a 1-s pre-stimulus period followed by a 3-kHz auditory cue for 200 ms, a delay period of 500 ms after which the texture reached the whiskers within 150 ms and

remained there for 2 s before being retracted. Licking during the P120 texture presentation triggered a water reward at the end of the 2-s presentation, and the corresponding trial was

scored as a ‘hit’. Licking during the P280 texture presentation triggered a 500-ms white noise sound exposure at the end of the 2-s presentation plus a 5-s time-out period, and the trial was

scored as a ‘false alarm’ (FA). In the absence of a lick during stimulus presentation, trials were scored as a ‘miss’ or a ‘correct rejection’ (CR) for P120 and P280 stimuli, respectively.
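The four trial outcomes can be summarized in a few lines of code. The following is a minimal illustrative sketch, not the authors' LabWindows/CVI implementation; the function name `score_trial` and its arguments are hypothetical, and the rewarded texture is passed as a parameter so that the same logic applies after reward contingencies are inverted.

```python
from collections import Counter


def score_trial(texture: str, licked: bool, rewarded_texture: str = "P120") -> str:
    """Score one Go/No-go trial as 'hit', 'miss', 'FA' or 'CR'.

    texture: texture presented on this trial ("P120" or "P280").
    licked: whether a lick occurred during the texture presentation.
    rewarded_texture: hypothetical parameter naming the current target texture.
    """
    if texture == rewarded_texture:
        return "hit" if licked else "miss"
    return "FA" if licked else "CR"


# Illustrative (made-up) trials: (texture, licked)
trials = [("P120", True), ("P120", False), ("P280", True), ("P280", False)]
outcomes = [score_trial(texture, licked) for texture, licked in trials]
counts = Counter(outcomes)  # number of trials per outcome
```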

To discourage compulsive licking during training, in addition to the aforementioned rules, mice had to show a 2-fold increase in licking rate during stimulus presentation as

compared with the pre-stimulus baseline period to be rewarded on the P120 texture presentation. Around 250–400 trials per session were performed (1 session per day) at a rate of ~6

trials/min. The overall performance of the animal was calculated as the percentage of correct trials (hits + CRs) over an entire session or over a sliding window of 200 trials. The hit and

FA rates were calculated as _N_hit/(_N_hit+_N_miss) and _N_FA/(_N_FA+_N_CR), respectively, where _N_ is the number of trials of the given type counted over an entire session or over a sliding window of 200 trials. Mice

were considered experts when the average performance per session reached a level of 70% correct trials (the expert criterion) over two consecutive sessions. In the second phase of training

(i.e. reversal learning), reward contingencies were inverted (i.e. the P280 texture was rewarded whereas the P120 texture was not) and mice were trained until they reached the same expert

criterion again in two consecutive sessions. 2PLSM We used a custom-built two-photon laser scanning microscope mounted onto a modular in vivo multiphoton microscopy system

(https://www.janelia.org/open-science/mimms-10-2016) equipped with an 8-kHz resonant scanner and a ×16 0.8NA objective (Nikon, CFI75), and controlled with ScanImage 2016b (ref. 44)

(http://www.scanimage.org). Fluorophores were excited using a Ti:Sapphire laser (Chameleon Ultra, Coherent) tuned to _λ_ = 980 nm, with the beam slightly underfilling the back aperture of the

objective to extend the depth of field to 5 µm. Fluorescent signals were collected with GaAsP photomultiplier tubes (10770PB-40, Hamamatsu) separating mRuby2 and GCaMP6s signals with a

dichroic mirror (565dcxr, Chroma) and emission filters (ET620/60m and ET525/50m, respectively, Chroma). Fast volumetric imaging was performed at 11.5 Hz using a piezo z-scanner (P-725

PIFOC, Physik Instrumente) for moving the objective over the _z_-axis. Each acquisition volume consisted of 5 contiguous planes (with 5-µm steps between planes) of 400 × 400 µm (512 × 256

pixels), allowing post-hoc correction of z-motion, which may arise from licking-induced brain movement21. IMAGE PROCESSING Images were processed using custom-written MATLAB scripts

and ImageJ (http://rsbweb.nih.gov/ij/). Lateral and axial motion corrections were performed using the mRuby2 signal as a reference. First, rigid lateral movement vectors were calculated

based on individual trial movies from the average z-projection of the 20-µm imaged volumes using the NoRMCorre MATLAB toolbox45. Residual bidirectional scanning artifact vectors were

calculated using a highest-pixel-line signal correlation between the two scanning directions on the entire frame. Inter-trial registration was calculated using a custom-written

cross-correlation algorithm based on the rigid image stack registration plugin in ImageJ. All calculated lateral motion corrections were applied on both the mRuby2 and GCaMP6s signals.

Second, axial motion correction was performed using cross-correlation on linearly interpolated volumes (with a factor 3). The image planes with the highest correlation to a reference image,

defined as the center image plane of the first volume, were selected. For an unbiased extraction of the GCaMP6s fluorescence signals from individual neurons, regions of interest (ROIs) were

drawn manually for each session based on neuronal shape using the mRuby2 signal. The fluorescence time-course of each neuron (_F_measured) was measured as the average of all pixel values of

the GCaMP6s signal within the ROI. Local neuropil signal (_F_neuropil) was measured for each ROI as the average of pixel values within an automatically defined ring of 15 µm width, 2 µm away

from the ROI (excluding overlap with surrounding ROIs)46. The fluorescence signal of a cell body was then estimated as \(F\left( t \right) = F_{{\mathrm{measured}}}\left( t \right) - r

\times F_{{\mathrm{neuropil}}}\left( t \right)\) with _r_ = 0.7 (ref. 47). Residual trends were removed by subtracting the 8th percentile of each trial48. Normalized calcium traces Δ_F_/_F_0 were

calculated as (_F_−_F_0)/_F_0, where _F_0 is the median of the individual mean baseline fluorescence signal of each trial over a 1-s period before the start of the stimulation. For

individual stimulation sessions (see Individual stimulation session and neuron categorization section) and spontaneous activity recordings, _F_0 is the 30th percentile of each trial trace.
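The neuropil subtraction and normalization for task sessions can be illustrated as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' MATLAB code: the function name, the array layout (trials × frames for a single ROI) and the `baseline_frames` argument (the number of frames in the 1-s pre-stimulus baseline, roughly 11 at 11.5 Hz) are hypothetical, and the per-trial 8th-percentile detrending step is deliberately left out because the text does not specify how it combines with the _F_0 estimate.

```python
import numpy as np


def delta_f_over_f(f_measured, f_neuropil, baseline_frames, r=0.7):
    """Neuropil-corrected dF/F0 for one ROI.

    f_measured, f_neuropil: arrays of shape (n_trials, n_frames).
    baseline_frames: number of frames before stimulation used as baseline
                     (assumed to sit at the start of each trial).
    r: neuropil contamination factor (0.7 in the text).
    """
    f_measured = np.asarray(f_measured, dtype=float)
    f_neuropil = np.asarray(f_neuropil, dtype=float)

    # F(t) = F_measured(t) - r * F_neuropil(t)
    f = f_measured - r * f_neuropil

    # F0: median across trials of each trial's mean baseline fluorescence
    per_trial_baseline = f[:, :baseline_frames].mean(axis=1)
    f0 = np.median(per_trial_baseline)

    return (f - f0) / f0
```

With a corrected baseline of 1.0 and a corrected response of 2.0 (arbitrary units), the function returns 0 during the baseline and 1.0 during the response, i.e. a 100% increase over _F_0.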

For display, traces were additionally smoothed with a Savitzky–Golay filter (2nd order, 500-ms span). ACTIVITY ONSET ANALYSIS Normalized calcium traces (Δ_F_/_F_0) were aligned to either

the onset of the texture presentation or to the first lick during the texture presentation for each neuron across all hit trials of an expert session. For both realignments, the onset of the

neuronal response was calculated as the time, relative to the texture or first lick onset, at which the average of the response reached half of its maximum amplitude. INDIVIDUAL STIMULATION

SESSION AND NEURON CATEGORIZATION Prior to the start of the training, nine mice were imaged in the experimental training configuration, where task-related stimuli were presented

independently of one another in a pseudo-random fashion. Data acquisition was organized in trials of 10 s, each starting with a 3-s baseline after which one of the following conditions was

presented at a random time within a 4-s window: 2-s texture, 0.2-s sound (auditory cue) or water valve opening to incite licking, and finishing with another 3 s of recording. In 20% of the

trials, no stimulation was applied. Whisking and licking events were recorded over the course of the session. To determine if neuronal activity was significantly modulated by texture or

sound stimuli, we compared, for each neuron across trials, the average normalized fluorescence over 1 s before and after the stimulus onset using a paired-sample _t_-test at a significance

threshold of 5%. To account for noise in our data due to possible stimulation-induced movement artefacts, we performed the same test using the mRuby2 signal. None of the neurons showed a

significant change in mRuby2 signal upon texture and sound stimulation. We used a random forests machine-learning algorithm to decode behavioral features (licking and whisking rates) from

the activity of single neurons. This procedure allowed us to categorize single neurons as either decoding whisking, licking, or both. Given the slow kinetics of calcium transients captured

by the GCaMP6s sensor, spiking rates were inferred from the Δ_F_/_F_0 trace and used as input to the algorithm, which allowed us to temporally match behavioral event variations (i.e. whisking

or licking rates) to neuronal activity. Firing rates at each imaging frame were inferred from normalized calcium traces (Δ_F_/_F_0) using a fast nonnegative deconvolution method

(https://github.com/jovo/oopsi)49 with variable background fluorescence estimation and a _K__d_ of 144 nM (ref. 50). In order for the algorithm to capture differences in activity levels between

neurons, all trial traces of all neurons recorded per mouse were concatenated before inferring spikes. To account for putatively preceding pre-motor and/or following sensory-related activity

in S1 relative to behavioral events, the neuronal activity traces were shifted negatively and positively in time with a maximum shift of 500 ms. Eleven time bins of inferred firing rates

(discretized in time bins of 100 ms) centered on zero time-shift were used to predict instantaneous behavioral features and composed a vector \(X_i(t) = [x_i(t - 500\,\mathrm{ms}), \ldots, x_i(t), \ldots, x_i(t + 500\,\mathrm{ms})]\), where _x__i_(_t_) represents the inferred firing rate of the _i_th

neuron at zero time-shift. Licking and whisking rates were downsampled to 11.5 Hz in order to temporally match calcium imaging data. The ranger function of the ranger R package version

0.10.1 was used to construct regression forests, with each behavioral feature as dependent variable and the binned inferred firing rates of a given neuron as predictors. For each neuron, two

regression forests were constructed, one to decode whisking and the other licking. Most arguments of the function were kept at default settings, except the following: the number of trees

was set to 128, the minimum size of terminal nodes was set to 2, the number of predictor variables randomly sampled at each node split was set to the larger of 1 and one third of the

number of predictors, and the variable importance mode was set to “impurity”. To obtain a prediction for all trials, 5-fold cross-validation was applied by training the algorithm on 80% of

the trials (i.e. training set) and evaluating it on the remaining 20% of the trials (i.e. test set). Since data acquisition was discretized by trial, for each cross-validation the training

and test set trials were concatenated for training and prediction, respectively. For each neuron and for each behavioral feature, the decoding accuracy was assessed by computing the

Pearson’s product-moment correlation coefficient between the observed and predicted behavioral event fluctuations. In order to get an estimate of the noise in the prediction levels, the same

analysis was performed using the mRuby2 signal as a control. Neurons were classified as decoding a given behavioral feature if their Pearson’s correlation coefficient computed on the

GCaMP6s signal was five standard deviations away from the mean of the Pearson’s correlation coefficients for all neurons computed on the mRuby2 signal. Neurons meeting these criteria for

both whisking and licking were classified as decoding both behavioral features. SPONTANEOUS ACTIVITY CORRELATION Spontaneous calcium transients were recorded for 10 min after mice reached

the expert level before and after texture reversal. Pairwise Pearson’s correlation coefficients were calculated on the normalized calcium traces. DISCRIMINATION AND CHOICE INDICES The

selectivity of each neuron was expressed by a Discrimination index (DI) that was calculated based on neurometric functions using a receiver-operating characteristic (ROC) analysis22,51,52.
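An ROC-based index of this kind is typically the area under the ROC curve (AUC) rescaled to the range −1 to 1. The sketch below is illustrative only and is not the authors' MATLAB code: instead of sweeping explicit thresholds, it uses the equivalent pairwise-comparison form of the AUC (the probability that a response from one trial type exceeds a response from the other, with ties counted as one half). The function names and the input format (one mean Δ_F_/_F_0 value per trial) are assumptions.

```python
import numpy as np


def roc_auc(responses_a, responses_b):
    """AUC as the probability that a value from A exceeds a value from B.

    Ties contribute 0.5, which matches the area under the empirical ROC curve.
    """
    a = np.asarray(responses_a, dtype=float)[:, None]
    b = np.asarray(responses_b, dtype=float)[None, :]
    return float(np.mean((a > b) + 0.5 * (a == b)))


def discrimination_index(resp_p120, resp_p280):
    """Rescale the AUC from [0, 1] to [-1, 1]: DI = (AUC - 0.5) * 2."""
    return 2.0 * (roc_auc(resp_p120, resp_p280) - 0.5)
```

A neuron that always responds more strongly to P120 than to P280 yields an index of 1, the opposite preference yields −1, and identical response distributions yield 0. The same function can serve for a choice index by passing lick and no-lick trials as the two groups.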

Normalized mean calcium signals (Δ_F_/_F_0) during the 2-s stimulus presentations in the P120 texture trials were compared to the P280 texture trials. ROC curves were generated by plotting,

for all threshold levels, the fraction of P120 trials against the fraction of P280 trials for which the response exceeded threshold. Threshold levels were spaced linearly from

the minimal to the maximal calcium signals. DI was computed from the area under the ROC curve (AUC) as follows: DI = (AUC−0.5) × 2. DI values vary between −1 and 1. Positive or negative

values indicate larger or smaller responses to P120 than to P280 texture presentations, respectively. Statistical significance of the measured DI value was assessed by performing a

permutation test, from which a sampling distribution was obtained by shuffling the texture labels of the trials 10,000 times. The measured DI was considered significant when it was outside

of the 2.5th–97.5th percentiles interval of the sampling distribution. For the choice index (CI), the same calculation was performed, with the difference that trials in which the animal

licked during the texture presentation were compared to trials with no lick. For building the temporal evolution of the DI and CI across reversal learning, both indices were calculated over

a sliding window of 100 trials every 5 trials. CALCIUM SIGNALS RELATIVE TO BEHAVIORAL STRATEGIES For all hit trials of an expert session, average whisking and licking rates were calculated

as the average number of events over the entire texture presentation window. For each mouse, the median value in both distributions was used to separate low and high whisking or licking rate

trials. ERROR HISTORY Error history for each neuron was calculated as the normalized difference between the average calcium signals (\(\bar R\)) in hit trials preceded by a FA trial and in hit trials preceded by another hit trial, over a sliding window of 200 trials, as follows:

$$\mathrm{Error\ history}(t) = \frac{\bar R_{\mathrm{hit(post\,FA)}}(t - 100:t + 100) - \bar R_{\mathrm{hit(post\,hit)}}(t - 100:t + 100)}{\bar R_{\mathrm{hit}}(t - 100:t + 100)}$$

where \(R_{\mathrm{hit(post\,FA)}}\) is the calcium signal in a hit trial that was preceded by a FA trial, \(R_{\mathrm{hit(post\,hit)}}\) is the calcium signal in a hit trial that was preceded by another hit trial, \(R_{\mathrm{hit}}\) is the calcium signal in any hit trial, and _t_ is the trial number relative to the trial at which behavioral performance reaches the expert criterion. To estimate the fraction of neurons with an error history above chance, all hit trials within each window of 200 trials were randomly permuted for each neuron, replacing \(R_{\mathrm{hit(post\,FA)}}\), \(R_{\mathrm{hit(post\,hit)}}\), and \(R_{\mathrm{hit}}\) in their respective trial positions. Then, an error history value was calculated based on the permuted data

set. This process was repeated 1000 times to obtain 95% confidence intervals for each observed error history value. IMMUNOHISTOCHEMISTRY Post-hoc immunohistochemistry of GABA was performed

on mRuby2/GCaMP6s-expressing neurons. In all, 100-µm-thick tangential sections were produced using a vibratome (Leica VT 1000). The sections were washed 3 × 3 min in 500 µl Tris-buffered

saline (0.1 M Tris, 150 mM NaCl) containing 0.1% Tween (TBST), then pre-treated with TBST and 0.1% Triton-X for 20 min followed by a 3 × 3 min TBST wash. They were blocked in 300 µl TBST

containing 10% normal donkey serum (ab7475, Abcam) for 1 h and incubated with mouse anti-GABA antibody (ab86186, Abcam) diluted 1:500 for 72 h at 4 °C. After another 5 × 3 min wash in TBST

they were incubated in 300 µl of donkey anti-mouse antibody coupled to Alexa Fluor 647 (A32787, Thermo Fisher Scientific) diluted 1:200 in TBST for 1 h at room temperature. Finally, they

were washed 10 × 3 min in TBST and then for 1 h in PBS before being mounted onto glass slides. We applied Fluoroshield mounting medium with DAPI (Abcam) before applying the coverslip. The

sections were imaged using a Zeiss Confocal LSM800 Airyscan. S1 INACTIVATION AND WHISKER TRIMMING To inactivate S1, the GABA (γ-aminobutyric acid) agonist muscimol was injected in a separate

set of expert mice (_N_ = 5 mice; this data set was also used as a control group for another study53). During the test session, high baseline performance (>70%) was first recorded for

100 trials before the injection was performed. Under light anesthesia (4% isoflurane at 0.4 L min−1), a small hole was drilled through the imaging window above the previously mapped C2

barrel column to provide access to a glass pipette through which 300 nl of Muscimol (Bodipy-TMR-X, 5 mM in cortex buffer with 5% DMSO, Thermo Fisher Scientific) was injected at 300 and 500

µm below the pia. Mice were left to recover for 45 min and their behavioral performance was then assessed for another 100 trials. For the whisker trimming experiment, a similar baseline

performance was first recorded for 100 trials. The whiskers on the side of the snout contralateral to the texture presentations were then trimmed, and performance was tested for 50 trials. This

ensured that trimming itself did not alter performance. Then, the whiskers that were in contact with the textures (ipsilateral to the texture presentation side) were trimmed, and the effect

on task performance was measured for another 50 trials. STATISTICS AND REPRODUCIBILITY All statistics were performed using MATLAB. For all figures, significance levels were denoted as *_P_

< 0.05, **_P_ < 0.01, ***_P_ < 0.001, and ****_P_ < 0.0001. No statistical methods were used to estimate sample sizes. All comparison tests were performed two-sided.

Non-parametric tests were used for sample sizes smaller than 15. For the training experiments, the fields of view across mice were of similar quality and the number of neurons recorded

ranged between 42 and 113. For immunostainings, 2–3 fields of view per mouse of similar quality containing 110–201 neurons were analyzed. REPORTING SUMMARY Further information on research

design is available in the Nature Research Reporting Summary linked to this article. DATA AVAILABILITY The data used to generate the figures are freely available at the CERN data repository

Zenodo https://zenodo.org/communities/holtmaat-lab-data/ with https://doi.org/10.5281/zenodo.3824493. CODE AVAILABILITY The principal MATLAB code that was used for data analysis is freely

available at the CERN data repository Zenodo https://zenodo.org/communities/holtmaat-lab-data/ with https://doi.org/10.5281/zenodo.3824493. REFERENCES

* Holtmaat, A. & Caroni, P. Functional and structural underpinnings of neuronal assembly formation in learning. _Nat. Neurosci._ 19, 1553–1562 (2016).
* Makino, H., Hwang, E. J., Hedrick, N. G. & Komiyama, T. Circuit mechanisms of sensorimotor learning. _Neuron_ 92, 705–721 (2016).
* Safaai, H., von Heimendahl, M., Sorando, J. M., Diamond, M. E. & Maravall, M. Coordinated population activity underlying texture discrimination in rat barrel cortex. _J. Neurosci._ 33, 5843–5855 (2013).
* Garion, L. et al. Texture coarseness responsive neurons and their mapping in layer 2-3 of the rat barrel cortex in vivo. _Elife_ 3, e03405 (2014).
* Chen, J. L., Carta, S., Soldado-Magraner, J., Schneider, B. L. & Helmchen, F. Behaviour-dependent recruitment of long-range projection neurons in somatosensory cortex. _Nature_ 499, 336–340 (2013).
* Hubel, D. H. & Wiesel, T. N. Receptive fields of single neurones in the cat’s striate cortex. _J. Physiol._ 148, 574–591 (1959).
* Stiebler, I., Neulist, R., Fichtel, I. & Ehret, G. The auditory cortex of the house mouse: left-right differences, tonotopic organization and quantitative analysis of frequency representation. _J. Comp. Physiol. A_ 181, 559–571 (1997).
* Akrami, A., Kopec, C. D., Diamond, M. E. & Brody, C. D. Posterior parietal cortex represents sensory history and mediates its effects on behaviour. _Nature_ 554, 368–372 (2018).
* Hwang, E. J., Dahlen, J. E., Mukundan, M. & Komiyama, T. History-based action selection bias in posterior parietal cortex. _Nat. Commun._ 8, 1242 (2017).
* Felleman, D. J. & Van Essen, D. C. Distributed hierarchical processing in the primate cerebral cortex. _Cereb. Cortex_ 1, 1–47 (1991).
* Pho, G. N., Goard, M. J., Woodson, J., Crawford, B. & Sur, M. Task-dependent representations of stimulus and choice in mouse parietal cortex. _Nat. Commun._ 9, 2596 (2018).
* Francis, N. A. et al. Small networks encode decision-making in primary auditory cortex. _Neuron_ 97, 885–897 e886 (2018).
* Poort, J. et al. Learning enhances sensory and multiple non-sensory representations in primary visual cortex. _Neuron_ 86, 1478–1490 (2015).
* Chen, J. L. et al. Pathway-specific reorganization of projection neurons in somatosensory cortex during learning. _Nat. Neurosci._ 18, 1101–1108 (2015).
* Yang, H., Kwon, S. E., Severson, K. S. & O’Connor, D. H. Origins of choice-related activity in mouse somatosensory cortex. _Nat. Neurosci._ 19, 127–134 (2016).
* Goltstein, P. M., Coffey, E. B., Roelfsema, P. R. & Pennartz, C. M. In vivo two-photon Ca2+ imaging reveals selective reward effects on stimulus-specific assemblies in mouse visual cortex. _J. Neurosci._ 33, 11540–11555 (2013).
* Pecka, M., Han, Y., Sader, E. & Mrsic-Flogel, T. D. Experience-dependent specialization of receptive field surround for selective coding of natural scenes. _Neuron_ 84, 457–469 (2014).
* Schoups, A., Vogels, R., Qian, N. & Orban, G. Practising orientation identification improves orientation coding in V1 neurons. _Nature_ 412, 549–553 (2001).
* Huber, D. et al. Multiple dynamic representations in the motor cortex during sensorimotor learning. _Nature_ 484, 473–478 (2012).
* Rose, T., Jaepel, J., Hubener, M. & Bonhoeffer, T. Cell-specific restoration of stimulus preference after monocular deprivation in the visual cortex. _Science_ 352, 1319–1322 (2016).
* Chen, J. L., Pfaffli, O. A., Voigt, F. F., Margolis, D. J. & Helmchen, F. Online correction of licking-induced brain motion during two-photon imaging with a tunable lens. _J. Physiol._ 591, 4689–4698 (2013).
* Takahashi, N., Oertner, T. G., Hegemann, P. & Larkum, M. E. Active cortical dendrites modulate perception. _Science_ 354, 1587–1590 (2016).
* Ko, H. et al. Functional specificity of local synaptic connections in neocortical networks. _Nature_ 473, 87–91 (2011).
* Cossell, L. et al. Functional organization of excitatory synaptic strength in primary visual cortex. _Nature_ 518, 399–403 (2015).
* Cohen, M. R. & Kohn, A. Measuring and interpreting neuronal correlations. _Nat. Neurosci._ 14, 811–819 (2011).
* Hattori, R., Danskin, B., Babic, Z., Mlynaryk, N. & Komiyama, T. Area-specificity and plasticity of history-dependent value coding during learning. _Cell_ 177, 1858–1872.e15 (2019).
* Sul, J. H., Kim, H., Huh, N., Lee, D. & Jung, M. W. Distinct roles of rodent orbitofrontal and medial prefrontal cortex in decision making. _Neuron_ 66, 449–460 (2010).
* Lacefield, C. O., Pnevmatikakis, E. A., Paninski, L. & Bruno, R. M. Reinforcement learning recruits somata and apical dendrites across layers of primary sensory cortex. _Cell Rep._ 26, 2000–2008.e2 (2019).
* Campagner, D. et al. Prediction of choice from competing mechanosensory and choice-memory cues during active tactile decision making. _J. Neurosci._ 39, 3921–3933 (2019).
* Banerjee, A. et al. Value-guided remapping of sensory circuits by lateral orbitofrontal cortex in reversal learning. _BioRxiv_ https://doi.org/10.1101/2020.03.11.982744 (2019).
* Peron, S. P., Freeman, J., Iyer, V., Guo, C. & Svoboda, K. A cellular resolution map of barrel cortex activity during tactile behavior. _Neuron_ 86, 783–799 (2015).
* Wimmer, R. D. et al. Thalamic control of sensory selection in divided attention. _Nature_ 526, 705–709 (2015).
* Roelfsema, P. R. & Holtmaat, A. Control of synaptic plasticity in deep cortical networks. _Nat. Rev. Neurosci._ 19, 166–180 (2018).
* Watabe-Uchida, M., Eshel, N. & Uchida, N. Neural circuitry of reward prediction error. _Annu. Rev. Neurosci._ 40, 373–394 (2017).
* Pi, H. J. et al. Cortical interneurons that specialize in disinhibitory control. _Nature_ 503, 521–524 (2013).
* Lee, S., Kruglikov, I., Huang, Z. J., Fishell, G. & Rudy, B. A disinhibitory circuit mediates motor integration in the somatosensory cortex. _Nat. Neurosci._ 16, 1662–1670 (2013).
* Garrett, M. et al. Experience shapes activity dynamics and stimulus coding of VIP inhibitory cells. _Elife_ 9, e50340 (2020).
* Pfeffer, C. K., Xue, M., He, M., Huang, Z. J. & Scanziani, M. Inhibition of inhibition in visual cortex: the logic of connections between molecularly distinct interneurons. _Nat. Neurosci._ 16, 1068–1076 (2013).
* Williams, L. E. & Holtmaat, A. Higher-order thalamocortical inputs gate synaptic long-term potentiation via disinhibition. _Neuron_ 101, 91–102.e4 (2019).
* Fu, Y., Kaneko, M., Tang, Y., Alvarez-Buylla, A. & Stryker, M. P. A cortical disinhibitory circuit for enhancing adult plasticity. _Elife_ 4, e05558 (2015).
* Khan, A. G. & Hofer, S. B. Contextual signals in visual cortex. _Curr. Opin. Neurobiol._ 52, 131–138 (2018).
* Pooresmaeili, A., Poort, J. & Roelfsema, P. R. Simultaneous selection by object-based attention in visual and frontal cortex. _Proc. Natl Acad. Sci. USA_ 111, 6467–6472 (2014).
* Holtmaat, A. et al. Long-term, high-resolution imaging in the mouse neocortex through a chronic cranial window. _Nat. Protoc._ 4, 1128–1144 (2009).
* Pologruto, T. A., Sabatini, B. L. & Svoboda, K. ScanImage: flexible software for operating laser scanning microscopes. _Biomed. Eng. Online_ 2, 13 (2003).
* Pnevmatikakis, E. A. & Giovannucci, A. NoRMCorre: an online algorithm for piecewise rigid motion correction of calcium imaging data. _J. Neurosci. Methods_ 291, 83–94 (2017).
* Kerlin, A. M., Andermann, M. L., Berezovskii, V. K. & Reid, R. C. Broadly tuned response properties of diverse inhibitory neuron subtypes in mouse visual cortex. _Neuron_ 67, 858–871 (2010).
* Chen, T. W. et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. _Nature_ 499, 295–300 (2013).
* Dombeck, D. A., Khabbaz, A. N., Collman, F., Adelman, T. L. & Tank, D. W. Imaging large-scale neural activity with cellular resolution in awake, mobile mice. _Neuron_ 56, 43–57 (2007).
* Vogelstein, J. T. et al. Fast nonnegative deconvolution for spike train inference from population calcium imaging. _J. Neurophysiol._ 104, 3691–3704 (2010).
* Lock, J. T., Parker, I. & Smith, I. F. A comparison of fluorescent Ca2+ indicators for imaging local Ca2+ signals in cultured cells. _Cell Calcium_ 58, 638–648 (2015).
* Stuttgen, M. C. & Schwarz, C. Psychophysical and neurometric detection performance under stimulus uncertainty. _Nat. Neurosci._ 11, 1091–1099 (2008).
* Britten, K. H., Shadlen, M. N., Newsome, W. T. & Movshon, J. A. The analysis of visual motion: a comparison of neuronal and psychophysical performance. _J. Neurosci._ 12, 4745–4765 (1992).
* Vecchia, D. et al. Temporal sharpening of sensory responses by layer V in the mouse primary somatosensory cortex. _Curr. Biol._ 30, 1589–1599.e10 (2020).

ACKNOWLEDGEMENTS We thank Ariel Gilad for advice on the behavioral paradigm; Fritjof Helmchen and Abhishek Banerjee

for discussions and suggesting the error history analysis; Jose Manuel Nunes for advice on assessment of selection criteria in the prediction model; Sebastien Pellat for technical support

and engineering; Laura Bussien and Elodie Husi for assistance with histology; Tobias Rose for making available the mRuby-GCaMP constructs; Sonja Hofer for sharing viral vectors; and Pieter

Roelfsema for useful comments on the manuscript. This project was supported by the Swiss National Science Foundation (grants 31003A-153448, 31003A_173125, CRSII3_154453, and NCCR Synapsy

51NF40-158776), and a gift from a private foundation with public interest through the International Foundation for Research in Paraplegia. AUTHOR INFORMATION AUTHORS AND AFFILIATIONS *

Department of Basic Neurosciences and the Center for Neuroscience, CMU, University of Geneva, Rue Michel Servet 1, 1211, Geneva, Switzerland: Ronan Chéreau, Tanika Bawa, Leon Fodoulian, Alan Carleton, Stéphane Pagès & Anthony Holtmaat

Lemanic Neuroscience Doctoral School, University of Geneva, Geneva, Switzerland: Tanika Bawa & Leon Fodoulian

CONTRIBUTIONS

R.C., S.P., and A.H. designed the experiments. R.C. and T.B. performed the experiments. S.P. and R.C. designed and built experimental setups and analysis software. R.C. analyzed the data and L.F. performed random forests modeling. A.C. and A.H. provided equipment and technical expertise. A.H. supervised the research. R.C. and A.H. wrote the manuscript. All of the authors edited the manuscript.

CORRESPONDING AUTHOR

Correspondence to Anthony Holtmaat.

ETHICS DECLARATIONS

COMPETING INTERESTS

The authors declare no competing interests.

ADDITIONAL INFORMATION

PEER REVIEW INFORMATION _Nature Communications_ thanks Jerry Chen and the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.

PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

RIGHTS AND PERMISSIONS

OPEN ACCESS This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Chéreau, R., Bawa, T., Fodoulian, L. _et al._ Dynamic perceptual feature selectivity in primary somatosensory cortex upon reversal learning. _Nat Commun_ 11, 3245 (2020). https://doi.org/10.1038/s41467-020-17005-x

* Received: 13 November 2019
* Accepted: 05 June 2020
* Published: 26 June 2020
* DOI: https://doi.org/10.1038/s41467-020-17005-x