Chapter 3: From elementary learning to cognitive control: a neurocomputational perspective

 

3.1  From classical conditioning to cognitive control

 

As was pointed out in Chapter 2, Pfc has been shown to be involved in tasks that use a delayed-response paradigm, in which performance depends on the ability of the animal to maintain information over a delay in order to release a response at a later point in time. The model presented here has shown how a field of Pfc neurons can store a given STM representation and, at the same time, withstand the presentation of several distractors interleaved between the to-be-stored cue and the release of the action. However, this model, like most of the models described in the literature, does not take into account how the organism learns to maintain relevant cues in STM, namely how cognitive control develops as an emergent property of a biological system. In particular, the relevant questions are:

1)      How does an animal learn the DAT (or another task in which a STM load is required) in the first place?

2)      How is the DA response calibrated in order to allow robust STM maintenance in Pfc?

3)      How does Pfc select the relevant cues and the relevant responses that are associated with the reward?

In order to address these questions, let’s consider in detail how a DAT is structured, as an example of a rather complex task that involves “cognitive control”, STM, response selection and behavioral inhibition, among others. When a monkey first has to learn a DAT, it engages in exploratory behavior which, thanks to the intervention of the experimenter, is constrained to the apparatus and cues relevant for the task. In the DAT, the monkey has to hold a lever for a given time (5 seconds in the example). At the end of this delay period, a go-signal is turned on (both lights of the panel are turned on), and the monkey should press the panel opposite to the one it selected in the preceding trial (Figure 3.1).

 

 

Figure 3.1. Structure of the delayed alternation task. Successive go-signals (simultaneous appearance of the two circles, indicated by arrows) require alternating between two responses (right and left) separated by a delay of 5 s. For the task to be performed, a representation (or “trace”) of the previous response must be preserved in order to perform the next response correctly.

This go-signal is not informative of the required response, since it is identical for the two different responses. The only way the monkey can perform the task correctly is to hold the previous response in STM and to use this information to select the appropriate panel.
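The alternation rule can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not part of the model: the function names `correct_response` and `run_trials` are hypothetical, the first trial is assumed to accept either panel, and the target is assumed to be defined relative to the previous response even after an error.

```python
from typing import List, Optional


def correct_response(previous: Optional[str]) -> Optional[str]:
    """Return the rewarded panel given the remembered previous response.

    The go-signal carries no information, so the only input is the
    previous response (None on the first trial).
    """
    if previous is None:
        return None  # assumption: either panel is accepted on the first trial
    return "left" if previous == "right" else "right"


def run_trials(responses: List[str]) -> List[bool]:
    """Score a sequence of responses under the alternation rule."""
    rewards = []
    previous = None
    for response in responses:
        target = correct_response(previous)
        rewards.append(target is None or response == target)
        previous = response  # the trace that must survive the delay
    return rewards
```

In this sketch, repeating a panel (for example responding "right" twice in a row) is the only way to lose the reward, which is why a trace of the previous response is the sole usable cue.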

From the ecological point of view, this task is rather complex and incorporates several hard problems. First of all, at the time the reward is delivered after the monkey presses the correct key, several stimuli are present in the environment, and several responses have been emitted by the animal. How does the animal learn the contingency between stimuli, its behavior, and the reward? How does the animal learn that the reward is a function of a) the previous response and b) holding the bar for a given delay?

Once the proper set of cues/responses is found to be in a causal relationship with the reward, how does the animal learn to use the cue in order to control its behavior and guide it toward obtaining the reward? This task is even harder to perform when a delay is interposed between cue, behavioral response and reward, and other stimuli intervene between the time the cue is presented and the time the reward is delivered.

As we have seen in the simulation, if Pfc activation is continuously updated by bottom-up (BU) input and buffered from distortion (STM storage) only when DAergic activation occurs, then the problem arises of how Pfc ever learns a trace of previous cues/responses. If, as was discussed in the previous chapter, Pfc is normally vulnerable to interference unless DAergic gating occurs, then Pfc could only preserve the most recent patterns of activation, namely the ones that were present at the time the unconditioned stimulus (US), or reward, occurred. But in the DAT the significant response (the action of selecting the opposite target) is produced several seconds before the reward is delivered, and many task-irrelevant stimuli/responses can potentially be interposed between the relevant cue/action and the reward. How does the animal learn what is relevant in obtaining the reward? How does the brain solve the paradox of being unable to store Pfc representations unless these are followed by a DAergic response, and at the same time learn contingencies between stimuli that were present (and have probably decayed) long before the reward was generated?

As can be seen, the DAT is rather complex, and involves both classical (cues) and operant (response) conditioning. For the purpose of developing a real-time, ecological model of how cognitive control can develop, we will take into account a simpler task.

 

 

 

 

 

3.2  The linkage between cognitive control and elementary learning

 

In order to understand more complex forms of conditioning, we will consider here experimental settings that incorporate a) a delay between a stimulus, a response and a reward, and b) the control of an otherwise prepotent response pattern by higher-order areas intended to be the neural substrate of “cognitive control”.

Figure 3.2 illustrates a typical experimental setting in which a mixture of operant and classical conditioning paradigms is used. In this hypothetical experiment, a reward is delivered whenever a response (e.g., a lever press) follows the presentation of a cue. In this simple example, the presentation of the cue (CS2) must be followed by a response (RESP1), and reward is delivered if the rat presses the lever within a given, long interval following the onset of the cue. The temporal relationship between cue and response is important, since in this protocol no reward is delivered if the response precedes the cue (RESP1 occurs before CS2).

Figure 3.2. In an ecological setting, as well as in some experimental situations, the animal is exposed to several stimuli (CSn in the figure), and typically emits some sort of response during its normal activity (exploratory behavior, etc.). When a reward is unexpectedly delivered following a given cue (CS, classical conditioning), a response (operant conditioning) or a mixture of cue and response, the animal has to solve a very difficult task in order to obtain the reward again, namely to causally link cue, behavior and reward.

Even in this apparently simple example, several problems have to be “solved” by the animal in order to learn that a given cue, followed by a given response, is the crucial contingency to be learned. How does the animal discard all other cues and responses, and finally select the ones that are in a causal relationship with the reward? How does the animal learn that the temporal order of cue and response is crucial in order to obtain the reward? Finally, how does the organism prevent the premature release of the action, which would prevent it from obtaining the reward, and yet release the response after an appropriate interval?
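The contingency of this hypothetical protocol can be written down in a few lines. The sketch below is an illustration only: the function name `reward_delivered` and the numeric values in the usage note are assumptions, since the text specifies only that the response must follow the cue within a given interval.

```python
def reward_delivered(cue_onset: float, response_time: float, window: float) -> bool:
    """Return True if the response follows the cue within `window` seconds.

    A response that precedes the cue (negative latency) is never rewarded,
    and neither is one released after the window has elapsed.
    """
    latency = response_time - cue_onset
    return 0.0 < latency <= window
```

For example, with a cue at 10 s and an assumed 5 s window, a lever press at 12.5 s is rewarded, whereas presses at 9 s (premature) or 16 s (too late) are not. The animal, of course, has no access to this rule and must infer it from the rewards alone.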

In order to further reduce the complexity of the “ecological” task shown above, we will consider a variation of trace conditioning, which is the simplest case of classical conditioning in which STM storage of a cue is required. The task used will actually be a hybrid between trace conditioning and operant conditioning, since a motor output is required of the animal. Furthermore, this response should happen in a given time window following the CS, therefore requiring an adaptively timed calibration of the motor output.

In trace conditioning, a cue is followed by a US, but only after a delay is interposed between the offset of the cue and the onset of the US. This temporal pattern requires some sort of STM, or internal trace, to develop in order to bridge the temporal gap. Trace conditioning is one of the most thoroughly studied conditioning paradigms, some of which are described in Figure 3.3. The strength of conditioning obtained in trace conditioning is usually weaker than that obtained in delay conditioning, in which CS and US partially overlap in time. We’ll see how even this apparently simple behavioral paradigm requires unexpectedly complicated neural machinery.

Figure 3.3. Conditioning paradigms: delay, trace, simultaneous, and backward conditioning.
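The four paradigms differ only in the relative timing of CS and US, which a short sketch can make explicit. The function name `classify_paradigm` is hypothetical, and the boundary case in which the US begins exactly at CS offset is treated here as delay conditioning by assumption.

```python
def classify_paradigm(cs_on: float, cs_off: float, us_on: float) -> str:
    """Classify a CS/US arrangement from its onset and offset times."""
    if us_on < cs_on:
        return "backward"      # US precedes the CS
    if us_on == cs_on:
        return "simultaneous"  # CS and US start together
    if us_on <= cs_off:
        return "delay"         # CS is still on when the US begins
    return "trace"             # empty interval between CS offset and US onset
```

Only the "trace" case leaves a gap (`us_on - cs_off`) that has to be bridged by an internal representation of the CS.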

3.3 Bridging the temporal gap between representations

 

Let’s imagine a simple system (Figure 3.4) in which we have “posterior” (input), “anterior” (control) and motor (output) fields of neurons, which are intended to mimic a sensory, a prefrontal and a motor cortical area. Let’s imagine that a fourth, subcortical area receives homeostatic signals from the simulated organism, a different source of input that is related not to the environment but to internal physiological variables. This is the US pathway of the model; it is active whenever a primary US is delivered (food, shock, etc.) and, in this first simplified version of the model, corresponds to the VTA, the main source of DA for Pfc, which receives afferents from the amygdala and hypothalamus, two areas linked to the autonomic nervous system.

 

Figure 3.4. The model. The variables described in the equations are shown in the model. The motor cortex corresponds to the output stage of the model, and an anterior cortex is interposed between the posterior (input) cortex and the output. In this simplified model, the US reaches the subcortical stage (VTA) directly from homeostatic receptors, and a signal r (DA) is delivered to Pfc through modulatory projections.

 

The field of anterior cortical neurons is the same as the one simulated in the first set of experiments, which is proposed to be the analogue of Pfc, and includes two neural species (excitatory pyramidal cells and inhibitory interneurons). As opposed to the first set of simulations, the external input is now modeled by another population of posterior units, which also includes two neural species (pyramidal cells and inhibitory interneurons). Anterior units receive projections from the dopaminergic neuron, and are equipped with self-excitatory connections in their pyramidal field. The DA unit, in turn, receives primary reward signals and projects to the anterior system. In the model, this field of neurons corresponds to the VTA. A more detailed explanation of the neurobiological foundation of the model is given later in this chapter, when the final model is described; it is omitted here in order to highlight how computational, behavioral and theoretical constraints, rather than the need to match biological data, are the main guide in the specification of the characteristics of the model (Swanson, 1982; Oades and Halliday, 1997; Floresco and Grace, 2003).

The following equations define the activation of posterior (x), anterior (y) and motor (w) pyramidal neurons, posterior (m), anterior (n) and motor (o) inhibitory interneurons, VTA units (r), and the signal function f(h) used in the recurrent portion of the activation. Pyramidal neurons of the posterior cortex (xi) project to the anterior cortex (yj) through adaptive, modifiable connections zij:

[Equations (8)–(15)]

 

 


where Ii is the bottom-up input to the cell, the remaining input terms are the self-excitatory input and the recurrent excitatory and inhibitory inputs, A, B and C are the decay rate, the excitatory saturation point and the inhibitory saturation point, respectively, and f(h) is the feedback function defined by Equation (7), where h is the argument of the function and F is a constant. In Equation (5), 0 ≤ DA ≤ 1.
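For orientation only, a generic shunting equation consistent with the parameter roles just listed (decay rate A, excitatory saturation point B, inhibitory saturation point C) can be written as follows. This is the standard textbook form and is an assumption here, not a reproduction of Equations (8)–(15); the symbols E_i and H_i are placeholders introduced for the total excitatory and total inhibitory input to cell i.

```latex
\frac{dx_i}{dt} = -A\,x_i + (B - x_i)\,E_i - (C + x_i)\,H_i ,
\qquad E_i \ge 0,\; H_i \ge 0
```

In this form the activity remains bounded between -C and B: at x_i = B the excitatory term vanishes and the derivative is non-positive, while at x_i = -C the inhibitory term vanishes and the derivative is non-negative.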

In the above equations, all terms are like the ones used in the first set of simulations, with the exception of the term REWARD, which is 1 only when the reward is delivered and 0 otherwise. The unit r has leaky-integrator dynamics (a differential equation with constant increments and variable leakage) and broadcasts r to Pfc neurons, where it substitutes for the DA term of the previous simulations. The adaptive connections from the posterior to the anterior system adopt the outstar learning rule (Grossberg, 1982), which is basically a variant of Hebbian learning with a self-normalizing decay term. The outstar learning rule has been demonstrated to keep synaptic weights bounded and to converge to a solution in which the pattern of synaptic weights tracks the post-synaptic activation (Grossberg, 1982).
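The two properties attributed to the outstar rule (bounded weights that track the post-synaptic activation) can be checked numerically. The sketch below assumes the common form dz/dt = rate · x_pre · (y_post − z), integrated with a simple Euler step; the function names, the learning rate and the step size are illustrative assumptions, not the values used in the simulations.

```python
def outstar_step(z: float, x_pre: float, y_post: float,
                 rate: float, dt: float) -> float:
    """One Euler step of dz/dt = rate * x_pre * (y_post - z)."""
    return z + dt * rate * x_pre * (y_post - z)


def simulate_outstar(x_pre: float, y_post: float, z0: float = 0.0,
                     rate: float = 1.0, dt: float = 0.01,
                     steps: int = 2000) -> float:
    """Integrate the weight with constant pre- and post-synaptic activity."""
    z = z0
    for _ in range(steps):
        z = outstar_step(z, x_pre, y_post, rate, dt)
    return z
```

With the presynaptic cell active, the weight converges to the post-synaptic activity (for instance toward 0.7 when `y_post = 0.7`); with the presynaptic cell silent (`x_pre = 0`), learning is gated off and the weight keeps its initial value.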

Figure 3.5. The system depicted in Figure 3.4 cannot learn a trace-conditioning paradigm. Note that the trace zij between posterior and anterior systems can be reinforced up to a certain extent, but no STM maintenance will survive, since the Pfc activity will have already decayed by the time the DA activation that would allow Pfc reverberation is delivered.

 

The system described by these equations is not able to learn a task in which a sufficiently long delay is interposed between CS and US, as in trace conditioning. In fact, no sensory, motor or prefrontal trace will be available to be paired with the US activation, which in the model enters the system through the VTA alone. This feature of the model is illustrated in Figure 3.5. As can be seen, long delays prevent a temporal overlap between the sensory and Pfc traces and the US. Pfc would therefore not learn to store an activation in STM and would not activate the motor output. Furthermore, this system, as it is designed, would not account for the fact that a response, in certain experimental paradigms, must be timed to the US, or is instrumental in obtaining or avoiding the US. A possible way to overcome this problem is to use a recurrent STM stage, an approach employed by Grossberg in many models (Grossberg, 1982; Grossberg and Schmajuk, 1989; Grossberg and Merrill, 1992, 1996). This strategy has the same problem as the model presented in the first part of this thesis, namely the chicken-and-egg problem of storing STM representations that have not yet been paired with a US, while at the same time postulating that STM storage is a characteristic of learned CSs.
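The argument can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that, without DA gating, the Pfc trace decays passively as dy/dt = −decay_rate · y after CS offset; the decay rate, the delays and the pairing threshold are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def trace_at_us(y0: float, decay_rate: float, delay: float) -> float:
    """Activity left at US onset if y decays as dy/dt = -decay_rate * y."""
    return y0 * math.exp(-decay_rate * delay)


def can_pair(y0: float, decay_rate: float, delay: float,
             threshold: float = 0.05) -> bool:
    """True if enough trace survives the delay to overlap with the US."""
    return trace_at_us(y0, decay_rate, delay) >= threshold
```

With an assumed decay rate of 2 per second, a 0.5 s gap leaves about 37% of the initial activity, whereas a 5 s gap leaves less than 0.01%, so nothing remains to be associated with the DA signal triggered by the US.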

Another issue with the model of Figure 3.4 is that the system does not allow the motor output to be inhibited from releasing a prepotent, sensory-driven response at the time the sensory cue is presented. Again, this basic property can be considered one of the main features of cognitive control. In summary, the system of Figure 3.4 is insufficient for explaining the basic target phenomena of trace conditioning. The failure of this model to cope with the target behavior justifies the burden of expanding the complexity of the model by several orders of magnitude.

In the following section, a candidate model will be presented that can cope with an elementary task involving cognitive control. Before presenting the outline of the model, a fundamental issue should be investigated further, namely how the temporal gap between the CS and the US can be bridged. In particular, the candidate biological structures will be explored.

 

3.4  Synchronizing asynchronous events: the role of hippocampus in learning

 

There is consistent evidence for the involvement of the hippocampus in learning and memory in general, and in conditioning in particular (for recent reviews, see O'Reilly and Norman, 2002; Sander, Wiltgen and Fanselow, 2003; Knierim, 2003). Importantly, the hippocampus is required for trace but not for delay conditioning, which emphasizes its importance in those experimental paradigms where a STM representation of the stimulus is required (Huerta et al., 2000; McEchron et al., 1998; Anderson and Steinmetz, 1994; Solomon et al., 1986). Lesions of the hippocampus and the amygdala produce memory deficits in the delayed non-matching to sample task in non-human primates (Mishkin, 1978), a task in which cognitive control (selecting the non-prepotent response) and trace conditioning (STM storage of activation) are required.

The hippocampal pathway begins in the entorhinal cortex (EC), passes first to the dentate gyrus via the perforant path (PP), then along the mossy fibers to area CA3 (Figure 3.6). From CA3, projections run to area CA1 via the Schaffer collaterals, then to the subiculum, and finally back out to the EC, which forms the majority of connections to and from the cortex. The information that reaches the hippocampus through perirhinal cortex and EC comes from the highest integrative cortices, namely secondary and associative areas of posterior and anterior neocortex. EC neurons respond to stimuli with highly differentiated, phasic patterns. Direct stimulation of the PP is more effective in CA1 than in CA3. Repeated PP stimulation leads to an increase in the efficacy of electric stimulation, a phenomenon that Vinogradova (see Vinogradova, 2001, for a review) named “chronic potentiation” and that was later renamed LTP. CA3 neurons exert their actions locally in the hippocampus through their Schaffer collaterals, as well as by regulating the activity of diencephalic brain-stem structures, like the reticular formation (RF) and the Nucleus Accumbens (NAc), through the lateral septal nucleus relay (LS). CA1 exerts its influence on neocortex through a circuit that consists of these major stations: CA1 → subiculum → postcommissural fornix → mammillary bodies → anterior thalamic nucleus → prefrontal and cingulate cortex.

Figure 3.6. The hippocampal complex. a) The hippocampus is located deep in the temporal cortex (a mouse brain is shown in the figure). b) Detail of a), with CA3 and CA1 shown. c) The Papez circuit. d) Cortical and subcortical structures involved in the Papez circuit. e) Detail of hippocampal cell morphology and connectivity.

From these gross anatomical considerations, it appears that the information flow in the hippocampus is mainly unidirectional, although we will see how recurrency and, therefore, feedback are typical features of the hippocampus. Hippocampal lesions have been extensively studied both in neuropsychological (Squire et al., 2001; Holscher, 2003; Suzuki, 2003) and neurophysiological (see Vinogradova, 2001, for a review) settings. The deficits can be grouped in two main classes:

 

- Deficits in memory: these impairments are selective, involving the consolidation of explicit, declarative, episodic memory. Implicit, procedural and motor memory are usually preserved.

- Deficits in selective attention: unstable attention, highly vulnerable to irrelevant stimulation, but at the same time also rigid, generating difficulties in shifting from one item to the other.

 

The involvement of the hippocampus in classical conditioning has been shown in the context of the Nictitating Membrane Response (NMR) in rabbits (Mauk and Thompson, 1987). Rabbits possess a nictitating membrane (a third eyelid) which has been shown to be conditionable in a classical conditioning paradigm. In NMR classical conditioning, a neutral stimulus (CS), such as a tone, is presented just before an unconditioned stimulus (US), such as a mild puff of air to the eye. After repeated pairings of the CS and the US, the CS elicits a learned or conditioned NMR response (CR) in advance of the US. The two most commonly studied forms of eyeblink conditioning are delay and trace conditioning. In delay conditioning, the CS is presented and remains on until the US is presented, with the two stimuli overlapping and co-terminating. In trace conditioning, an “empty” (or trace) interval separates the CS and US.

The conditioned eyeblink is an example of an aversively conditioned somatic motor response. The response is a highly specific motor movement that becomes adaptively timed to the presentation of the US. Work with rabbits first demonstrated a clear distinction between delay and trace eyeblink conditioning. The acquisition and retention of delay eyeblink conditioning require an intact cerebellum and associated brainstem structures (Mauk and Thompson, 1987). Like delay conditioning, successful trace eyeblink conditioning requires an intact cerebellum (Woodruff-Pak et al., 1985). However, trace conditioning differs from delay conditioning in that it also requires the contribution of hippocampal and neocortical structures. Thus, acquisition and retention of trace conditioning are severely disrupted in rats and rabbits with hippocampal lesions (Moyer et al., 1990; Kim et al., 1995). Notably, trace conditioning in rabbits is also disrupted by Pfc lesions (Kim et al., 1995). Another distinctive feature of trace conditioning is that the importance of the hippocampus is time-limited. When hippocampal lesions are made in rabbits 1 day after acquisition, trace conditioning is abolished, whereas lesions made 30 days after acquisition have no effect (Kim et al., 1995).

The hippocampus has also been proposed to be involved in spatial navigation and sequence learning (Lisman, 1999; Nathe and Frank, 2003; Bingman et al., 2003). A strong supporter of the latter argument is Lisman (see Lisman, 1999, for a review). The work by Lisman is important because it attempts to discuss issues like spatial navigation, adaptive timing, and hetero- and auto-associative networks in the light of hippocampal anatomy and physiology.

Lisman does not specifically discuss the involvement of the hippocampus in spatial memory, thereby not limiting the breadth of the theory to a single subset of behaviors. The emphasis is on the recall of memory sequences instead of simple “spatial location”, a position that is more general than the canonical view of the hippocampus as a “position detector” (see the discussion on place cells in O'Keefe et al., 1998; O'Keefe and Burgess, 1999; Nathe and Frank, 2003; Bingman et al., 2003). The role of the hippocampus is then to store and recall “sequences”, such as spatial positions or episodes in a complex situation, and to detect a match/mismatch between these predicted sequences and the sensory data.

             

Figure 3.7. Diagram of the main intra-hippocampal wiring. From Lisman, 1999.

Figure 3.8 (figure caption from Lisman, 1999, p. 235). The Phase-Advance of Hippocampal Place Cells May Reflect the Recall of Sequences Organized by Theta (5–10 Hz) and Gamma (~40 Hz) Oscillations

(a) A rat moves through a sequence of positions (A–G), causing the firing of a place cell over this entire region. The firing of the G cell occurs with an earlier and earlier phase of theta cycles as the animal moves along this well known path, a phenomenon known as the phase-advance. Successive theta cycles are labeled 1–7. This can be explained (Jensen and Lisman, 1996a) as follows: the G cell represents position G, a region much smaller than the entire place field (A–G), but fires at positions A through F as part of a sequence recall process. This process is initiated at the beginning of each theta cycle by a cue signifying the current position of the animal. The cells encoding this position become active in the first gamma cycle and in turn activate cells encoding the next position in the sequence in the next gamma cycle. This sequence prediction can go on until the last gamma cycle of a theta cycle. As the animal is moving, the cue at each successive theta cycle is further along the path.

(b) Diagram showing how on each theta cycle, the firing of the G cell occurs earlier in the predicted sequence, i.e., at an earlier gamma cycle within a theta cycle.

(c) Illustration of how multiple memory items in a sequence can be active in different gamma cycles (which have different phase relative to a theta cycle). This is what is meant by a phase code. Note that each memory (a place or event) is represented by the subset of cells that fires in the same gamma cycle (yellow indicates firing). Phase coding may occur when the hippocampus is in recall mode (as in [a] and [b]), but also when it is in learning mode. In the latter case, it acts as a “multiplexing buffer,” as follows: a memory item is inserted into the buffer and fires in a given gamma cycle on many successive theta cycles; when the next item is presented, it is also maintained by the buffer, but in a different (later) gamma cycle. The biophysical processes required for a multiplexing buffer are as follows. First, the firing of pyramidal cells activates intrinsic conductances that produce a positive going ramp critical for the reactivation of memories on subsequent theta cycles. Second, rapid feedback inhibition onto pyramidal cells generates 40 Hz oscillations and organizes a winner-take-all process in which only the most excitable cells (encoding the next item in the sequence) fire in a given gamma cycle. Third, a recurrent autoassociational network with weights encoding each item make the cells that encode an item fire as a group, thereby imparting resistance to noise (see simulations of 1–3 in Jensen and Lisman, 1996b, 1996c).

 

The view of the hippocampus as a mere feedforward network (cerebral cortex → dentate gyrus → CA3 → CA1 → cerebral cortex) has been progressively challenged. The first models incorporated the idea that CA3 is an autoassociative network that stores memories for later retrieval (Marr, 1971). This proposal was based on the observation that CA3 presents massive recurrency and shows LTP and Hebbian learning. However, CA3 is not the only recurrent network in the hippocampus: CA1 and the dentate gyrus also show a strong degree of recurrency. In particular, granule cells (see Figure 3.7) make strong connections onto dentate mossy cells, which create a recurrent network by projecting back to the granule cells. Lisman (1999) is, in his own words, the first to propose a functional role for these two distinct recurrent networks. First of all, Lisman emphasizes that the hippocampus is only involved in episodic memories, i.e. memories that can be formed during a single episode. Lisman suggests that the hippocampus has a somewhat coarser, higher-level representation of episodes that can then recall more detailed cortical representations. Lisman also stresses that the hippocampus is especially important in learning sequences of events.

One important observation is that hippocampectomized rats do orient to completely novel stimuli, but do not orient when the familiar sequence on which they have been trained is altered (Honey et al., 1998). Secondly, place cells tend to fire during sleep in the same sequence in which they were observed firing in the awake state (Skaggs and McNaughton, 1996). A typical physiological feature of place cells is the so-called “phase advance” (O’Keefe and Recce, 1993): the hippocampus of a rat that moves through its environment is characterized by theta frequency oscillations (4-10 Hz), and the progressive approach of the rat towards the place field of a cell causes that cell to fire earlier in the theta cycle. The theta cycle is in fact divided into faster gamma cycles, in which the shift of activation is visible (Figure 3.8). This sequence is time compressed, since the theta cycle unfolds at a faster rate than the physical movement of the rat through the environment. The hippocampus, in this account, is a key “instrument” for predicting environmental events, a feature that constitutes a key evolutionary advantage.
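The logic of the phase advance under the sequence-recall account can be sketched in a few lines. The code below is a toy illustration under stated assumptions: positions are labeled A to G, the animal advances one position per theta cycle, each theta cycle is assumed to contain seven gamma cycles, and the function names are hypothetical.

```python
from typing import List, Optional

SEQUENCE = ["A", "B", "C", "D", "E", "F", "G"]  # learned path


def recall_in_theta_cycle(current: int, gammas_per_theta: int = 7) -> List[str]:
    """Items recalled in successive gamma cycles, starting from the cue.

    The cue (current position) fires in the first gamma cycle and each
    following gamma cycle holds the next item of the stored sequence.
    """
    return SEQUENCE[current:current + gammas_per_theta]


def gamma_slot(cell: str, current: int, gammas_per_theta: int = 7) -> Optional[int]:
    """Gamma cycle (0 = earliest phase) in which `cell` fires, or None."""
    recalled = recall_in_theta_cycle(current, gammas_per_theta)
    return recalled.index(cell) if cell in recalled else None


# Gamma slot of the G cell as the animal moves from A to G,
# one position per theta cycle.
slots = [gamma_slot("G", position) for position in range(len(SEQUENCE))]
```

In this sketch the G cell fires in the last gamma cycle when the animal is at A and in the first gamma cycle when the animal reaches G, so its firing phase advances steadily along the path.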

How can CA3 store sequences? Lisman (1999) proposes that this property depends on NMDA receptors present at the recurrent synapses of CA3. These channels are implicated in LTP, and are the biophysical substrate of the Hebbian learning observed in CA3. An important observation is that NMDA channel activation in CA1 and CA3 leads to LTP even when the post-synaptic activity lags by as much as 100 ms. This observation is interesting and puzzling at the same time: if a given event A is not followed by an event B within a 100 ms gap, Hebbian learning is virtually impossible. Lisman addresses this point by commenting that “The mechanism described in the previous paragraph could lead to the encoding of memory sequences in which sequential events have a temporal separation of < 100 ms, but what about the more common situation in which the temporal separation is much larger? The encoding of such sequences may depend on a short term memory buffer that can extend the period of active firing for many seconds. Because hippocampal neurons tend to fire for many seconds after a brief stimulus (Vinogradova, 1984; Hampson et al., 1993; Colombo and Gross, 1994), the hippocampus must either itself be a buffer or be driven by a network that has buffering ability. Such persistent firing allows a single brief presentation to be synaptically encoded by an LTP-type process that requires repetitive firing to produce synaptic modification.” (Lisman, 1999, p. 235)

Figure 3.9 (figure caption from Lisman, 1999, p. 236). Reciprocally Interacting Heteroassociative and Autoassociative Networks Produce More Accurate Sequence Recall than a Single Heteroassociative Network (a) In the simplest heteroassociative network, the cells that encode one memory are selectively connected to the cells that encode the next memory in a sequence. With each successive step in the sequence recall process, the memory becomes more degraded, as indicated by the number of primes. A single network can accurately recall sequences if there is a high degree of correlation between successive memories, but this will not work in the general case. (b) An autoassociative network that stores the associations that constitute each memory item is capable of producing the correct version of any item (e.g., B) when presented with a degraded version (e.g., B′). (c) Accurate sequence prediction through the reciprocal interactions of two networks. One network is heteroassociative. When the next item in the sequence is produced, it is sent to the autoassociative network, which is able to correct it. This corrected version is then sent back to the heteroassociative network, where it serves as a basis for the next step in the predictive process. Not enough information is available for a detailed simulation of how this could be carried out by CA3 and dentate networks, but the following is an example of how some of the key problems might be dealt with. A cycle begins when memory A cells of CA3 excite memory B cells of CA3 through recurrent connections, causing single spikes in these cells and pattern B. The spikes are transmitted to the dentate network, where the correct granule cells for the item B are excited (because of direct input from CA3 or indirect input through mossy cells). These “correct” granule cells then fire the “correct” CA3 cells. This causes a burst and initiates the next cycle. If a CA3 cell representing B did not fire because of recurrent input (a false negative), it will fire because of mossy fiber input. A CA3 cell that is a false positive will fire only a single spike (since it will not get mossy fiber input). If only bursts are effectively transmitted to other CA3 cells by the facilitating recurrent synapses (Lisman, 1997), false positives will have little impact. (d) Complexities of sequence storage and recall. First, psychophysical evidence indicates that sequence memory is not strictly a pairwise process between memories n and n-1. The dashed arrow indicates that connections between memories n-2 and n may also contribute (see Jensen and Lisman, 1996c for how a multiplexing buffer makes this possible). Second, studies of human memory (Howard and Kahana, 1998) and nerve network simulations (Levy, 1996) suggest that sequence items can be autoassociated with a preexisting sequence that can be thought of as a sequence of time steps (t1, t2, etc.). Heteroassociation may therefore not be obligatory for sequence learning.

Lisman observes that phase advance also occurs in the dentate gyrus, an area that receives feedback connections from CA3. He proposes the following functional role for the two coupled recurrent networks (Figures 3.9 and 3.10). A heteroassociative recurrent network is vulnerable to noise in its predictions: a small perturbation at one step of the sequence can lead to progressively deteriorating recall. Lisman proposes that the dentate is an autoassociative recurrent network that, given a specific (feedback) input from CA3, reconstructs an undegraded pattern from the one generated by the CA3 “hypothesis” and broadcasts it back to CA3. How this fine mechanism could be implemented in CA3 and dentate is, to me, not clear.
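The division of labor between the two networks can be illustrated with a toy example. The following is a minimal sketch and not Lisman's model: memories are assumed to be sparse blocks of binary cells, the heteroassociative step is assumed to be a point-to-point projection from one block to the next, noise is replaced by a deterministic loss of one cell per step, and the block size, sequence length, and completion threshold are all hypothetical values chosen for readability.

```python
# Toy sketch (not Lisman's model): sparse binary memories stored in blocks of cells.
B = 8        # cells per memory item (hypothetical)
ITEMS = 5    # length of the stored sequence (hypothetical)


def block(k):
    """Cells that encode memory item k."""
    return set(range(k * B, (k + 1) * B))


def hetero_step(active):
    """Heteroassociation: each cell of item k drives one cell of item k + 1."""
    return {i + B for i in active if i + B < ITEMS * B}


def add_noise(active):
    """Deterministic stand-in for noise: one active cell fails to fire."""
    noisy = set(active)
    if noisy:
        noisy.discard(min(noisy))
    return noisy


def auto_cleanup(active):
    """Autoassociation: an item with at least half of its cells active is
    completed; any other activity is silenced."""
    cleaned = set()
    for k in range(ITEMS):
        if len(active & block(k)) >= B // 2:
            cleaned |= block(k)
    return cleaned


def recall(with_cleanup):
    """Return the number of correct active cells at each step of recall."""
    active = block(0)
    correct = [len(active)]
    for k in range(1, ITEMS):
        active = add_noise(hetero_step(active))
        if with_cleanup:
            active = auto_cleanup(active)
        correct.append(len(active & block(k)))
    return correct


hetero_only = recall(with_cleanup=False)  # one more cell is lost at every step
with_auto = recall(with_cleanup=True)     # each item is restored before the next step
```

In this sketch the heteroassociative chain alone loses one further cell at every step, whereas the coupled system hands a complete item to the next step. The sketch says nothing about how the dentate could implement the cleanup biologically, which is the open point raised above.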

Figure 3.10. (from Lisman, 1999) The Role of Dentate Synapses in Filtering Out Context and the Role of the Perforant Path to CA3 in Transmitting Context. (a) At the medial perforant path input to dentate granule cells, contextual information that is steadily firing (horizontal red arrows) is not transmitted because of low-frequency depression. Rapid increases in firing (upward arrows) due to salient information is transmitted. Note that in the dentate, the features Jerry and Sad are represented by the same cell, whereas this is not the case for the cortical input cells. This is what is meant by a change in representation. (b) The same perforant path axons that provide input to the dentate also provide input to CA3. Even constant “contextual” items produce a subthreshold depolarizing bias in CA3. This bias enables a single powerful mossy fiber input (representing event information) to detonate a CA3 cell. In this way, an item is represented in context, even though context itself does not cause firing (as observed). (For altogether different models for encoding context, see Samsonovich and McNaughton, 1997; Minai and Best, 1998.)

 

            What, then, is the role of the perforant path (PP)? Lisman suggests that the PP provides both dentate and CA3 with contextual information, the use of which appears to be affected by hippocampal damage (hippocampectomized animals have difficulty selecting between different contexts that lead to different rewards). Lisman notes that “there are no cells in the hippocampus that fire continuously in a particular context. One explanation is that contextual input to the hippocampus is itself subthreshold. Such a subthreshold depolarization could, however, have important consequences in enabling context-appropriate cells to be triggered by other inputs” (Lisman 1999, p. 237). Lisman proposes that PP information is filtered in the dentate cells, so that only relevant information is transmitted to CA3. The same PP input also excites CA3, again below threshold, but here mossy fiber input from the dentate can trigger firing because it coincides with the PP-induced depolarization. This reasoning is somewhat problematic, since it leaves open the question of how the dentate knows which information is relevant (and thus should not be filtered out). Finally, how to relate the autoassociative-heteroassociative role of dentate and CA3 to this additional function of context representation is another important, unresolved issue.

            Projections from CA3 fan out to CA1, a fact that Lisman sees as the signature of a change of representation back to cortical standards, whereas point-to-point connections indicate a relatively constant mapping between areas that share a similar representation. Lisman proposes that CA1 and cortex use the same representation, whereas CA3 and dentate use different representations.

            In partial agreement with Vinogradova (Vinogradova, 2001), Lisman proposes that CA1 might compute a match/mismatch between cortical input (through EC) and the prediction originating from CA3. This idea, which goes back to the original proposal by Sokolov (Sokolov, 1963) of a brain that forms a representation of the world based on past events and continuously compares predictions with reality, is incorporated in many models (Grossberg, 1982; Lynch and Granger, 1992; Hasselmo and Schnell, 1994; Blum and Abbott, 1996; Levy, 1996). Cells in the mammillary body (which receives one of the output pathways from the hippocampus, namely from CA1) fire in exact registration with the expected onset of a repetitive stimulus that has been omitted (Vinogradova, 2001). Other experiments show a habituative response of the hippocampus to repetitive stimulation, followed by a dishabituative response when an unexpected stimulus is presented (Vinogradova, 2001).

In a recent paper, Nakazawa et al. (Nakazawa et al., 2002) studied the involvement of hippocampal CA3 NMDA receptors in associative memory recall. The paper is consistent with the general view that the hippocampus is involved in pattern completion. The ability to retrieve complete memories on the basis of incomplete sets of cues is a crucial function of biological memory systems. The authors suggest that pattern completion is mediated principally by the extensive recurrent connectivity of the CA3 area of the hippocampus. They tested this hypothesis by generating and analyzing a genetically engineered mouse strain in which the NMDA receptor gene is ablated selectively in the CA3 pyramidal cells. The mutant mice acquired and retrieved spatial reference memory normally in the Morris water maze, but they were impaired in retrieving this memory when presented with only a fraction of the original cues. These results are explained by the qualitative model shown in Figure 3.11. The model emphasizes how CA3, owing to its recurrent connectivity, is involved in storing and retrieving relationships between patterns. Damage to CA3 becomes evident in those situations in which only a partial version of the pattern is provided. In these situations, the performance of the system relies on its ability to retrieve the whole pattern (in the example, the set of cues) from a partial version.

Figure 3.11. From (Nakazawa et al., 2002). (A) shows the general organization of the hippocampus and the related Entorhinal cortex. Red arrows show the pathways studied by Nakazawa et al. EC, Entorhinal cortex; DG, dentate gyrus; RC, recurrent collaterals; SC, Schaffer collaterals; MF, mossy fibers; PP, perforant path. Figures B to E show the basic wiring of CA3 and CA1, illustrating the proposed mechanisms for pattern completion. In control (B) and mutant (D), full cue input (downward arrows) is provided to CA3 from DG or EC and to CA1 from EC. In control (C) and mutant (E), a fraction of the original input is provided to activate the memory trace during recall. Red dots, CA3 RC synapses or SC-CA1 synapses participating in memory trace formation; red circles, memory traces that are activated during recall; red dots without red circles, memory trace not activated during recall; red triangles and lines, CA3 pyramidal cell activity resulting from pattern completion through recurrent collateral firing; green triangles and lines, CA3 pyramidal cell response to external cue information; open triangles and black lines, silent CA3 pyramidal cells and inactive outputs; blue triangles, CA1 pyramidal cells.
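The logic of the pattern-completion account summarized in Figure 3.11 can be made concrete with a toy example. The following is a minimal sketch under stated assumptions, not the model of Nakazawa et al.: units are binary, the stored pattern, network size, and firing threshold are hypothetical, and the CA3-specific NMDA receptor ablation is represented simply as recurrent synapses that never get strengthened.

```python
N = 12                        # number of CA3-like units (hypothetical)
MEMORY = {0, 2, 3, 5, 8, 9}   # units active in the stored pattern (hypothetical)


def learn_recurrent(pattern, plastic):
    """Hebbian storage: strengthen recurrent synapses between co-active units.
    With plastic=False (stand-in for the mutant) no synapse is strengthened."""
    w = [[0] * N for _ in range(N)]
    if plastic:
        for i in pattern:
            for j in pattern:
                if i != j:
                    w[i][j] = 1
    return w


def recall(w, cue, threshold=2, sweeps=3):
    """A unit fires if it is driven by the external cue or if it receives
    at least `threshold` active recurrent inputs."""
    active = set(cue)
    for _ in range(sweeps):
        recurrent = {
            i for i in range(N)
            if sum(w[i][j] for j in active) >= threshold
        }
        active = set(cue) | recurrent
    return active


control = learn_recurrent(MEMORY, plastic=True)
mutant = learn_recurrent(MEMORY, plastic=False)
partial_cue = {0, 2, 3}       # half of the original cues

full_control = recall(control, MEMORY)
full_mutant = recall(mutant, MEMORY)
partial_control = recall(control, partial_cue)
partial_mutant = recall(mutant, partial_cue)
```

With the full cue both networks reproduce the stored pattern, because the external input alone is sufficient. With the partial cue only the network with strengthened recurrent synapses recovers the missing units, which mirrors the dissociation reported for the mutant mice.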

In summary, there is sufficient evidence that the hippocampus is involved in learning and memory in general, and in conditioning in particular. Furthermore, the tasks in which a temporal gap is introduced are the ones most affected by hippocampal impairment. The following section reviews the models of the hippocampus that have incorporated the notion of a trace between stimuli, which bridges the gap between temporally disjoint representations.

 

3.5  Models of timing in hippocampus

           

The work of Nakazawa et al. (2002) is a good exemplar of the “stream” of papers proposing some form of relationship between a recurrent network, memory storage, pattern completion, hippocampal architecture, and deficits following hippocampal alterations (Marr, 1971; Gardner-Medwin, 1976; McNaughton and Morris, 1987; Rolls, 1989; Hasselmo et al., 1995).

None of these views, however, emphasizes a salient characteristic of the behavioral constraints an animal faces in an ecological setting, namely that not all the cues that should be associated with a given reward or task co-occur in time. This is a crucial observation, directly related to the argument discussed in the context of trace conditioning and cognitive control. These models assume that all cues to be associated are available at the same time to the associative mechanism in CA3. This assumption is unjustified, and a further mechanism must be invoked to bridge the temporal gap between different cues, whose representations arise and vanish in a continuously varying environment. An autoassociative or heteroassociative recurrent network, such as the one depicted in Figure 3.12, can store a pattern through a Hebbian-like LTM mechanism, with the proviso that maximum learning is obtained when the activation patterns co-occur in time. The question then arises of how two representations that are disjoint in time can ever be correlated and mutually reinforced.
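The problem can be stated numerically. The following is a minimal sketch, assuming a plain Hebbian product rule and, as one possible bridging mechanism, an exponentially decaying trace of the CS; the learning rate, decay factor, and stimulus timings are hypothetical and serve only to show why co-occurrence matters.

```python
def hebbian_weight(cs, us, rate=0.1, trace_decay=None):
    """Accumulate dw = rate * pre * post over discrete time steps.

    If trace_decay is None, `pre` is the instantaneous CS activity.
    Otherwise `pre` is a decaying trace of the CS (an assumed mechanism)."""
    w = 0.0
    trace = 0.0
    for c, u in zip(cs, us):
        if trace_decay is None:
            pre = c
        else:
            trace = max(c, trace * trace_decay)
            pre = trace
        w += rate * pre * u
    return w


cs = [1, 1, 0, 0, 0, 0, 0, 0]          # CS present at steps 0-1
us_overlap = [1, 1, 0, 0, 0, 0, 0, 0]  # US coincides with the CS
us_delayed = [0, 0, 0, 0, 0, 1, 1, 0]  # US arrives after a gap

w_overlap = hebbian_weight(cs, us_overlap)
w_gap = hebbian_weight(cs, us_delayed)
w_gap_trace = hebbian_weight(cs, us_delayed, trace_decay=0.5)
```

With instantaneous activities the weight grows only when CS and US overlap and stays at zero across the gap. Once a trace of the CS outlasts the stimulus, the same rule produces a small but non-zero association.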

Figure 3.12. Diagram representing either an autoassociative recurrent or heteroassociative network, depending on whether the two sets of units represent the same population (autoassociative) or different populations (heteroassociative).

 

A relationship between timing, representations of stimulus traces, and the hippocampus has been proposed by several authors (Zipser, 1986; Grossberg and Schmajuk, 1989; Grossberg and Merrill, 1992, 1996). Zipser (1986) proposes that a chain of neurons in the hippocampus acts as a delay line. In Figure 3.13, a conditioned stimulus, CS(t), consists of a short block pulse derived from the onset of the CS and injected into the delay line. CS(t) then propagates slowly down the delay line, activating each successive neuron after a 50 ms delay.

Figure 3.13. Basic structure of the hippocampal delay line model of adaptive timing as proposed by Zipser (1986).
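The delay-line idea can be sketched in a few lines. This is an illustration only, assuming discrete time steps of 50 ms, one step of delay per neuron, and unit 0 as the injection point; the function names are hypothetical and do not come from Zipser (1986).

```python
STEP_MS = 50  # delay contributed by each neuron in the chain, as stated in the text


def delay_line(cs_pulse, n_units):
    """Return activity[t][d], the activity of unit d at time step t.

    Unit d carries the CS signal from d steps earlier, i.e. CS(t - d)."""
    steps = len(cs_pulse)
    return [
        [cs_pulse[t - d] if t - d >= 0 else 0 for d in range(n_units)]
        for t in range(steps)
    ]


def latency_ms(unit):
    """Time after CS onset at which a given unit becomes active."""
    return unit * STEP_MS


cs_pulse = [1, 0, 0, 0, 0, 0]            # brief pulse at CS onset
activity = delay_line(cs_pulse, n_units=4)
```

At every time step exactly one unit is active while the pulse is inside the chain, so the identity of the active unit encodes the time elapsed since CS onset.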

 

In Figure 3.13, NM(t) is the nictitating membrane response produced by the US, which acts as teaching signal. The goal of learning in the model is for the hippocampal response to match NM(t). Learning of the pathways described in Figure 3.13 is obtained by adjustment of synaptic weights through the following difference equation:

Rewriting this as a differential equation, this learning law can be seen to be of the form of outstar learning (Grossberg, 1969a).
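The behavior of an outstar rule on a delay line can be sketched as follows. This is not the update used by Zipser; it assumes the generic outstar form, in which the weight change of unit d is gated by the delayed CS and driven toward the teaching signal, z_d += rate * CS(t - d) * (NM(t) - z_d). The learning rate, number of trials, and the shape of NM(t) are hypothetical.

```python
def delayed(cs_pulse, t, d):
    """CS(t - d): the CS signal as seen by delay-line unit d at time t."""
    return cs_pulse[t - d] if t - d >= 0 else 0.0


def train_outstar(cs_pulse, nm, n_units, trials=20, rate=0.5):
    """Assumed outstar rule: learning in unit d is gated by CS(t - d) and
    moves the weight z_d toward the teaching signal NM(t)."""
    z = [0.0] * n_units
    for _ in range(trials):
        for t in range(len(nm)):
            for d in range(n_units):
                z[d] += rate * delayed(cs_pulse, t, d) * (nm[t] - z[d])
    return z


def response(z, cs_pulse, t):
    """Model output at time t: weighted sum of the delay-line activities."""
    return sum(z[d] * delayed(cs_pulse, t, d) for d in range(len(z)))


cs_pulse = [1, 0, 0, 0, 0, 0]          # brief pulse at CS onset
nm = [0.0, 0.0, 0.2, 0.8, 1.0, 0.3]    # hypothetical NM(t) time course
weights = train_outstar(cs_pulse, nm, n_units=6)
predicted = [response(weights, cs_pulse, t) for t in range(len(nm))]
```

Because each unit is active at a single delay after CS onset, its weight converges to the value of NM(t) at that delay, and the summed output reproduces the time course of the teaching signal.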

In these equations, LTM is gated by the CS at a given delay along the delay line, namely the CS representation (t-d) time steps ago. An active representation at a given delay is then multiplied with the US