Architecture between verbal and visual

Multistage robot control architecture. (A) Joint hierarchical model of action execution and action observation. The goal inference capacity is based on the motor simulation layer (ASL). Efficient action coordination between individuals in cooperative tasks requires that each individual is able to anticipate the goals and motor intentions underlying the partner's unfolding behavior. As discussed in the introduction, most mirror neurons (MNs) represent actions on an abstract level sensitive to goals and intentions. For a human-robot team this is of particular importance since it allows us to exploit the motor resonance mechanism across teammates with very different embodiment.

In the following we briefly describe the main functionalities of the layered control architecture for joint action. It is implemented as a distributed network of dynamic neural fields (DNFs) representing different reciprocally connected neural populations. In their activation patterns the pools encode action means, action goals and intentions (or their associated perceptual states), contextual cues, and shared task information.

In the joint construction task the robot has first to realize which target object the user intends to build. When observing the user reaching toward a particular piece, the automatic simulation of a reach-to-grasp action allows the robot to predict future perceptual states linked to the reaching act. In case that there is a one-to-one match, the respective representation of the target object becomes fully activated.

Otherwise the robot may ask for clarification ("Are you going to assemble object A or object B?"). Once the team has agreed on a specific target object, the alignment of goals and associated goal-directed actions between the teammates has to be controlled during joint task execution.

This circuit supports a direct and automatic imitation of the observed action. Importantly for joint action, however, the model allows also for a flexible perception—action coupling by exploiting the existence of action chains in the middle layer PF that are linked to goal representations in prefrontal cortex. The automatic activation of a particular chain during action observation e. Consistent with this model prediction, a specific class of MNs has been reported in F5 for which the effective observed and effective executed actions are logically related e.

For the robotics work we refer to the three layers of the matching system as the action observation layer (AOL), action simulation layer (ASL), and action execution layer (AEL), respectively. The integration of verbal communication in the architecture is represented by the fact that the internal simulation process in ASL may be activated not only by observed object-directed actions but also by action-related speech input.

Moreover, the set of complementary behaviors represented in AEL consists of goal-directed action sequences, like holding out an object for the user, but also contains communicative gestures. For efficient team behavior, the selection of the most adequate complementary action should take into account not only the inferred goal of the partner (represented in GL) but also the working memory about the location of relevant parts in the separate working areas of the teammates (represented in OML), and shared knowledge about the sequential execution of the assembly task (represented in STKL).

To guarantee proactive behavior of the robot, layer STKL is organized in two connected DNFs with representation of all relevant parts for the assembly work. Feedback from the vision system about the state of the construction and the observed or predicted current goal of the user will activate the population encoding the respective part in the first layer. Through synaptic links this activation pattern automatically drives the representations of one or more future components as possible goals in the second layer.

Based on this information and in anticipation of the user's future needs the robot may already prepare the transfer of a part that is currently in its workspace. In line with the reported findings in cognitive neuroscience the dynamic field architecture stresses that the perception of a co-actor's action may immediately and effortlessly guide behavior.
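The two-layer organization of STKL can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the part names come from the text, but the successor links, their weights, and the function name `predict_next_parts` are hypothetical, and plain activation vectors stand in for the dynamic fields.

```python
import numpy as np

# Parts of the toy object named in the text; the index order is arbitrary.
PARTS = ["short slat", "medium slat", "orange nut", "yellow bolt"]

# Hypothetical synaptic links from layer 1 (current goal) to layer 2
# (possible next goals). LINKS[i, j] > 0 means part j may follow part i.
LINKS = np.zeros((len(PARTS), len(PARTS)))
LINKS[0, 1] = 1.0  # short slat  -> medium slat (assumed for illustration)
LINKS[1, 2] = 1.0  # medium slat -> orange nut
LINKS[1, 3] = 1.0  # medium slat -> yellow bolt
LINKS[2, 3] = 1.0  # orange nut  -> yellow bolt


def predict_next_parts(current_part, in_robot_workspace):
    """Return the predicted next parts that the robot could hand over."""
    # Layer 1: one population is activated by the observed or inferred goal.
    layer1 = np.zeros(len(PARTS))
    layer1[PARTS.index(current_part)] = 1.0
    # Layer 2: activation spreads through the links to possible future goals.
    layer2 = layer1 @ LINKS
    candidates = [part for part, act in zip(PARTS, layer2) if act > 0]
    # Only parts located in the robot's workspace can be prepared for transfer.
    return [part for part in candidates if part in in_robot_workspace]
```

For example, `predict_next_parts("orange nut", {"yellow bolt"})` returns the yellow bolt as the part whose transfer the robot could already prepare.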

However, even in familiar joint action tasks there are situations that require some level of cognitive control to override prepotent responses. For instance, even if the user directly requests (verbally or by pointing) a valid part located in the robot's workspace, the robot should not automatically start a handing-over procedure. The user may, for instance, have overlooked that he has an identical object in his own working area.

In this case, a more efficient complementary behavior for the team performance would be to use a pointing gesture to attract the user's attention to this fact. Different populations in the action monitoring layer (AML) are sensitive to a mismatch on the goal level. In the example, input from OML (representing the part in the user's workspace) and from ASL (representing the simulated action means) activates a specific neural population in AML that is in turn directly connected to the motor representation in AEL controlling the pointing gesture.

As a result, two possible complementary actions, handing over and pointing, compete for expression in overt behavior. Normally, the pointing population has a computational advantage since the neural representations in AML evolve with a slightly faster time scale compared to the representations driving the handing over population. In the next section we explain in some more detail the mechanisms underlying decision making in DNFs.
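The competition between the two complementary actions can be illustrated with two mutually inhibiting populations that receive equal input but differ in time scale. This is a deliberately reduced sketch: the two-unit model, the step-function output, and all parameter values are assumptions chosen for illustration, and the time-scale difference is placed directly on the competing populations rather than on the AML layer that drives them.

```python
def compete(tau_point=5.0, tau_hand=10.0, steps=2000, dt=0.1):
    """Euler simulation of two populations with mutual inhibition.

    Returns the final activations (u_point, u_hand).
    """
    h = -1.0      # resting level (assumed)
    s = 2.0       # identical external input to both populations (assumed)
    w_self = 1.0  # self-excitation once a population is active (assumed)
    w_inh = 3.0   # inhibition received from the active competitor (assumed)

    u_point = h
    u_hand = h
    for _ in range(steps):
        # Only populations above threshold (u > 0) take part in interaction.
        f_point = 1.0 if u_point > 0 else 0.0
        f_hand = 1.0 if u_hand > 0 else 0.0
        du_point = (-u_point + h + s + w_self * f_point - w_inh * f_hand) / tau_point
        du_hand = (-u_hand + h + s + w_self * f_hand - w_inh * f_point) / tau_hand
        u_point += dt * du_point
        u_hand += dt * du_hand
    return u_point, u_hand
```

With the default values the faster pointing population crosses threshold first and suppresses the handing-over population; swapping the two time constants reverses the outcome.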

It is important to stress that the direct link between action monitoring and action execution avoids the problem of coordinating reactive and deliberative components, which in hybrid control architectures for HRI typically requires an intermediate layer. Dynamic neural fields (DNFs) provide a theoretical framework to endow artificial agents with cognitive capacities like memory, decision making or prediction based on sub-symbolic dynamic representations that are consistent with fundamental principles of cortical information processing.

The basic units in DNF-models are local neural populations with strong recurrent interactions that cause non-trivial dynamic behavior of the population activity. Most importantly, population activity which is initiated by time-dependent external signals may become self-sustained in the absence of any external input. Such attractor states of the population dynamics are thought to be essential for organizing goal-directed behavior in complex dynamic situations since they allow the nervous system to compensate for temporally missing sensory information or to anticipate future environmental inputs.

The DNF-architecture for joint action thus constitutes a complex dynamical system in which activation patterns of neural populations in the various layers appear and disappear continuously in time as a consequence of input from connected populations and sources external to the network. For the modeling we employed a particular form of a DNF first analyzed by Amari. In each model layer i, the activity u_i(x, t) at time t of a neuron at field location x is described by an integro-differential equation (for mathematical details see Erlhagen and Bicho).

The integral term of the field equation describes the intra-field interactions, which are chosen to be of lateral-inhibition type. Only sufficiently activated neurons contribute to the interaction. The model parameters are adjusted to guarantee that the field dynamics is bi-stable (Amari), that is, the attractor state of a self-stabilized activation pattern coexists with a stable homogeneous activation distribution that represents the absence of specific information (resting level).
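The display equations did not survive in this copy of the text. For orientation, the standard form of Amari's field equation and a typical lateral-inhibition kernel are given below. This is the textbook form, stated here as an assumption: the exact notation, sign conventions, and parameter names (\tau_i, h_i, A_i, \sigma_i, w_{\mathrm{inhib},i}) used in the original article may differ.

```latex
% Field dynamics of layer i (standard Amari form; assumed notation)
\tau_i \frac{\partial u_i(x,t)}{\partial t}
  = -u_i(x,t) + S_i(x,t)
    + \int w_i(x - x')\, f_i\bigl(u_i(x',t)\bigr)\, dx' + h_i ,
  \qquad h_i < 0

% Lateral-inhibition kernel: local excitation minus constant inhibition
w_i(x) = A_i \exp\!\left(-\frac{x^2}{2\sigma_i^2}\right) - w_{\mathrm{inhib},i}
```

Here S_i(x,t) is the summed input to the layer, h_i sets the resting level, and f_i is a threshold-like output function, so that only sufficiently activated neurons contribute to the integral.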

If the summed input, S_i(x, t), to a local population is sufficiently strong, the homogeneous state loses stability and a localized pattern in the dynamic field evolves. Weaker external signals lead to a subthreshold, input-driven activation pattern in which the contribution of the interactions is negligible. This preshaping by weak input brings populations closer to the threshold for triggering the self-sustaining interactions and thus biases the decision processes linked to behavior. Much like prior distributions in the Bayesian sense, multi-modal patterns of subthreshold activation may for instance model user preferences.
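The difference between a self-sustained peak and a subthreshold, input-driven pattern can be reproduced in a small one-dimensional simulation. The sketch below is not the authors' code: it assumes a step-function output, a Gaussian-minus-constant interaction kernel, simple Euler integration, and hand-picked parameter values, all chosen only to make the bi-stable behavior visible.

```python
import numpy as np


def simulate_field(input_amplitude, n=100, steps_on=200, steps_off=300,
                   tau=10.0, dt=1.0, h=-2.0):
    """Simulate a 1D field and return its activation after the input is removed."""
    x = np.arange(n, dtype=float)
    # Lateral-inhibition kernel: local excitation minus constant inhibition.
    distance = x[:, None] - x[None, :]
    kernel = 2.0 * np.exp(-distance**2 / (2 * 3.0**2)) - 0.5
    # Gaussian external input centered in the middle of the field.
    stimulus = input_amplitude * np.exp(-(x - n / 2)**2 / (2 * 4.0**2))

    u = np.full(n, h)  # start at the resting level
    for step in range(steps_on + steps_off):
        s = stimulus if step < steps_on else 0.0  # input is switched off later
        f = (u > 0).astype(float)                 # only active neurons interact
        u += dt / tau * (-u + h + s + kernel @ f)
    return u
```

With these values, `simulate_field(4.0)` ends with a localized peak above threshold although the input has been removed, whereas `simulate_field(1.5)` never crosses threshold and relaxes back to the resting level.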

The existence of self-stabilized activation pattern allows us to implement a working memory function. Since multiple potential goals may exist and should be represented at the same time and all relevant components for the construction have to be memorized simultaneously, the field dynamics in the respective layers STKL and ML must support multi-peak solutions.

Their existence can be ensured by an appropriate choice of the weight functions. On the other hand, the principle of lateral inhibition can be exploited to force and stabilize decisions whenever multiple hypotheses about the user's goal (ASL, GL) or adequate complementary actions (AEL) are supported by sensory or other evidence. The inhibitory interaction causes the suppression of activity below resting level in competing neural pools whenever a certain subpopulation becomes activated above threshold. The parameter k scales the total input to a certain population relative to the threshold for triggering a self-sustained pattern.

This guarantees that the inter-field couplings are weak compared to the recurrent interactions that dominate the field dynamics (for details see Erlhagen and Bicho). The scaling also ensures that missing or delayed input from one or more connected populations will lead to a subthreshold activity distribution only. The input from each connected field u_l is modeled by Gaussian functions.

Inputs from external sources speech, vision are also modeled as Gaussians for simplicity. In the following we discuss results of real-time human—robot interactions in the joint construction scenario. The snapshots of video sequences shall illustrate the processing mechanisms underlying the robot's capacity to anticipate the user's need and to deal with unexpected events. Details on the connection scheme for the neural pools in the layered architecture and numerical values for the DNF parameters and inter-field synaptic weights may be found in the Supplementary Material.

Toy object L-shape. (A) Pieces used to build the L-shape. (B) Different serial orders to assemble the L-shape. The initial communication between the teammates that leads to the alignment of their intentions and plans is included in the videos. The plan describing how and in which serial order to assemble the different components is given to the user at the beginning of the trials. For both layers, the total input (top) and the field activation (bottom) are compared for the whole duration of the joint assembly work.

Since the robot does not perform assembly steps itself, AEL only contains two types of overt motor behavior: pointing towards a specific component in the user's workspace or grasping a piece to hold it out for the user. First example: (1) goal inference when gesture and speech contain incongruent information (ASL), and (2) anticipatory action selection (AEL). (A) Video snapshots. Second example: faster goal inference and speeded decision making due to congruent information from gesture and speech. Third example: error detection and correction.

It is important to stress that the dynamic decision making process in AEL also works in more complex situations with a larger number of possible complementary action sequences linked to each component (Erlhagen and Bicho). The fact that the user simultaneously points towards a short slat creates a conflict that is represented in the bi-modal input pattern to ASL centered over A6 and A7 at time T0.

It represents a simulated pointing act towards the short slat. The decision is the result of a slight difference in input strength which favors communicative gestures over verbal statements. This bias can be seen as reflecting an interaction history with different users. Our human—robot experiments revealed that naive users are usually better in pointing than verbally referring to unfamiliar objects. The robot directly communicates the inferred goal to the user S2.

However, since the total input from connected layers is stronger for alternative A1, the robot decides to hand over the short slat S3. Subsequently, the robot interprets the user's request gesture empty hand, S4 as demanding a medium slat S5. The observed unspecific gesture activates to some extent all motor representations in ASL linked to components of the L-shape in the robot's workspace compare the input layer.

Goal inference is nevertheless possible due to the input from STKL that contains populations encoding the sequential order of task execution. At time T2 the robot observes the human reaching towards an orange nut S7. Since according to the plan the nut is followed by a yellow bolt and the bolt is in its workspace, the robot immediately starts to prepare the handing over procedure and communicates the anticipated need to the user S8—S9.

Note that the activation patterns representing the inferred current goal of the user A4 in ASL and the complementary action A3 in AEL evolve nearly simultaneously in time. An additional observation is worth mentioning. The input supporting the complementary behavior A3 starts to increase shortly after the decision to hand over the medium slat, that is, well ahead of the time when the robot predicts the nut as the user's next goal. This early preparation reflects the fact that handing over the medium slat automatically activates the representations of all possible future goals in STKL that are compatible with stored sequential orders.

Since a yellow bolt and an orange nut represent both possible next assembly steps, the combined input from STKL and OML bolt in robot's workspace explains this early onset of subthreshold motor preparation in AEL. However, this time the meaning of the verbal request and the pointing act are congruent. Consequently, the input converges on the motor representation in ASL representing the pointing A6 and a suprathreshold activity pattern quickly evolves.

This in turn activates the population encoding the complementary behavior of handing over the short slat in AEL. Note that in both cases the alternative complementary behavior representing the transfer of a medium slat A3 appears to be activated below threshold at time T0. This pre-activation is caused by the input from STKL that supports both the short and the medium slat as possible goals at the beginning of the assembly work. The robot observes a reaching towards the short slat S1 and communicates to the user that it infers the short slat as the user's goal S2.

However, this pattern does not become suprathreshold since at time T1 the user requests the yellow bolt in the robot's workspace S3. By internally simulating a pointing gesture the robot understands the request S4, which in turn causes an activity burst of the population in AEL representing the corresponding complementary behavior A3. However, this pattern also does not reach the decision level due to inhibitory input from a population in the AML.

This population integrates the conflicting information from STKL possible goals and the input from the action simulation yellow bolt. The robot informs the user about the sequence error S5 and suggests the correction by pointing towards the medium slat and speaking to the user S6. The user reacts by reaching towards the correct piece S7. The internal simulation of this action triggers the updating of the goals in STKL which allows the robot to anticipate what component the user will need next.

As shown by the suprathreshold activation pattern of population A3 in AEL, the robot immediately prepares the transfer of the yellow bolt S8—S9. Third example: initial distribution of components in the two working areas. The main aim of the present study was to experimentally test the hypothesis that shared circuits for the processing of perception, action and action-related language may lead to more efficient and natural human—robot interaction. Humans are remarkably skilled in coordinating their own behavior with the behavior of others to achieve common goals.

In known tasks, fluent action coordination and alignment of goals may occur in the absence of a full-blown human conscious awareness Hassin et al. The proposed DNF-architecture for HRI is deeply inspired by converging evidence from a large number of cognitive and neurophysiological studies suggesting an automatic but highly context-sensitive mapping from observed on to-be-executed actions as underlying mechanism Sebanz et al. Our low-level sensorimotor approach is in contrast with most HRI research that employ symbolic manipulation and high-level planning techniques e.

At first glance, the motor resonance mechanism for nonverbal communication seems to be incompatible with the classical view of language as an intentional exchange of symbolic, amodal information between sender and receiver. However, assuming that like the gestural description of another person's action also a verbal description of that action has direct access to the same sensorimotor circuits allows one to bridge the two domains.

In the robot ARoS, a verbal command like Give me the short slat first activates the representation of a corresponding motor act in ASL e. We have introduced this direct language—action link into the control architecture not only to ground the understanding of simple commands or actions in sensorimotor experience but also to allow the robot to transmit information about its cognitive skills to the user. Our approach to more natural HRI differs not only on the level of the control architecture from more traditional approaches but also on the level of the theoretical framework used.

Compared with for instance probabilistic models of cognition that have been employed in the past in similar joint construction tasks Cuijpers et al. As all activity patterns in the interconnected network of neural populations evolve continuously in time with a proper time scale, a change in the time course of population activity in any layer may cause a change in the robot's behavior. For instance, converging input from vision and speech will speed up decision processes in ASL and AEL compared to the situation when only one input signal is available.
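The speed-up from converging input can be made concrete for the simplest case. The sketch below assumes a single population governed by first-order dynamics tau * du/dt = -u + h + S, starting at the resting level h, and neglects recurrent interactions before threshold is reached; the function name and parameter values are hypothetical.

```python
import math


def time_to_threshold(total_input, h=-1.0, tau=10.0):
    """Time for u(t) = h + S * (1 - exp(-t / tau)) to reach the threshold u = 0.

    Returns math.inf when the input is too weak to reach threshold at all.
    """
    asymptote = total_input + h  # level approached for t -> infinity
    if asymptote <= 0:
        return math.inf          # input stays subthreshold
    return tau * math.log(total_input / asymptote)


# One input channel alone versus vision and speech converging (assumed values).
t_single = time_to_threshold(1.5)
t_converging = time_to_threshold(1.5 + 1.5)
```

Under these assumptions the converging input reaches threshold earlier than the single input, and an input weaker than the distance between resting level and threshold never triggers a decision, which matches the role of weak inter-field coupling described earlier.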

We are currently exploring adaptation mechanisms of model parameters that will allow the robot to adapt to the preferences of different users. A simple change in input strength from STKL to AEL will affect for instance whether the robot will wait for the user's explicit commands or will act in anticipation of the user's needs.

Learning and adaptation have not been a topic of the present study, for which all inter-field connections were hand-coded. It is important to stress, however, that the DNF-approach is highly compatible with a Hebbian perspective on how social cognition may evolve (Keysers and Perrett). In our previous work we have applied a competitive, correlation-based learning rule to explain for instance how intention-related action chains may evolve during learning and practice (Erlhagen et al.). The interaction of the field and learning dynamics causes the emergence of new grasping populations that are linked to specific perceptual outcomes.

Evidence from learning studies also supports the plausibility of the direct action-language link implemented in the control architecture. Several groups have applied and tested in robots different neural network models to explain the evolution of neural representations that serve the dual role of processing action-related linguistic phrases and controlling the execution of these actions (Billard; Cangelosi; Wermter et al.).

The results show that not only simple word-action pairs may evolve but also simple forms of syntax. A promising learning technique seems to be a covert or overt imitation of a teacher who is simultaneously providing the linguistic description. The tight coupling between learner and teacher helps to reduce the temporal uncertainty of the associations (Billard). The role of brain mechanisms that originally evolved for sensorimotor integration in the development of a human language faculty remains to a large extent unexplored (Arbib). We believe that combining concepts from dynamical systems theory and the idea of embodied communication constitutes a very promising line of research towards more natural and efficient HRI.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Front Neurorobotics. Published online May. Prepublished online Jan.

Received Dec 1; Accepted Apr. This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited. Abstract: How do humans coordinate their intentions, goals and motor behaviors when performing joint action tasks? Keywords: joint action, neural fields, goal inference, natural communication, mirror system. Introduction: New generations of robotic systems are starting to share the same workspace with humans.

Joint Construction Task: For the human-robot experiments we modified a joint construction scenario introduced in our previous work (Bicho et al.). Figure 1.

Further studies have attempted to identify brain regions related to either verbal or non-verbal working memory (WM). Using a 2-back task, Ikeda and Osaka [17] investigated memory for colours that could be coded either verbally or visually.

Analysis of the results from the condition where colours could be coded verbally revealed activity in areas associated with the phonological loop , such as inferior frontal gyrus and inferior parietal lobule.

The non-verbal coding of colours resulted in right inferior frontal gyrus activity, an area that has been associated with the visuo-spatial sketchpad of WM. These results stand in contrast to the results of the review article by Cabeza and Nyberg [ 7 ] of more than 60 visual WM studies. These authors concluded that there is little evidence for a dissociation of verbal and non-verbal WM in the human cortex.

This could be explained by the observation that most paradigms allow for the verbal encoding of visual material. Although Ikeda and Osaka [ 32 ] revealed a possible dissociation between verbal and non-verbal WM -associated brain areas, their non-verbal stimuli may also have been coded verbally by the subjects.

The words "lighter" or "darker" may have been used by the subjects for the intended non-verbal stimuli that all stemmed from one color category. Although their study revealed differing brain activity between verbal and non-verbal conditions this does not imply that this was due to their subjects' coding approaches. Mere differences in the visual appearance of the stimuli could also, at least in part, account for their results. The effect of memory delay length on cortical activation has received less attention.

In the visual WM studies reviewed above, inter-stimulus intervals ISIs in delayed discrimination paradigms varied from ms [ 29 ] to 24 s [ 30 ]. Barch et al. In their verbal WM task the retention interval was either 1 or 8 s. The task used was a variant of the Continuous Performance Test [ 34 ]. Subjects had to press a button whenever the letter X followed the letter A. Differences in task demands ranging from simple delayed-discrimination to demanding n-back tasks may also underlie the differences in brain activation. The present study attempts to account for some of the inconsistencies in visual WM studies by systematically varying both delay length and coding strategies in the discrimination of simple grating stimuli.

We used Gabor stimuli of differing orientation and instructed subjects to explicitly encode the relative orientations using a verbal code. The results from this condition were compared to those arising from a condition, where verbal coding could not be readily employed. We believe that we were able to create a paradigm in which non-verbal stimuli were virtually identical to the verbal stimuli but which could not be coded verbally as may have taken place in previous WM studies.

Our findings suggest that the coding strategy used by the subjects has a profound effect on the pattern of brain activation exhibited during the delayed discrimination of similar stimuli. These differences are most pronounced for the long delay, where verbal stimuli seem to engage predominantly left-hemispheric temporo-parietal areas, whereas non-verbal memory is associated with medial and right-hemispheric frontal brain activity. All participants gave their written informed consent.

All had normal or corrected-to-normal vision and reported no prior psychiatric or neurological impairments. In the experiment the participants had to decide whether two Gabor stimuli, which were presented sequentially and separated by a delay period, had the same or a different orientation. The inter-stimulus interval ISI between the reference and the test stimulus was either 2 or 8 seconds. Gabor pairs were constructed so that they could be coded either verbally or non-verbally. This was done so that subjects could verbally code these orientations with the words "left" and "right", as it had been suggested to them in the instruction.

Gabors were constructed in this manner so that they could not be easily coded in a verbal manner i. An example of a reference Gabor stimulus with its corresponding test stimulus for the verbal condition, in which the participants were instructed to memorize the orientation with a sub-vocal verbal rehearsal strategy e. The example depicts stimuli on a trial in which the test and the reference grating differed. An example of a stimulus pair for the nonverbal conditions, in which the instructions emphasized the use of visual encoding. Here the stimuli are taken from a trial in which the test and the reference stimulus differed.

Trials were presented in random order and subjects were instructed to maintain central fixation throughout the experiment. At the beginning of each trial, a red or green bar appeared for ms in the centre of fixation. A red bar signified that a non-verbal stimulus pair was coming up, while a green bar stood for a verbally codable stimulus pair. The bar was either short or long. Subjects were cued in this way on each trial to optimize their respective coding strategies.

This cue was followed for ms by a black fixation point in the centre of the screen. Then the reference grating appeared for ms in either the lower left or the upper right quadrant of the screen, with the fixation point still remaining in the centre of the screen. Gabors were presented in the periphery see below.

During the following ISI either or ms , only the fixation point appeared on the screen. After this the test grating appeared in the same quadrant as the reference Gabor for ms. Subjects then had to press a button with the index finger of their right hand if they thought that the test and the reference grating had the same orientation.


Another button was pressed with the middle finger of the right hand if they thought that the two orientations differed. Participants had been instructed to respond as quickly and as accurately as possible. Schematic depiction of a trial from the verbal ISI 2s condition in which the test and the reference stimulus differed. Trials started with a bar that informed subjects about delay length short bar: 2 s, long bar: 8 s and type of stimulus pair green bar: verbal, red bar: non-verbal.

After this a fixation point appeared that remained in the centre of the screen for the rest of the trial. This was followed by the reference Gabor that was shown in either the upper right or lower left quadrant of the screen here a trial with stimuli in the lower left quadrant are presented. The delay interval was presented afterwards 2 or 8 s , followed by the test Gabor that appeared in the same quadrant as the previous reference Gabor. During the following interval, the subject had to judge if the test and the reference stimulus had the same or a different orientation and press the corresponding button.

In the fMRI experiment, each subject participated in one session that consisted of a total of trials. At the end of the session, subjects were asked if and how often they had used verbal coding strategies in both the verbal and the non-verbal conditions. Stimuli were created with Matlab 6. The screen size subtended Gabor stimuli had a diameter of approximately 6. The contrast of the Gabors was tapered with a Gaussian kernel Gauss constant: 1.
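A Gabor patch of the kind described here is a sinusoidal grating whose contrast is tapered by a Gaussian envelope. The sketch below shows the construction in Python with NumPy rather than in Matlab, which the study used. The function name and every parameter value (patch size, orientation, spatial frequency, envelope width) are assumptions for illustration only, because the stimulus parameters are incomplete in this copy of the text.

```python
import numpy as np


def make_gabor(size=128, orientation_deg=10.0, cycles=6.0, sigma_frac=0.15):
    """Return a size x size Gabor patch with values in [-1, 1].

    orientation_deg rotates the direction of luminance modulation,
    cycles is the number of grating cycles across the patch, and
    sigma_frac is the envelope standard deviation as a fraction of the size.
    """
    # Pixel-center coordinates, normalized and centered on the patch.
    rows, cols = np.mgrid[0:size, 0:size]
    x = (cols - size / 2.0 + 0.5) / size
    y = (rows - size / 2.0 + 0.5) / size

    theta = np.deg2rad(orientation_deg)
    # Coordinate along the direction in which luminance is modulated.
    x_rot = x * np.cos(theta) + y * np.sin(theta)

    grating = np.cos(2 * np.pi * cycles * x_rot)               # sinusoidal carrier
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma_frac**2))    # Gaussian taper
    return grating * envelope
```

A reference and test pair for one trial could then be produced by calling `make_gabor` twice with the same or slightly different `orientation_deg` values.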

Subjects responded by pressing the buttons of a Lumitouch (Photon Control, Burnaby, Canada) optical response device with the index finger and the middle finger of their right hand. Reaction time (RT) and accuracy data were recorded and stored for offline analysis. Blood-oxygen-level-dependent imaging data were acquired with a 3-Tesla Siemens Allegra head scanner (Siemens Inc.).

The scanner acquired echo-planar-imaging EPI sequences using fast gradients. A standard one-channel head coil was used.

Time-to-repeat (TR) was ms. Trials in the experimental paradigm were synchronized with scanner pulses. In every experimental session, scans were acquired. In order to obtain a better estimate of the actual hemodynamic response function (HRF), a jitter was implemented during the acquisition of functional images. On half of the trials in the experimental paradigm the trial onset was therefore shifted by a fixed amount of time. A ms fixation period was added at the beginning and at the end of each respective trial, thus shifting events in the jittered trials by ms.

Functional data were slice-timed and realigned. The voxel size of the written normalised images was 1 mm³.

Statistical evaluation consisted of modeling the onset times of the test Gabor-stimuli as events on individual first level. These onsets were modeled separately for each of the 4 conditions if the correct response was given. Another two regressors for incorrect responses after an ISI of 2 or 8 seconds, respectively, were also included amounting to a total of 7 regressors including constant for each individual analysis.

Interesting effects were contrasted using T-statistics, generating the relevant contrast images for second level evaluation. For the random-effects group level statistics, T-value maps were calculated with appropriate contrast images.

Activation vs. Thresholds were adjusted for differential contrasts as we expected only small differences in effect sizes. To visualize the results, the activations were overlaid on a normalized rendered image from one of the subjects.

The computation of each individual's performance revealed that all participants were able to discriminate the relevant stimuli reasonably well. Mean accuracy (proportion of correct responses) for the four conditions was as follows: verbal, 2 s ISI: 0.

Accuracy was correspondingly higher for the verbal conditions.

Thus, RTs in the verbal conditions were significantly shorter than in the non-verbal conditions. Also, RTs in the long retention (8 s) conditions were significantly longer than in the short retention (2 s) conditions, in agreement with earlier psychophysical results [36].

[Figure: Mean reaction times for the non-verbal and the verbal conditions.]

[Figure: Performance (proportion of correct responses) in the verbal and the non-verbal 2 s and 8 s ISI conditions.]

These subjects also claimed to have aborted the strategy soon after the onset of the experiment because they had felt that it was not successful.

The different results for verbal versus non-verbal trials may therefore be regarded as a consequence of the participants' coding strategies. All participants claimed to have used the words "left" and "right" (of vertical) in covert speech for the verbal coding trials. The hemisphere, anatomical region, corresponding Brodmann area number, and MNI location, as well as the magnitude and size of the activated cluster, are given for each of the four conditions. The patterns of activation indicate that the brain activity resulting from the verbal and non-verbal conditions is widely spread across prefrontal, cingulate, parietal, temporal and occipital regions in both hemispheres.


Brain areas showing significant activation. The Montreal Neurological Institute (MNI) coordinates of the most active voxel are given for each cluster, along with the z-value of the magnitude of activation and the number of voxels contained within the cluster (in parentheses). For our purposes, we focus on the comparison of activation across the different experimental conditions. This lack of difference could be related to the temporal overlap of the BOLD responses to the perceptual encoding and retrieval events in the non-verbal condition.

For abbreviations see Table 1. Significantly more BOLD-dependent activity was found in left SMG, posterior cingulate, right cingulate gyrus, and the right precentral lobule for this contrast. Results from the random-effects group analysis.

This study investigated differences in cortical BOLD activity for a verbal and a non-verbal delayed-discrimination working memory (WM) paradigm for short and long retention intervals. The paradigm used here, a delayed orientation discrimination task, focused on the maintenance of visual memory representations without any manipulation process. In the verbal encoding condition, Gabor patches were oriented slightly to the left or to the right of vertical so that subjects could covertly use the terms "left" and "right" as verbal cues.

The "non-verbal stimuli" were oriented to the left only and could not be readily related to the vertical or horizontal axes. Differences in orientation angle between the reference and test gratings, however, were the same for both encoding conditions. We believe that subjects used a verbal coding strategy for the verbal stimulus pairs and refrained from doing so for the non-verbal stimulus pairs. Firstly, subjects were explicitly told in the instruction to code verbal stimuli with the words "left" and "right".

Secondly, non-verbal stimuli were constructed so that they would not lend themselves to verbal coding. Orientations were selected that were not near prominent positions of an analogue clock face, and stimuli were presented for ms only. Verbal stimuli were oriented to the left or to the right of the vertical plane, thus inevitably yielding the verbal codes "left" and "right".


Although subject debriefings are usually considered an unreliable measure of experimental control, those conducted in our experiment confirmed that subjects had used verbal coding in the verbal condition and refrained from doing so in the non-verbal condition, as intended. We believe that the stimuli used in this study represent a novel approach in the investigation of verbal and non-verbal WM. Due to the virtually identical visual appearance of the verbal and the non-verbal stimuli, differences in brain activity in this experiment can be attributed entirely to the coding strategies applied by the subjects.

Indeed, the trial-by-trial cues instructed the subjects to apply the appropriate strategies to the individual trial types. This manipulation may not have been properly achieved in previous studies. The behavioural data revealed slower reaction times and lower accuracies for the non-verbal conditions as opposed to the verbal conditions, suggesting the use of different neural mechanisms. Non-verbal WM is typically associated with the engagement of the visuospatial sketchpad component of WM, whereas verbal WM additionally engages the phonological loop component. It has frequently been reported in previous studies that verbal coding, as opposed to non-verbal WM, enhances WM performance, a finding that is reflected in this study's behavioural results.

Accuracies and reaction times differed between the verbal and non-verbal conditions Fig. It could be argued that we should have adjusted the angular difference between stimulus pairs, or the presentation time, to yield equivalent performance for the two trial types. Had we done so, however, differences in brain activity could not have been attributed to the underlying coding strategies used by the subjects but would have had to be explained in terms of differing visual stimulus properties.

Such a procedure i. We believe that, although accuracies differed between verbal and non-verbal trials, the results can be interpreted as a consequence of the subjects' coding strategies rather than of differing stimulus properties, a major problem in previous WM studies. The functional imaging results presented here reflect maintenance processes that depend on both the delay period and the coding strategy applied. Since a simple delayed-discrimination WM paradigm was used here, it does not reflect manipulation processes that are usually captured in n-back tasks and that are thus hard to disentangle from maintenance processes [7-9].

The main focus of this study, however, was on the dissociation between verbal and non-verbal WM at different delay lengths. Therefore we will not discuss these results in detail, but rather focus on the direct comparisons of verbal and non-verbal conditions. The differential analysis revealed differing activity between the verbal and non-verbal conditions with the same delay duration. In the short retention interval, significantly more activity was detected in bilateral areas close to well-known language areas, such as the supramarginal gyrus, superior temporal gyrus, and inferior frontal gyrus, with a preponderance in the left hemisphere.

No additional activity was found when contrasting the short non-verbal condition with the short verbal condition. In the long interval, however, the non-verbal condition showed more activity in right DLPFC and medial frontal areas than the verbal condition. In the verbal long-retention condition, more activity was measured in left language-associated areas, such as the supramarginal gyrus and superior temporal gyrus, as well as in medial parietal areas, than in the long non-verbal condition.