MEi:CogSci Conferences, MEi:CogSci Conference 2010, Dubrovnik

Relationship between conceptual and linguistic systems: toward a connectionist modelling approach
Peter Fugger

Last modified: 2010-06-11

Abstract


This thesis concerns embodied and symbolic approaches to the representation of meaning. Language is a structured symbolic representation system linked to content; concepts, on the other hand, are related to our experience in the world. Embodied approaches claim that language comprehension requires re-activation of representations built from our experiences with the world, that there are no amodal symbols (i.e. symbolic representations independent of any sensory modality), and that our concepts are perceptually grounded. Symbolic approaches claim that language comprehension relies on interdependencies among words and that symbols are amodal, and thus non-perceptual. What are the differences between the comprehension of abstract and concrete concepts? The starting point is the view that two different systems are engaged and interact when we comprehend language: the linguistic system (LS) and the conceptual system (CS). Opinions differ about the activation and contribution of these two systems during language comprehension, which also seems to depend on the level of abstractness of the concepts involved. These opinions are represented by different theories; three of them serve as the theoretical background for our intended connectionist model.

Theoretical background
The Linguistic and Simulation Systems (LASS) theory of language understanding (Barsalou et al., 2008) assumes a distinction between knowledge that is purely linguistic and knowledge that is grounded in the brain's modality-specific systems. The LASS approach makes a principled distinction between linguistic and conceptual representations, each residing in its own system: the LS and a simulation system (CS). Simulation means that semantic content is achieved by re-activating, usually in weaker form, the same sensory-motor systems that are used when the referent of a word or sentence is actually experienced. The LASS framework takes into account evidence related to linguistic processing, situated simulation, mixtures of and interactions between language and situated simulation, and the statistical underpinnings of both. Situated simulation, and hence conceptual structure, relates to rich aspects of perceptual and subjective experience. When language is perceived, representations of linguistic forms are more similar to the perceived words than simulations of experience are, so representations of linguistic forms are activated first. They then generate pointers to the related conceptual information. Linguistic strategies are regarded as relatively superficial: if the retrieval of linguistic forms and associated statistical information is sufficient for adequate performance, no retrieval of deeper conceptual information is necessary. According to LASS, the LS has little to do with meaning; meaning is instead represented in the simulation system.

The symbol interdependency hypothesis (SIH) (Louwerse & Jeuniaux, 2008) tries to combine the embodied and symbolic approaches: language comprehension is claimed to be both embodied and symbolic. Language is partly grounded, but symbols also derive meaning from other symbols because they are interdependent. Evidence for embodied representations is drawn from experiments showing that motor actions interact with comprehension. Eye-tracking studies show an interaction between eye gaze and text comprehension, and conversation is multi-modal: eye gaze, facial expression, gesture and language interact. Evidence for symbolic representations, on the other hand, comes from statistical computational models such as Latent Semantic Analysis (LSA), which computes semantic relatedness from the frequency of word co-occurrences in large text corpora. It rests on the view that words are semantically related or similar if they occur in the same contexts. The output of these models correlates with human performance.
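The co-occurrence idea behind LSA can be illustrated with a minimal sketch. It assumes raw word counts and a plain truncated SVD (LSA implementations usually also apply a weighting scheme, which is omitted here); the function names `lsa_word_vectors` and `cosine` and the toy corpus are hypothetical and serve only as an illustration.

```python
import numpy as np

def lsa_word_vectors(documents, k=2):
    """Return (vocabulary, k-dimensional word vectors) from a truncated SVD
    of the word-by-document count matrix."""
    vocab = sorted({word for doc in documents for word in doc.split()})
    row = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for word in doc.split():
            counts[row[word], j] += 1.0
    # Keep only the k strongest latent dimensions of the count matrix.
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    return vocab, u[:, :k] * s[:k]

def cosine(a, b):
    """Cosine similarity between two word vectors (0.0 for a zero vector)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy corpus: each string stands for one context.
corpus = [
    "dog cat pet",
    "dog cat animal",
    "dog cat fur",
    "car road drive",
    "car road wheel",
]
vocab, vectors = lsa_word_vectors(corpus, k=2)
vec = dict(zip(vocab, vectors))
print("dog~cat:", cosine(vec["dog"], vec["cat"]))
print("dog~car:", cosine(vec["dog"], vec["car"]))
```

In this toy corpus "dog" and "cat" always share contexts while "dog" and "car" never do, so the first similarity comes out far higher than the second.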
According to SIH, symbols are interdependent with objects and with each other. Comprehension therefore does not always require every detail of the embodied representations to be activated; linguistic information can be used instead. For reasons of efficiency in language processing, continuous systematic grounding does not take place: language is accessed symbolically unless embodied representations are cued by the task. SIH makes a very strong claim: embodied relations tend to be encoded in language structure, hence the structures found in the physical world are reflected in language structures.
The Theory of Lexical Concepts and Cognitive Models (LCCM) (Evans, 2006) takes a somewhat intermediate position regarding meaning representation. It holds that, on the LS side, 'lexical concepts' are linguistically encoded 'packages' of information that are conventionally associated with a particular phonological (and/or written) form. Lexical concepts also carry information encoded through their combination with other lexical concepts in an utterance. The lexical concepts provide access to a vast body of encyclopaedic knowledge in the CS. This knowledge, which each speaker can potentially activate, is termed a lexical concept's semantic potential and is a collection of structured cognitive models. The semantic representation in LCCM is thus spread over both systems: it comprises cognitive models (CS) as well as lexical concepts (LS).

Connectionist modelling
The intended connectionist model is motivated by the above-mentioned theories, which highlight the grounding of knowledge and also point to the relationship between conceptual and linguistic knowledge. The goal of the model is to gain better insight into the interaction between the LS and the CS, as well as into the overall behaviour when concrete or abstract concepts are processed. The model will be based on LASS theory at the word level and on LCCM theory at a reduced, simple sentence level.
The model shall be implemented as a neural network whose core consists of two main self-organizing maps (SOMs), the LS and the CS. The LS obtains its representational structure from an LSA-like method that captures the statistical relationships among words. In the CS, the semantic structure is developed from (preselected) modal features in the case of concrete concepts and from (human-made) associations in the case of abstract concepts. The semantic structure of the CS will be created as a SOM that builds spatial relationships between neurons depending on the similarities and frequencies of the input patterns. The two systems, CS and LS, will be mapped to each other by associative connections trained by Hebbian learning.
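The architecture described above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the class `SOM`, the functions `hebbian_links` and `comprehend`, the 3 x 3 grid size, the learning rates, the Gaussian neighbourhood and the toy input vectors are all assumptions made for the example, since the abstract does not specify them.

```python
import numpy as np

class SOM:
    """Minimal self-organizing map on a rows x cols grid (illustrative only)."""

    def __init__(self, rows, cols, dim, rng):
        self.rows, self.cols = rows, cols
        self.weights = rng.random((rows * cols, dim))
        # Grid coordinates of every unit, used by the neighbourhood function.
        self.grid = np.array(
            [(r, c) for r in range(rows) for c in range(cols)], dtype=float
        )

    def winner(self, x):
        """Index of the unit whose weight vector is closest to x."""
        return int(np.argmin(np.linalg.norm(self.weights - x, axis=1)))

    def activation(self, x, width=0.5):
        """Gaussian bump over the grid, centred on the winning unit."""
        d2 = np.sum((self.grid - self.grid[self.winner(x)]) ** 2, axis=1)
        return np.exp(-d2 / (2.0 * width ** 2))

    def train(self, data, epochs=100, lr0=0.5):
        """Standard SOM update with linearly shrinking rate and neighbourhood."""
        sigma0 = max(self.rows, self.cols) / 2.0
        for epoch in range(epochs):
            decay = 1.0 - epoch / epochs
            lr, sigma = lr0 * decay, max(sigma0 * decay, 0.5)
            for x in data:
                h = self.activation(x, sigma)
                self.weights += lr * h[:, None] * (x - self.weights)

def hebbian_links(ls, cs, word_vecs, concept_vecs, lr=0.1):
    """Strengthen LS-to-CS links between units that are active together."""
    links = np.zeros((ls.rows * ls.cols, cs.rows * cs.cols))
    for w, c in zip(word_vecs, concept_vecs):
        links += lr * np.outer(ls.activation(w), cs.activation(c))
    return links

def comprehend(ls, links, word_vec):
    """Spread LS activation over the links; return the induced CS activity."""
    return ls.activation(word_vec) @ links

# Hypothetical toy inputs: three word vectors and three modal feature vectors.
rng = np.random.default_rng(0)
words = np.eye(3)
concepts = np.array([[1, 0, 1, 0], [0, 1, 1, 0], [0, 0, 0, 1]], dtype=float)

ls, cs = SOM(3, 3, 3, rng), SOM(3, 3, 4, rng)
ls.train(words)
cs.train(concepts)
links = hebbian_links(ls, cs, words, concepts)
cs_activity = comprehend(ls, links, words[0])
print("most active CS unit:", int(np.argmax(cs_activity)))
print("CS winner of the paired concept:", cs.winner(concepts[0]))
```

Word comprehension is approximated here by propagating LS activity over the associative links; whether the most active CS unit coincides with the winner for the paired concept depends on how well the two maps separate their inputs. Transposing the link matrix would give the reverse direction, from CS to LS, as a rough analogue of word production.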
The words and concepts are selected so that taxonomic hierarchies can be built, which is one way of building abstract concepts (other abstract concepts, such as freedom or democracy, are much more difficult to represent). For example, a higher level in this taxonomy is the essential distinction between living beings (animals, plants) and non-living things or objects. The variation within these categories is large enough that sub-categories can be built downwards until a basic level is reached at which the concrete concepts are described by their modal features. Which form of semantic representation should be chosen for sentences is still under investigation, but the most likely candidates are a recurrent neural network trained to represent simple SVO (subject-verb-object) sentences and an auto-associative network that can learn compressed representations of sentence frames.
The model will allow the simulation of various linguistic processes such as word comprehension, lexical decision, or word production. It will be tested against empirical evidence on differences in the processing of concrete and abstract concepts. The model shall also take into account that abstract words and concepts are learnt later than concrete ones, as well as the differences in the development of the two systems: the CS precedes the LS (at least at the very beginning), but later on the two systems can influence each other during development.

References
Barsalou L., Santos A., Simmons W. & Wilson C. (2008) Language and simulation in conceptual processing. In: de Vega, Glenberg & Graesser (eds), Symbols and Embodiment: Debates on Meaning and Cognition. Oxford University Press, 245-283.

Evans V. (2006) Semantic representation in LCCM Theory. In: Evans V. & Pourcel S. (eds), New Directions in Cognitive Linguistics. John Benjamins.

Louwerse M. & Jeuniaux P. (2008) Language comprehension is both embodied and symbolic. In: de Vega, Glenberg & Graesser (eds), Symbols and Embodiment: Debates on Meaning and Cognition. Oxford University Press, 309-326.