Tuesday, October 31, 2006

Convergent Cognitive Evolution and Consciousness

Self-awareness is considered to be a faculty of those animals at the top of the 'cognitive hierarchy', and is almost synonymous with the presence of what one could consider intelligence (in animals anyway), not to mention its implications for consciousness. So which animals exhibit self-awareness? In this paper by Plotnik et al (PNAS, current issue), the possibility of elephants (in this case Asian elephants) displaying self-awareness was examined. They used the standard mirror test - a cross (or mark) is placed on the animal in a location that it could not see without the aid of a mirror (in this case above one eye). When a mirror is present, an animal displaying self-awareness would (eventually) realise that the thing it sees in the mirror is itself, and thus examine the foreign mark on its own body, and not on its reflection. See the paper for further details.

What was found was that a single elephant, out of the group that was tested, displayed this self-aware behaviour. Not so impressive at first blush perhaps. But when it is considered that only around half of the apes tested using this procedure display self-awareness, the results of this study carry more weight. Prior to this study, self-awareness was considered to be the domain of humans, apes, and dolphins. The behaviour of the elephant in this study was comparable to the behaviour of these other animals.

In the discussion of their results, the authors suggest that self-awareness capabilities may be indicative of a clear distinction between self and other, which would be necessary for social interaction. Hence the suggestion that convergent cognitive evolution has occurred in these species: that the 'property' of self-awareness for an agent may emerge from the need to interact socially. In other words, if these results are taken into account, social interaction is to a large extent a driving factor for cognitive evolution. This would have huge implications for the study of consciousness, in both biological and artificial systems. If self-awareness is considered to be a basic type of consciousness, then the possibility arises that the only reason consciousness emerged was the development (perhaps through other evolutionary driving factors) of social interaction networks and hierarchies.

Monday, October 30, 2006

Laughs and Giggles

Not in any way related to research of any form, but just wanted to share a website someone told me about today. I imagine most know of it already, but anyway:

Piled Higher and Deeper: An online graduate student comic strip series. It's been going for years, and it's quite funny. After a brief browse through the archives, there are two which tickle me: Anatomy of a group meeting presentation, and Beyond the Scope of Research. I'm sure there are many more...

Thursday, October 26, 2006

Book: Spiking Neuron Models

Link to an on-line book: Spiking Neuron Models: Single Neurons, Populations, Plasticity
Wulfram Gerstner and Werner M. Kistler Cambridge University Press (August 2002)

The book is an introduction to spiking neuron models aimed at graduates - in fact, the authors say that they used it as lecture material, one chapter per lecture. It would also naturally be useful as a general reference book. Although I have not yet had a chance to read it, I do fully intend to, as it seems to be written at the right level, and in a genuinely instructive manner. The introduction lays out all of the concepts necessary for understanding the rest of the book (an introduction to neurons and their models, and an overview of rate and spike codes, for example). The rest of the book is divided into three parts, which are described concisely in the Preface:

In order to understand signal transmission and signal processing in neuronal systems, we need an understanding of their basic elements, i.e., the neurons, which is the topic of Part I. New phenomena emerge when several neurons are coupled. Part II introduces network concepts, in particular pattern formation, collective excitations, and rapid signal transmission between neuronal populations. Learning concepts presented in Part III are based on spike-time dependent synaptic plasticity.

Wednesday, October 25, 2006

Links #1

A couple of links I found today which may be of interest:

- AGIRI: Artificial General Intelligence Research Institute homepage. The institute's aim is to "foster the creation of powerful and ethically positive Artificial General Intelligence". The online discussion forum may be interesting.

- MachinesLikeUs: What seems to be a relatively new website on matters concerning evolutionary thought, cognitive science, and artificial life and intelligence. On the welcome page, it lists four concepts which it aims to promote, the second of which ("Religions and their gods are human constructs, and subject to human foibles") seems to me to be unnecessary and even counterproductive in the context of investigating fundamental cognitive processes and the foundations of intelligence. Having said that, the discussion forum has some interesting topics (on consciousness for example).

- As I mentioned in my last post, The Brain 0 Project webpage contains a number of resources which explain the concepts I attempted to briefly review yesterday: that the human brain may be considered a Universal Learning Machine.

Dynamically Reconfigurable Universal Learning Neurocomputer

In this lecture by Victor Eliashberg, I was introduced to the idea of the brain as a universal learning computer. Not having been familiar with any of the material covered, I found the lecture itself quite confusing, with seemingly quite a few disparate elements 'thrown together'. After a little further reading into the topic, however, the fundamental thesis of the work became quite clear.

The underlying motivation is the assertion that in order to understand the brain, one must study it as a whole, and not decompose it into functional regions for relatively independent study (the major approach used in cognitive psychology). The reason for this decomposition in most studies on the brain is that it is practically impossible to study the human brain in its entirety when fully developed – and it is this state where it exhibits behaviours which are of interest to psychologists etc. Given this, it would then seem reasonable to study the brain in its 'starting state', i.e. the state in which learning has not yet begun.

Given that 'W' represents the world, 'B' represents the brain, and 'D' represents the body (of the human), such that a system (W, D, B) is one which we wish to describe (in other words the behaviour of the human in the world), and that B(t) is the formal representation of B at time t (such that B(0) is the brain in its starting state), then four propositions are made:

(1) It is possible to find a relatively small representation of B(0). It is speculated that this amount of information may fit on a single floppy disk (?!).
(2) Any formal representation of B(t) when t is large (on the scale of years perhaps) would be huge (possibly terabytes). This is due to the presence of the person's personal experiences.
(3) It is practically impossible to reverse engineer B(t) (when t is large) without first reverse engineering B(0).
(4) It is practically impossible to model the behaviour of the system (W, D, B) without a representation of B(t).

Based on these four propositions, the project (known as Brain 0) is to reverse engineer the starting state of the brain. Eliashberg provides a possible model: that of a universal learning computer, based on a universal Turing machine.
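
To make the contrast between a small B(0) and a large B(t) a little more concrete, here is a toy sketch of my own. It is not Eliashberg's model, and every name in it (`run`, `learn`, `inverter_table`, the state labels) is an assumption made purely for illustration. The idea is simply that a fixed, tiny interpreter can stay the same size while the table of learned rules it consults keeps growing with experience.

```python
from typing import Dict, List, Tuple

BLANK = "_"

# (state, symbol) -> (symbol to write, head move, next state).
# All names here are hypothetical and chosen for illustration only.
Rule = Tuple[str, int, str]
Table = Dict[Tuple[str, str], Rule]


def learn(table: Table, state: str, symbol: str, rule: Rule) -> None:
    """Record one experienced situation; the table is the part that grows."""
    table[(state, symbol)] = rule


def run(table: Table, tape: List[str], state: str = "start",
        max_steps: int = 1000) -> List[str]:
    """Fixed interpreter: never changes, however large the table becomes."""
    cells = list(tape)
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        if head == len(cells):
            cells.append(BLANK)  # extend the tape to the right on demand
        key = (state, cells[head])
        if key not in table:
            break  # nothing has been learned for this situation yet
        new_symbol, move, state = table[key]
        cells[head] = new_symbol
        head = max(0, head + move)  # the tape is bounded on the left
    return cells


def inverter_table() -> Table:
    """Build a small example table that flips every bit on the tape."""
    table: Table = {}
    learn(table, "start", "0", ("1", 1, "start"))
    learn(table, "start", "1", ("0", 1, "start"))
    learn(table, "start", BLANK, (BLANK, 0, "halt"))
    return table
```

With an empty table, `run` leaves the tape untouched; with the three rules from `inverter_table()` it flips each bit and halts on the first blank. In this loose analogy the interpreter plays the role of B(0), and the interpreter plus its accumulated table plays the role of B(t).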

Tuesday, October 17, 2006

A Note on the Candlestick approach to writing Project Proposals

I was talking to my supervisor today about the structure of the perfect project proposal, in our case for a grant, but it would probably apply to any kind. A fairly standard view is the funnel approach, where you start with the broad background, and then gradually home in on your precise aims, objectives, issues of interest etc. However, this misses something, we mused. What you also need, to finish it off, is to provide the applications for the work, to enable the reader/reviewer to really see why the work is necessary/important/relevant. This is the sort of stuff you can't put in the scope, as it's a bit vague or ill-defined. Hence the candlestick approach. You start with the scope of the project, narrow down to the aims, objectives, deliverables, issues of interest, etc., before finishing with a summary of the aims, and the broad implications/applications of the work as the base of the candlestick.

Monday, October 16, 2006

Book: Models of Working Memory

In this book, edited by Akira Miyake and Priti Shah, ten contemporary ‘theories’ of working memory (WM) are outlined. I say ‘theories’ because there are a great number of overlaps between the presented chapters, and so whilst each presents a view in its own right, a number are based upon the same principles. The ten chapters cover the following theories/models (with a very brief synopsis of each):

- Baddeley’s tripartite model of working memory: WM as a functionally and structurally separable memory system from long-term memory. The well known phonological loop and visuospatial sketchpad slave systems controlled by the attentional controller, the central executive. More recently, this model has been appended with the episodic buffer – invoked to help explain the growing evidence of more than just cooperation between the working and long-term memory systems.
- Nelson Cowan’s embedded processes model of working memory: inspired by Craik and Lockhart’s levels-of-processing framework, this theory states that WM is simply that region of long-term memory which is in an activated state. Furthermore, the focus of attention is a subset of this, and contains the contents of conscious awareness at any given time. Again, a central executive is used as an attention controlling mechanism, among other things.
- Engle et al’s individual differences model of working memory: very closely related to both Cowan’s and Baddeley’s models, it views WM as long-term memory plus attention. As the name suggests though, it emphasises the differences in capacity between individuals.
- Lovett et al’s working memory in ACT-R: the ACT-R cognitive architecture is essentially made up of declarative memory (networked nodes, each representing a piece of knowledge, and each having an activation level – this level representing attention), and procedural memory (symbolic, of the type condition->action). Given a current goal, working memory would consist of all those production rules (from procedural memory) and those declarative nodes activated by those rules which are relevant to the current goal. It is a computational model, with development tools freely available on the internet.
- Kieras et al’s working memory in the EPIC architecture: a symbolic computational architecture with a well developed interface with perceptual and motor processes. Though not explicitly a model of WM, as with ACT-R and SOAR, it does nevertheless operate in certain domains in a way functionally analogous to WM.
- Young et al’s working memory in SOAR: SOAR is a purely symbolic computational model of human cognition. Its working memory space contains the current goal (broken down into subgoals if necessary), and those pieces of information relevant to it/them. Production rules relevant to the goal at the top of the ‘goal stack’ all fire at once, and new rules may be created if no relevant rules are found. Processing, as with ACT-R, is thus goal directed.
- Ericsson and Delaney’s long term working memory: This theory of working memory distinguishes between the short-term working memory discussed by most other theories, and long-term working memory, which is made up of those knowledge structures which enable fast recall from nonactivated long-term memory. These structures are studied by examination of ‘memory experts’, for example people who can remember long series of numbers, which theoretically exceed the limits of standard ‘short-term’ working memory.
- Barnard’s working memory in ICS: The interacting cognitive systems model is another general cognitive architecture; however, it began life as a model of Baddeley’s phonological loop. Baddeley’s tripartite structure maps very well on top of it, although each system is described as the interaction of a number of other, more fundamental, modality specific systems.
- Schneider’s working memory in CAP2: The control and automatic processing model is a connectionist model that started life as specifically a model of working memory, but became a general cognitive architecture. In this architecture, a central executive is presented as a hierarchically structured series of neural networks.
- O’Reilly et al’s biologically based computational model of working memory: This model is based on the assumption that the effect of working memory is an emergent property of the interaction of three specialised brain systems: the prefrontal cortex, the hippocampus, and the perceptual and motor cortices. I cannot do justice to the theory here; I would like to review it in a later post. On a quick note: it forms the basis of a computational model of the prefrontal cortex, which has been successfully implemented in robotic setups (presented at COGRIC recently). I also intend to post on this subject in the near future.
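
The ACT-R and SOAR entries above both describe working memory as whatever rules and knowledge are currently relevant to a goal. As a rough illustration of that idea only, here is a minimal sketch in Python. It does not use the real ACT-R or SOAR tools or their rule syntax; the classes `Chunk` and `Production`, the function `working_memory`, the activation increment and the threshold are all assumptions of mine.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Chunk:
    """A declarative memory node with an activation level (hypothetical)."""
    name: str
    activation: float = 0.0


@dataclass
class Production:
    """A condition -> action rule; here the condition is just a goal name."""
    name: str
    goal: str
    activates: List[str]  # names of the chunks this rule brings into play


def working_memory(goal: str,
                   productions: List[Production],
                   chunks: Dict[str, Chunk],
                   threshold: float = 0.5) -> Tuple[List[str], List[str]]:
    """Return (relevant rule names, active chunk names) for the given goal."""
    relevant = [p for p in productions if p.goal == goal]
    for rule in relevant:
        for chunk_name in rule.activates:
            chunks[chunk_name].activation += 1.0  # assumed fixed boost
    active = sorted(n for n, c in chunks.items() if c.activation >= threshold)
    return [p.name for p in relevant], active


def example() -> Tuple[List[Production], Dict[str, Chunk]]:
    """Build a tiny, made-up memory with two goals."""
    chunks = {n: Chunk(n) for n in ("3+4=7", "carry-rule", "paris-capital")}
    productions = [
        Production("add-digits", "add", ["3+4=7"]),
        Production("carry", "add", ["carry-rule"]),
        Production("name-capital", "geography", ["paris-capital"]),
    ]
    return productions, chunks
```

Calling `working_memory("add", *example())` picks out the two addition rules and the two chunks they activate, and leaves the geography knowledge dormant. The point is only that, in this reading, working memory is a goal-dependent selection from a larger store and not a separate store.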

When I first approached the subject of WM for my research, Baddeley and Hitch’s tripartite model seemed by far the most influential. This is indeed the case – however, a multitude of other models were always present, even if just at the fringes. One thing I found most disconcerting was the lack of a standard definition of working memory: the term seemed to be used to cover a wide range of things. More recently, however, the prevalence of Baddeley’s model seems to be declining, in favour of what may be described as more functionally general and neurally more explicit models. I found this book extremely useful as a starting point in trying to resolve the issue of definitions in particular, and as an overview of the majority of contemporary working memory theories. I highly recommend it. It even provided me with a brief introduction to the cognitive architectures of ACT-R and SOAR. Partly as a result of reading this book, and in conjunction with other research, I settled on a general definition of working memory on which my current research is based: working memory is the interface between cognition on the one hand, and memory on the other.

REF: A. Miyake and P. Shah, "Models of Working Memory: mechanisms of active maintenance and executive control." Cambridge: Cambridge University Press. 1999.

Wednesday, October 11, 2006

Language and Personality - idle thoughts

Having read a number of posts on Developing Intelligence about Language and Cognitive development in general (which, if I might add, I must recommend), I was reminded of something that I was interested in a few years back – the influences of language on what would be best described as personality, especially on those lucky enough to be described as genuinely bilingual. These are just a few thoughts of mine, in no way scientifically verified, simply conjectures based on personal experience and conversations with others.

My mother tongue is English. I can also speak Dutch and French – although not in any way trilingual, I am perfectly capable of a lot more than basic communication with native speakers of the two languages. One thing I have noticed when speaking Dutch over a period of a few days is a slight difference in behaviour, even attitudes, in myself. Now this is of course pure speculation, though speculation which friends of mine agree with when it is pointed out to them. This 'phenomenon' seems at first glance rather trivial, and indeed I had treated it as such, until I read “Language and Thinking-For-Speaking” on Developing Intelligence. The paper that this post describes is concerned with the relative classification of nouns with respect to gender specific qualities, tested on people whose languages have grammatical gender (German and Spanish). For example, subjects were asked to give adjectives describing a series of words, for instance 'Key' (masculine in German; feminine in Spanish). If I might borrow the same quote that Chris at Developing Intelligence used to illustrate the results:

"There were also observable qualitative differences between the kinds of adjectives Spanish and German speakers produced. For example, the word for "key" is masculine in German and feminine in Spanish. German speakers described keys as hard, heavy, jagged, metal, serrated, and useful, while Spanish speakers said they were golden, intricate, little, lovely, shiny and tiny. The word for "bridge," on the other hand, is feminine in German and masculine in Spanish. German speakers described bridges as beautiful, elegant, fragile, peaceful, pretty and slender, while Spanish speakers said they were big, dangerous, long, strong, sturdy and towering."

Further discussion of this paper suggests that language may have an effect on non-linguistic thought in a profound way. Given my personal experience, and more general observations, my query is whether truly bilingual or trilingual persons may possibly display slightly different personality characteristics when in a particular 'language mode'. Based on this paper, the answer seems to be a resounding maybe – or rather it leaves this phenomenon as a possibility rather than an impossibility. As I've mentioned, these are thoughts of mine, not empirically proven theories. However, I do think it would be interesting to see studies conducted with multilingual people to assess this. Quite how personality traits would be 'measured', I am not sure. Perhaps like/dislike judgements in combination with other standard psychological tests. These people would naturally have to be 'immersed' within the testing language before any testing could be useful, and of course the multitude of other factors, such as differences in contemporary culture between the language groups, would have to be taken into account – but I maintain that it would be interesting at the very least if conducted properly. As a tongue-in-cheek final question: Are multilinguists all suffering from a latent multiple personality disorder?

Tuesday, October 10, 2006

Cortical Dynamics of Working Memory

Notes on a lecture given by Joaquin Fuster (2006), at the Almaden Institute Conference on Cognitive Computing - http://video.google.com/videoplay?docid=-3002336180397686566&q=almaden+cognitive+computing.

This fascinating lecture provides a summary of Prof. Fuster's work on cortical dynamics, in this case focusing on working memory. He defines working memory at the start as the active (online) retention of information for a prospective action to solve a problem or to reach a goal (with the emphasis on prospective action). This is what I consider to be a fairly generic definition of what the purpose of working memory is - although naturally, over the course of my research, I have found that the systems/dynamics proposed to underlie this statement vary quite significantly. From this basic definition, two further qualifications are presented: Firstly, the actively held information is unique for the present context, but is inseparable from past context - in essence, it is long-term memory updated for prospective use. Secondly, working memory and long-term memory share much of the same cortical structure. This distinction I will dwell on for a little. The implication of this statement is first that long-term and working memory are functionally distinct systems; secondly, significant common ground between the two memory systems exists, showing that while they may be functionally separable, they are not neurally separable. If I may mention the 'classic' working memory (WM) theory of Baddeley and Hitch, one can see that the functional separability is very clearly defined (due to it being based on behavioural studies) in Baddeley's WM, whereas the second point seems to have been largely neglected, until recently at least (with the introduction of the episodic buffer). On a final note, Prof. Fuster notes that working memory may also be described as attention towards an internal cue - an interesting note in itself.

One thing that I am still slightly bewildered by, and something which I would be glad of comments on, is the difference between working memory on the one hand, and short-term memory on the other. The way I currently understand it, short-term memory may be seen as somewhat of a special case of working memory. Together with the basic sensory stores (the persistence of sensory 'representations' for a short period of time after the occurrence of the stimulus), working memory replaced the Atkinson and Shiffrin view of a short-term memory store. Corrections and other points of view gladly accepted on this point.

Points of note in the presentation (and approximate time in minutes):

- 20mins: when describing the overlapping cortical hierarchies of the frontal lobe and posterior sensory regions, Fuster makes reference to the fact that semantic memory is a result of the abstraction over time of individual experienced instances (episodic memory). Given that Endel Tulving defined episodic memory as a subset of semantic memory, there seems to be a large discrepancy between the now classical view of the structure of declarative memory, and the theories presented by Prof. Fuster. On the other hand, more recent evidence has suggested that episodic and semantic memory essentially work in parallel (see here - http://www.cus.cam.ac.uk/~jss30/pubs/Graham2000%20Neuropsygia.pdf).

- 31mins: a description of the Cortical Perception-Action Cycle - a fundamental principle of neural operation.

- 48mins: The description of constraints on the modelling of the memory network. Two types of constraint are described - structural and dynamic. In structural terms, the model would have to have a network structure; it would have to deal in a relational code (a result of the fact that memories are defined by associations, at all levels); it must be both hierarchical and heterarchical (in other words, the interaction of multiple hierarchies in parallel - the overlapping cortical hierarchies); and finally it must be capable of plasticity (the networks must not be static, they must be able to change). In dynamic terms, the memory must be content addressable (addressable by content or by association); it must accommodate variability (be stochastic); the system must be capable of reentry (that is, it must be able to update its own 'knowledge', or long-term memory - hence plasticity); there is the obvious necessity for parallel processing (he gives the example of perceptual processing), although, having said that, there is also a necessity for serial processing when it comes to conscious attention (the Global Workspace Theory of Bernard Baars springs to mind here...); and finally, the model must be able to accommodate category, in both perception and action (another that springs to mind: Krichmar and Edelman's brain based device work http://cercor.oxfordjournals.org/cgi/content/abstract/12/8/818). Although Prof. Fuster notes that these are his personal views on constraints, in my humble opinion they deserve at least acknowledgement when it comes to such model building, given his vast experience and research credentials.

- Animations of brain activations of patients performing delayed matching to sample tasks (the basic working memory task for humans) in various modalities: visual (54mins), spatial (56mins), verbal/auditory (57mins). These are particularly impressive, and display beautifully what he said previously concerning the overlapping cortical networks - I must recommend them.

- 70mins: a question regarding the relationship between perception and memory - essentially the answer described that perception is formed by memory, thus aligning with the theories of active perception, or expectation-driven perception/attention.
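
For readers unfamiliar with the term 'content addressable' used in the constraints at 48mins, the following is a small sketch of one textbook way to get that behaviour: a Hopfield-style associative network with a Hebbian (outer-product) learning rule. This is my own choice of illustration, not something presented in the lecture, and the function names `train` and `recall` and the example patterns are assumptions.

```python
from typing import List

Pattern = List[int]  # each entry is +1 or -1


def train(patterns: List[Pattern]) -> List[List[float]]:
    """Build a symmetric weight matrix from the patterns (Hebbian rule)."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights: List[List[float]], cue: Pattern, sweeps: int = 5) -> Pattern:
    """Settle from a partial or noisy cue towards a stored pattern."""
    state = list(cue)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):  # update one unit at a time
            total = sum(weights[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state


# Two made-up memories and a corrupted version of the first one.
MEMORY_A = [1, 1, 1, 1, -1, -1, -1, -1]
MEMORY_B = [1, -1, 1, -1, 1, -1, 1, -1]
NOISY_A = [-1, 1, 1, 1, -1, -1, -1, -1]  # first element flipped
```

Here `recall(train([MEMORY_A, MEMORY_B]), NOISY_A)` settles back onto `MEMORY_A`: the memory is retrieved from (part of) its own content, with no address or index involved. The stored 'knowledge' lives entirely in the associations between units, which is also in the spirit of the relational code constraint.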

As a final note, I would just like to mention his conclusions. Firstly, that memories are defined by associations at all levels. Secondly, that hierarchical networks for perceptual memory reside in posterior regions, whereas executive memory networks reside in the frontal cortex. Thirdly, that the prefrontal cortex (at the top of the perception/action cycle hierarchy) mediates cross temporal contingencies (as eloquently argued in his first book "The Prefrontal Cortex"). And finally, working memory is maintained by recurrent activity between the prefrontal cortex and the perceptual posterior regions - as shown in those animations.

Tuesday, October 03, 2006

The beauty of "Biologically Non-Implausible"

The expression "biologically non-implausible" is a phrase I first heard used by Murray Shanahan at COGRIC (http://www.cogric.reading.ac.uk), to describe his work on cognitive architectures. In terms of the work of cognitive roboticists such as Shanahan, and many others, I think the term is perfectly suited.

The phrase "biologically inspired" is well known, and used extensively across a wide range of work which carries out some activity, or process, in a way which takes some idea from a natural biological system - although the final instantiation often bears no resemblance to the system from which it took inspiration. And of course, there is no necessity (or obligation) for this: in this way, a task may be completed where it previously would have been overly complex. For example, a number of search and optimisation algorithms are inspired by the workings of an ant colony when searching for food or an alternative nesting site.

Onto the phrase "biologically plausible", and what you get is something with an entirely different connotation. This expression implies not just that this system or other is possibly instantiated in an actual biological system, but also that this system or other is likely to be instantiated in the actual biological system of interest. I've come across this statement in numerous articles and papers, although not necessarily so explicitly stated. So, for example, based on the activation of such and such a brain region when a patient performs a certain task, it is biologically plausible that this brain region performs function x. The use of the term in this context I think is perfectly justified. However, when it comes to building cognitive models - perhaps making a model of how a certain task is performed within the brain - I get a little uncomfortable with the term biologically plausible. This, to me at least (and I'm sure many may disagree), seems to overstate the significance of the created model. Such a model may well describe how a specific task is completed in the greatest detail, but to then say that it is biologically plausible, and then perhaps search for corroborating neural analogues for each process, seems to overstate it. In my view, it would surely be unwise to claim biological plausibility for a model of a certain task in a specific domain, until it is known how this task fits in the context of the entire human information processing brain; perhaps there may be numerous ways in which such a task may be completed, none of which are intuitively possible until other aspects of brain function are better understood?

And so the expression "biologically non-implausible". This, to me, implies that the presented statement, or system, lies within the realms of possible explanations for the functioning of the biological system, rather than being the most likely candidate. Basically, this argument boils down to emphasis - or more precisely, what point of view I perceive these two expressions to emphasise. Largely pedantic perhaps, but in my view an important qualitative distinction. I'm really just justifying why I like the expression "biologically non-implausible".