Adaptive Behavior, 8 (2)

Adaptive Behavior

Volume 8, Number 2

Winter 2000

Table of Contents

 

Mike Campos, Eric Bonabeau, Guy Théraulaz and Jean-Louis Deneubourg

Dynamic Scheduling and Division of Labor in Social Insects

Adaptive Behavior, 8 (2), 83-96.

 

Tony Savage and Tom Ziemke

Learning and Unlearning Mechanisms in Animats and Animals

Adaptive Behavior, 8 (2), 97-128.

 

Olaf Sporns, Nikolaus Almássy and Gerald M. Edelman

Plasticity in Value Systems and Its Role in Adaptive Behavior

Adaptive Behavior, 8 (2), 129-148.

 

Frederick L. Crabbe and Michael G. Dyer

Goal Directed Adaptive Behavior in Second-Order Neural Networks: The MAXSON family of architectures

Adaptive Behavior, 8 (2), 149-172.

 

Anthony G. Pipe

An Architecture for Learning "Potential Field" Cognitive Maps With an Application to Mobile Robotics

Adaptive Behavior, 8 (2), 173-204.


Pages 83-96

Dynamic Scheduling and Division of Labor in Social Insects

By Mike Campos, Eric Bonabeau, Guy Théraulaz and Jean-Louis Deneubourg

Abstract

A method for assigning tasks or resources, based on a model of division of labor in social insects, is introduced and applied to a dynamic flow shop scheduling problem. The problem consists of assigning trucks to paint booths in a truck facility to minimize total makespan and the number of paint flushes. Similarities between the ant-based approach and a market-based approach are highlighted. Both systems are able to adapt well to changing conditions.


Pages 97-128

Learning and Unlearning Mechanisms in Animats and Animals

By Tony Savage and Tom Ziemke

Abstract

This paper addresses the assumption, implicit in many robot and animat models of learning, that learning and unlearning are a pair of symmetrical processes. Unlearning mechanisms supposedly erase or remove existing learning because it is no longer relevant. Whether learning and unlearning result from the operation of symmetrical and antagonistic processes is an issue which has had a long but uneven history in animal and human psychology. This history is briefly recapitulated here. In particular, there is a contrast in the significance of the antagonistic processes model in the area of motivation compared to associative learning which has theoretical significance. For example, animat modelers frequently adopt a generic strengthening and weakening mechanism for all forms of learning and motivation representations without any consideration for its biological and psychological validity. In order to evaluate and question this, we examine the unlearning concept in a number of artefactual models drawn from a range of robotic and artificial life perspectives, and discuss their validity in terms of contemporary models of animal learning and motivation. Finally, we outline an alternative view of learning/unlearning, based on a recent contingency model of causality learning in humans, which does not rely on antagonistic processes and may have applications for artefacts.

Key Words

learning/unlearning; strengthening and weakening mechanisms; motivation; dual-contingencies; causality


Pages 129-148

Plasticity in Value Systems and Its Role in Adaptive Behavior

By Olaf Sporns, Nikolaus Almássy and Gerald M. Edelman

Abstract

Adaptive behavior requires the sensing of salient behavioral consequences which can act to modulate changes in neural connections linking sensory and motor structures. In previous work, we proposed that salient sensory events trigger neuronal value systems capable of modulating synaptic plasticity. Here, we investigate the capacity of value systems to modulate their own responses in the context of various conditioning tasks. To this end, we implement a modifiable value system incorporating anatomical and physiological properties within Darwin V, a neuronal model embedded in a mobile real-world device. While exploring an environment containing stimulus objects, Darwin V's visual maps develop object-related neuronal responses. Phasic responses of a value system initially triggered only by object-"taste" (innate value) modulate changes in connections between visual and motor neurons, thus linking specific visual responses to appropriate motor outputs. Over time, Darwin V is able behaviorally to discriminate between "striped" objects with positive value (appetitive behavior) and objects with "blobs" with negative value (aversive behavior) based on vision alone. In parallel with modification of visuo-motor connections, value-dependent modification also occurs in connections from visual sensory maps to the value system itself. As a result, visual activity patterns become able directly to trigger value signals (acquired value). If acquired value is disabled, transfer of the value signal to stimuli preceding innately salient events does not occur, and behavioral responses due to aversive conditioning are subject to rapid extinction. If an auditory signal reliably precedes the visual appearance of an aversive object, Darwin V can be conditioned first to reject the object based on vision (primary conditioning), and subsequently based on sound alone (secondary conditioning).
We compare the functional characteristics of value-dependent learning to formal notions of reinforcement learning. We suggest that plasticity in sensory afferents to value systems may provide a neurobiological basis for mediating the changing effects of saliency on adaptive behavioral responses.

Key Words

value; plasticity; conditioning; reinforcement; categorization; vision


Pages 149-172

Goal Directed Adaptive Behavior in Second-Order Neural Networks: The MAXSON family of architectures

By Frederick L. Crabbe and Michael G. Dyer

Abstract

The paper presents a neural network architecture (MAXSON) based on second-order connections that can learn a multiple goal approach/avoid task using reinforcement from the environment. It also enables an agent to learn vicariously, from the successes and failures of other agents. The paper shows that MAXSON can learn certain spatial navigation tasks much faster than traditional Q-learning, as well as learn goal directed behavior, increasing the agent's chances of long-term survival. The paper shows that an extension of MAXSON (V-MAXSON) enables agents to learn vicariously, and this improves the overall survivability of the agent population.

Key Words

second-order neural network; simulated autonomous agents; reinforcement learning; vicarious learning


Pages 173-204

An Architecture for Learning "Potential Field" Cognitive Maps With an Application to Mobile Robotics

By Anthony G. Pipe

Abstract

The learning architecture described in this article autonomously acquires a topographical (metric) map that encodes a measure of "value" for xy-Cartesian locations in an environment. There are two reasons for the creation of low value areas. Direct negative reinforcement from the environment will result from the robot discovering obstacles or having other "unpleasant" experiences. The other source of negative reinforcement is internally generated by the learning algorithm, as it identifies regions that are a long distance away from the "pleasant" places in the environment. Conversely, example "pleasant" places, where positive environmental reward is received, might be energy-charging sites or simply locations that the robot should visit in executing its daily tasks. In general, what the robot learns is a map of "motivational" tendencies, or "expectancies". In such a map, the value attached to a place comes to reflect a balance between the good and bad rewards attainable from that position. When the Temporal Difference learning part of the architecture is turned on, that measure of value comes to include an estimate of how far, in travel time, it is to positive reinforcement. The architecture is loosely based on an Adaptive Heuristic Critic structure. Exploration of a continuous-valued search space is conducted by an Evolution Strategy, tuned for fast and approximate optimization. Knowledge acquired autonomously from this exploration is stored in a Radial Basis Function (RBF) neural network. Inherent features of this neural network type lead to the creation of a "potential field" structure that exerts appetitive and aversive "forces" on the robot as it moves around in the environment. The results of simulation experiments are presented, with a view to illustrating the strengths and weaknesses of the architecture. The map building architecture proposed here is intended to form part of an overall navigational system.
In future work it will be integrated with a self-localization algorithm, landmark-based topological mapping, and a reactive system for dealing with local dynamics in the environment.

Key Words

cognitive maps; mobile robotics; potential field maps; adaptive heuristic critic; reinforcement learning; evolutionary computation; neural networks



Copyright © 2010, ISAB. All rights reserved.