Behavior and Task Learning from Demonstration
The move of robots from industrial
to everyday environments like hospitals and homes introduces a number of
demanding requirements to handle uncertainty and changing external conditions. Still,
most robots today are built to perform pre-defined tasks, programmed by a
researcher or engineer. To become economically attractive, the robots of
tomorrow will have to handle a large variety of tasks. Adaptation to new tasks
cannot be expected to happen through regular end-user
programming. Rather, the robot has to be delivered with an advanced capability to
learn new tasks and new working conditions, adapted to the present environment.
Robot learning is for this reason a very active research area.
A popular method for
teaching robots simple behaviors involves a human demonstrating a behavior via
remote control, or by having the robot observe the human's movements. This
approach is commonly referred to as Learning from Demonstration (LFD) or
Imitation Learning (IL).
The goal of LFD is to
create a representation of a behavior such that the robot, when executing that
representation, reproduces the demonstrated behavior. The meaning of a behavior
is in this context very general and can be anything from surviving in the
environment to a specific task such as taking out the garbage.
This project focuses on the
theoretical aspects of robot learning, and specifically learning from
demonstration using teleoperation. In order to interpret
a demonstration, the robot has to have some knowledge of how to extract the
relevant aspects of the demonstration. This previous knowledge, or bias, can in
turn be the result of previous learning. From this view, learning is seen as a
gradual development of both knowledge and bias where learning is only successful
when there is suitable bias available, i.e., when the taught task is not too
easy and not too hard.
Previous knowledge is
commonly stored as behavior primitives, which can be either learned or hard-coded
by a programmer. A demonstrated behavior can then be broken down into segments
that each correspond to one primitive behavior. Primitive behaviors
are implemented by already learned or hard-coded controllers that can be
combined into new, more complex behaviors. This transforms learning from
demonstration into three basic activities: behavior
segmentation, behavior recognition, and behavior coordination (Billing 2007,
Billing & Hellström 2008b).
Behavior segmentation
refers to the process of dividing the observed event sequence into segments
that can each be explained by a single primitive. Behavior recognition involves
identifying which primitive, possibly with parametrization,
best matches each segment. Finally, behavior coordination involves
identifying switching criteria between primitives, and how the primitives
should be composed. Identification of switching criteria corresponds to finding
sub-goals in the demonstrated behavior.
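The three activities can be illustrated with a minimal sketch. It is not the method developed in the project: the Primitive class, the absolute-error matching, the per-sample greedy segmentation, and the toy "forward"/"stop" controllers are all assumptions made for illustration. A demonstration is assumed to be a list of (sensor reading, demonstrated action) pairs.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed data format: one (sensor reading, demonstrated action) pair per time step.
Sample = Tuple[float, float]
# A segment is (primitive name, start index, end index exclusive).
Segment = Tuple[str, int, int]


@dataclass
class Primitive:
    """Hypothetical behavior primitive wrapping a controller."""
    name: str
    controller: Callable[[float], float]  # maps a sensor reading to an action

    def error(self, sample: Sample) -> float:
        # How far the primitive's action is from the demonstrated action.
        sensor, action = sample
        return abs(self.controller(sensor) - action)


def recognize(samples: Sequence[Sample], primitives: Sequence[Primitive]) -> Primitive:
    """Behavior recognition: the primitive with the lowest total error."""
    return min(primitives, key=lambda p: sum(p.error(s) for s in samples))


def segment(demo: Sequence[Sample], primitives: Sequence[Primitive]) -> List[Segment]:
    """Behavior segmentation: label each sample, then merge consecutive equal labels."""
    segments: List[Segment] = []
    for i, sample in enumerate(demo):
        best = recognize([sample], primitives).name
        if segments and segments[-1][0] == best:
            segments[-1] = (best, segments[-1][1], i + 1)  # extend current segment
        else:
            segments.append((best, i, i + 1))  # start a new segment
    return segments


def switching_points(demo: Sequence[Sample], segments: Sequence[Segment]):
    """Behavior coordination: the sensor reading observed at each primitive switch."""
    return [(a[0], b[0], demo[b[1]][0]) for a, b in zip(segments, segments[1:])]


# Toy demonstration: drive forward until the wall distance drops, then stop.
primitives = [Primitive("forward", lambda s: 1.0), Primitive("stop", lambda s: 0.0)]
demo = [(2.0, 1.0), (1.5, 1.0), (1.0, 1.0), (0.4, 0.0), (0.2, 0.0)]
segments = segment(demo, primitives)
print(segments)                          # [('forward', 0, 3), ('stop', 3, 5)]
print(switching_points(demo, segments))  # [('forward', 'stop', 0.4)]
```

In this toy case the switching point, a wall distance of 0.4, plays the role of a sub-goal: it marks the condition under which control passes from one primitive to the next.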
So far, a broad overview of
robot learning and behavior representation, including a formalisation of LFD, has
been produced within the present project (Billing 2007, Billing &
Hellström 2008b). Furthermore, three general methods for behavior recognition have
been developed and tested (Billing & Hellström 2008a).
Recently, the focus of the
project has turned towards functional models of the cortex and how neurological
models can be applied within LFD, specifically to develop a common
understanding between robot pupil and human teacher. This includes how
attentional processes can be applied during learning as a
way to infer bias and to support the extraction of relevant features from
demonstrations.