Operant Conditioning and Behaviorism - an historical outline
Around the turn of the twentieth century, Edward
Thorndike attempted to develop an objective experimental method for studying
the mechanical problem-solving ability of cats and dogs. Thorndike devised
a number of wooden crates which required various combinations of latches,
levers, strings and treadles to open them. A dog or a cat would be put
in one of these 'puzzle-boxes' and, sooner or later, would manage to escape
from it. Thorndike's initial aim was to show that the anecdotal achievements
of cats and dogs could be replicated in controlled, standardised circumstances.
However, he soon realised that he could now measure animal intelligence
using this equipment. His method was to set an animal the same task repeatedly,
each time measuring the time it took to solve it. Thorndike could then
compare these 'learning-curves' (see figure below) across different situations
and different species.
Thorndike was particularly interested in discovering whether his animals
could learn their tasks through imitation or observation. He compared the
learning curves of cats who had been given the opportunity of observing
others escaping from a box with those who had never seen the box being
solved and found no difference in their rate of learning. He obtained the
same null result with dogs and, even when he showed the animals the methods
of opening a box by placing their paws on the appropriate levers and so
on, he found no improvement. He fell back on a much simpler trial and error
explanation of learning. Occasionally, quite by chance, an animal performs
an action which frees it from the box. When the animal finds itself in
the same position again it is more likely to perform the same action again.
The reward of being freed from the box somehow strengthens an association
between a stimulus, being in a certain position in the box, and an appropriate
action. Reward acts to strengthen stimulus-response associations. The animal
learns to solve the puzzle-box not by reflecting on possible actions and
really puzzling its way out of it but by a quite mechanical development
of actions originally made by chance. By 1910 Thorndike had formalised
this notion into a 'law' of psychology - the law of effect. In full it
reads: "Of several responses made to the same situation those which
are accompanied or closely followed by satisfaction to the animal will,
other things being equal, be more firmly connected with the situation,
so that, when it recurs, they will be more likely to recur; those which
are accompanied or closely followed by discomfort to the animal will, other
things being equal, have their connections to the situation weakened, so
that, when it recurs, they will be less likely to occur. The greater the
satisfaction or discomfort, the greater the strengthening or weakening
of the bond." Thorndike maintained that, in combination with the law
of exercise (the notion that associations are strengthened by use and weakened
with disuse) and the concept of instinct, the law of effect could explain
all of human behavior in terms of the development of myriads of stimulus-response
associations. It is worth briefly comparing trial and error learning with
classical conditioning. In classical conditioning a neutral stimulus becomes
associated with part of a reflex (either the unconditioned stimulus, US, or
the unconditioned response, UR). In trial and error learning no reflex is
involved. A reinforcing or punishing event (a type of stimulus) alters the
strength of association between a neutral stimulus and a quite arbitrary
response. The response is not part of any reflex.
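The trial and error account above can be illustrated with a small simulation. What follows is only a rough sketch under stated assumptions, not Thorndike's own formalism: the function names, the number of available responses, the starting bond strengths and the size of the 'satisfaction' increment are all hypothetical choices made for illustration. Responses are picked in proportion to their current bond strength, and only the response that opens the box is strengthened. The weakening by discomfort described in the law of effect is left out for simplicity.

```python
import random


def run_puzzle_box(n_trials=30, n_responses=6, escape_response=0,
                   satisfaction=1.0, seed=0):
    """Simulate repeated trials in one puzzle box (illustrative only).

    Returns (latencies, strengths): the number of responses tried on each
    trial, which serves as a crude learning curve, and the final strength
    of the bond between the situation and each response.
    """
    rng = random.Random(seed)
    # Assumption: every response starts with the same bond strength.
    strengths = [1.0] * n_responses
    latencies = []
    for _ in range(n_trials):
        attempts = 0
        while True:
            attempts += 1
            # Responses are emitted by chance, weighted by bond strength.
            response = rng.choices(range(n_responses), weights=strengths)[0]
            if response == escape_response:
                break
        # Law of effect (simplified): the response followed by escape
        # becomes more firmly connected with the situation.
        strengths[escape_response] += satisfaction
        latencies.append(attempts)
    return latencies, strengths


def escape_probability(strengths, escape_response=0):
    """Chance that the next response emitted is the one that opens the box."""
    return strengths[escape_response] / sum(strengths)


if __name__ == "__main__":
    latencies, strengths = run_puzzle_box()
    print("responses tried per trial:", latencies)
    print("final escape probability:", round(escape_probability(strengths), 3))
```

Under these assumptions the successful response comes to dominate purely through repeated strengthening, with no reflection or planning on the simulated animal's part, so the number of responses tried per trial tends to fall over trials in the manner of a learning curve.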
The behaviorist position, that human behavior could be explained entirely
in terms of reflexes, stimulus-response associations, and the effects of
reinforcers upon them, excluding 'mental' terms like desires, goals
and so on, was taken up by John Broadus Watson in his 1914 book 'Behavior:
An Introduction to Comparative Psychology'. Watson had also been involved in
the introduction of the most favoured subject in comparative psychology - the
laboratory rat. One of his early jobs, which he used to fund his Ph.D., was
as a caretaker, one
of whose duties was to look after laboratory rats used in studies intended
to mimic 'real-life' learning tasks such as navigating complex mazes. Watson
became adept at taming rats and found he could train rats to open a puzzle-box
like Thorndike's for a small food-reward. He also studied maze-learning
but simplified the task dramatically. One type of maze is simply a long
straight alley with food at the end. Watson found that once the animal
was well trained at running this 'maze' it did so almost automatically.
Once started by the stimulus of the maze, its behavior became a series
of associations between movements (or their kinaesthetic consequences)
rather than stimuli in the outside world. This was made plain by shortening
the alleyway - the well-trained rats then ran straight into the end wall.
This became known as the 'kerplunk' experiment. The development of well-controlled
behavioral techniques by Watson also allowed him to explore animals' sensory
abilities, for example their abilities to discriminate between similar
stimuli, experimentally. Watson's theoretical position was even more extreme
than Thorndike's - he would have no place for mentalistic concepts like
pleasure or distress in his explanations of behavior. He essentially rejected
the law of effect, denying that pleasure or discomfort caused stimulus-response
associations to be learned. For Watson, all that was important was the
frequency of occurrence of stimulus-response pairings. Reinforcers might
cause some responses to occur more often in the presence of particular
stimuli, but they did not act directly to cause their learning. Watson
could therefore reject the notion that some mental traces of stimuli and
responses needed to be retained in an animal's mind until a reinforcer caused
an association between them to be strengthened, which is a rather mentalistic
consequence of the law of effect.
Publishing his second book 'Psychology from the Standpoint of a Behaviorist'
in 1919, Watson became the founder of the American school of behaviorism.
His rejection of mentalism was total. He felt that thought was explicable
as subvocalisation and that speech was simply another behavior which might
be learned by the law of effect. In 'Psychology from the Standpoint of
a Behaviorist' he addresses a number of practical human problems such as
education, the development of emotional reactions and the effects of factors
like alcohol or drugs on human performance. He even suggests that thought
processes might be investigated by monitoring movements in the larynx.
Watson believed that mental illness was the result of 'habit distortion'
which might be caused by fortuitous learning of inappropriate associations
which then go on to influence a person's behavior so that it becomes ever
more abnormal. Watson tested part of this hypothesis on a baby in the hospital
in which he worked. The baby, 'Little Albert', apparently showed no particular
fears or phobias about anything apart from sudden loud sounds. For example,
when Watson placed a tame white rat in Little Albert's lap the child happily
played with the animal. On a subsequent occasion Watson placed the rat
in Albert's lap and his assistant made a loud noise by striking a large
steel bar directly behind Albert's head. One week later Albert was subjected
to the same experience. After this, when Albert was shown the rat he began
to fret, appearing anxious. Similar reactions were produced by other furry
objects (for example, a fur coat). Watson was keen to use this as evidence
for the behavioral basis of phobias; however, Albert's reactions to the rat
were apparently quite mild. Nevertheless, one of the most widespread applications of conditioning
has been in the treatment of phobias and other behavior problems and the
case of Little Albert is often cited as the first experiment in this field.
In the 1920s behaviorism began to wane in popularity somewhat. A number
of studies in the Berkeley laboratory of Edward Tolman appeared both to
show flaws in the law of effect and to require mental representations in their
explanation. For example, rats were allowed to explore a maze in which
there were three routes of different lengths between the starting position
and the goal. The rats' behavior when the maze was blocked implied that
they must have some sort of mental map of the maze. The rats prefer the
routes according to their shortness, so when the maze is blocked at point
A, stopping them using the shortest route, they choose the second
shortest route. When, however, the maze is blocked at point B, the rats
do not retrace their steps and use route 2, as the law of effect would
predict, but rather use route 3. The rats must be recognising
that block B will stop them using route 2 by using some memory of the layout
of the maze. Tolman's group also showed that animals could use knowledge
gained while learning a maze by running to navigate it by swimming, and that
unexpected changes in the quality of reward could weaken learning even
though the animal was still rewarded. This result was developed further
by Crespi who, in 1942, showed that unexpected decreases in reward quantity
caused rats temporarily to run a maze more slowly than normal while unexpected
increases caused a temporary elevation in running speed.
At the same time as this work was appearing in the USA, the Polish psychologists
Konorski and Miller began the first cognitive analyses of classical
conditioning - the forerunners of the work of Rescorla, Wagner, Dickinson
and Mackintosh. In Germany Wolfgang Koehler was studying insight and observation
as mechanisms of learning in chimpanzees. All of this work was quite problematic
for behaviorism.
In 1938 Burrhus Frederic Skinner published what was arguably the most influential
work on animal behavior of the century, 'The Behavior of Organisms'. In
the interim it had been shown that Tolman's results were sensitive to factors
like the openness of his maze - if the rats could not see stimuli outside
the maze they did not make appropriate choices when it was blocked, suggesting
that they may have learned many stimulus-response associations in different
parts of the maze, perhaps in sequence, rather than having internalised
a map of it. Skinner resurrected the law of effect in more starkly behavioral
terms and provided a technology which allowed sequences of behavior produced
over a long time to be studied objectively. His 'Skinner box'
was a great improvement on the individual learning trials of Watson
and Thorndike. Skinner developed the basic concept of operant
conditioning, claiming that this type of learning was not the result
of stimulus-response learning. For Skinner the basic association in operant
conditioning was between the operant response and the reinforcer; the discriminative
stimulus served to signal when this association would be acted upon.
This document was restructured from
a lecture kindly provided by R.W. Kentridge.