| In principle, and sometimes in practice, it is possible
for a rat to learn to press a bar in a Skinner-box by trial and error.
If the box is programmed so that a single lever-press causes a pellet to
be dispensed, followed by a period for the rat to eat the pellet when the
discriminative-stimulus light is out and the lever inoperative, then the
rat may learn to press the lever if left to his own devices for long enough.
This can, however, often take a very long time. The methods used in practice
illustrate how much the rat has to learn to tackle this simple instrumental
learning situation. The first step is to give the rat, in his home cage and
while he is hungry, the food pellets with which he will later be rewarded in
the Skinner-box. He has to learn that these pellets are food and hence are
reinforcing when he is hungry. Now he can be introduced to the Skinner-box.
Initially there may be a few pellets in the hopper where reinforcers
are delivered, plus a few scattered nearby, to allow the rat to discover
that the hopper is a likely source of food. Once the rat is happy eating
from the hopper he can be left in the Skinner-box and the pellet dispenser
operated every now and then, so that the rat becomes accustomed to eating a
pellet from the hopper each time the dispenser operates (the rat is probably
learning to associate the sound of the dispenser operating with food, a piece
of classical conditioning which is really incidental to the instrumental
learning task at hand).
Even once the animal has learned that the food pellets are reinforcing and
where they are to be found, it would probably still take some time
for the rat to learn that pressing the bar while the SD light is on produces
food. The problem is that the rat is extremely unlikely to press the lever
often by chance. In order to learn an operant contingency by trial and
error the operant must be some behavior which the animal performs often
anyway. Instead of allowing the rat to learn by trial and error one can
use a 'shaping' or 'successive-approximations' procedure. Initially, instead
of rewarding the rat for producing the exact behavior we require - lever
pressing - he is rewarded whenever he performs a behavior which approximates
to lever pressing. The closeness of the approximation to the desired behavior
required in order for the rat to get a pellet is gradually increased so
that eventually he is only reinforced for pressing the lever. At first the
animal is reinforced whenever he is in the front half of the Skinner-box;
later he is only reinforced if he is also on the side of the box where
the lever is. After this, reinforcement occurs only if his head is pointing
towards the lever, then only when he approaches the lever, then when
he touches the lever with the front half of his body, then when he touches
the lever with his paw, and so on until the rat is pressing the lever in
order to obtain the reinforcer. The rat may still not have completely learned
the operant contingency - specifically he may not yet have learned that
the contingency between the operant response and reinforcement is signalled
by the SD light. If we now leave him to work in the Skinner-box on his
own he will soon learn this and will only press the lever when the SD light
is on.
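
The shaping procedure described above can be sketched as a small simulation.
This is only an illustrative toy model, not a description of any real
laboratory software: the function name shape, the starting propensities, and
the parameters needed, boost, spill and decay are all assumptions made for the
example. The stage list follows the sequence of criteria given in the text. A
simulated rat emits one behavior per trial; any behavior at least as close to
lever pressing as the current criterion earns a pellet, and the criterion is
tightened after a few reinforced responses.

```python
import random

# Criteria from the text, ordered from loosest to strictest approximation.
STAGES = [
    "in front half of box",
    "on lever side of box",
    "head pointing towards lever",
    "approaching lever",
    "touching lever with body",
    "touching lever with paw",
    "pressing lever",
]


def shape(stages=STAGES, needed=3, boost=1.0, spill=0.5, decay=0.8,
          max_trials=100_000, seed=0):
    """Simulate shaping; return (done, trials, pellets, criterion_log)."""
    rng = random.Random(seed)
    n = len(stages)
    # Assumed starting propensities: closer approximations are rarer.
    weights = [0.5 ** i for i in range(n)]
    criterion = 0   # index of the loosest behavior still reinforced
    hits = 0        # reinforced responses at the current criterion
    pellets = 0
    log = [criterion]
    for trial in range(1, max_trials + 1):
        # The simulated rat emits one behavior, chosen by propensity.
        b = rng.choices(range(n), weights=weights)[0]
        if b >= criterion:
            # Reinforce: strengthen this behavior and (an assumption of
            # this toy model) the next closer approximation as well.
            pellets += 1
            weights[b] += boost
            if b + 1 < n:
                weights[b + 1] += boost * spill
            hits += 1
            if hits >= needed:
                if criterion == n - 1:
                    return True, trial, pellets, log
                criterion += 1   # demand a closer approximation
                hits = 0
                log.append(criterion)
        else:
            # No pellet: the unreinforced behavior weakens (extinction).
            weights[b] *= decay
    return False, max_trials, pellets, log


if __name__ == "__main__":
    done, trials, pellets, log = shape()
    print("reached lever pressing:", done)
    print("trials:", trials, "pellets:", pellets)
```

Setting the criterion to the final stage from the first trial would correspond
to the pure trial-and-error case, in which the rat is reinforced only for a
behavior he very rarely produces by chance.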
This article is restructured from
a lecture kindly provided by R.W.Kentridge. |