The term “shaping” is what B. F. Skinner called the method of achieving progressive
changes in behavior. It is usually used in positive reinforcement methods of
operant conditioning. The term itself brings to mind the creation of a complete
picture, one small brush stroke at a time. The analogy to painting is apt:
changing behavior through shaping is a creative process that works toward the
ultimate “picture” of what is desired. In behavior work, however, we are not
dealing with a static object but with individuals who have minds and behaviors
of their own. Shaping reinforces behavior that is already occurring so that it
will recur more frequently.
Shaping relies on what Skinner termed
successive approximations. This means that each behavior that comes
progressively closer to the ideal picture in the trainer’s mind is rewarded.
For instance, to shape a dog into sitting in a perfect heel position, the steps
may be as follows: first reward a sit at the side (which is probably crooked and
not close at all), then reward the dog each time it sits a little closer to the
ideal. The dog is not rewarded when it sits farther out or more crooked than it
had before. It is important to remember that in operant conditioning, it is the
subject/student who initiates the changes in behavior. There will be periods of
trial-and-error learning. If the undesired behavior is ignored and only the
closer approximation to the goal is rewarded, the behavior will gradually be
shaped into what is desired.
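
For readers who think in code, the reward rule in the heel-position example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `shape`, the idea of scoring each sit as a distance in centimeters from the ideal position, and the strict "closer than the best rewarded sit so far" rule are all hypothetical simplifications, not part of Skinner's or any trainer's actual method.

```python
def shape(attempts, ideal=0.0):
    """Return one True/False per attempt; True means that attempt was rewarded.

    Assumption: each attempt is a single number (e.g. distance in cm from
    the ideal heel position), and smaller distance means a better sit.
    """
    rewarded = []
    criterion = None  # error of the best attempt rewarded so far
    for attempt in attempts:
        error = abs(attempt - ideal)
        if criterion is None or error < criterion:
            # Closer to the ideal than anything rewarded before: reward it
            # and raise the criterion to this new level.
            rewarded.append(True)
            criterion = error
        else:
            # No closer than before: ignore it (no reward, no punishment).
            rewarded.append(False)
    return rewarded


# Hypothetical sequence of sits, measured as cm away from the ideal position.
sits = [40, 35, 38, 30, 30, 22, 25, 10]
print(shape(sits))
# [True, True, False, True, False, True, False, True]
```

Note that the sketch never penalizes a worse sit; it simply withholds the reward, which matches the "ignore the undesired behavior" approach described above.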
Successive approximations are
also used in a larger context to shape a series of learned behaviors or tasks.
These are termed behavior chains. Teaching a behavior chain will not be
successful, however, if even one of the components in the chain has not been
solidly learned or the behavior of the subject (e.g., the dog) has not been
brought under stimulus control. Dogs who successfully learn behavior chains
become multi-tasking individuals such as service dogs, obedience dogs, movie
dogs, and search and rescue dogs.
The secret to behavior chains is
teaching the last behavior in the chain first. The subject is rewarded after
this behavior and learns to look for the reward after this particular behavior.
Then the second-to-last behavior is taught and coupled with the last behavior
before the reward. Other behaviors are then added, one at a time, in front of
those already learned until the subject performs the whole series of
tasks/behaviors before being rewarded.
To illustrate, consider the many
behaviors and steps in the retrieve on the flat. In the full exercise, the dog
must learn to accept the dumbbell, carry it, and relinquish it on command, as
well as physically fetch it. The dog must also know the finish, or return
to heel, exercise. The steps that I use to teach the retrieve on the flat are:
1. opening the mouth, holding the dumbbell
2. giving the dumbbell on command
3. reaching for the dumbbell
4. picking up the dumbbell off the ground
5. going to the dumbbell on the ground with handler
6. going to dumbbell on the ground without handler
7. returning to handler with dumbbell in mouth
8. straight sit in front of handler with dumbbell in mouth
9. holding the dumbbell until commanded to relinquish it while sitting in front
   of the handler after fetching the dumbbell
10. returning to heel position
11. going to get dumbbell on command (handler stays)
12. waiting in heel position for command to get dumbbell after it was thrown
Once each step has been learned
completely, the full behavior chain is put together. The final behavior chain
for this exercise, from the last behavior to the first, is: returning to heel,
sitting in front with the dumbbell, returning with the dumbbell, retrieving the
dumbbell, and waiting in heel position for the command to retrieve. The end
reward is given after the dog has returned to heel position.
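
The last-behavior-first order of building a chain can also be laid out as a short Python sketch. It is offered only as an illustration: the function name `back_chain_plan` is hypothetical, the behavior labels are shortened from the retrieve chain above, and real training would of course spend many sessions on each stage.

```python
def back_chain_plan(chain):
    """List the practice stages for a chain that is taught last-behavior-first.

    `chain` is the full sequence in performance order. Each stage adds one
    earlier behavior in front of what has already been learned, and the
    reward always comes after the final behavior.
    """
    return [chain[start:] for start in range(len(chain) - 1, -1, -1)]


# Retrieve on the flat, in the order the dog finally performs it.
retrieve = [
    "wait at heel",
    "retrieve dumbbell",
    "return with dumbbell",
    "sit in front",
    "return to heel",
]

for stage in back_chain_plan(retrieve):
    print(" -> ".join(stage) + " -> reward")
```

The first stage printed is the last behavior alone, followed by the reward; the final stage is the complete exercise, with the reward still arriving only after the return to heel.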
The following are Karen Pryor’s Ten
Laws of Shaping from “Don’t Shoot the Dog!” I refer
readers to this book for her thorough discussion of each point.
1. Raise criteria in increments small enough so that the subject always has a
   realistic chance for reinforcement.
2. Train one thing at a time; don’t try to shape for two criteria
   simultaneously.
3. Always put the current level of response onto a variable schedule of
   reinforcement before adding or raising criteria.
4. When introducing a new criterion, temporarily relax the old ones.
5. Stay ahead of your subject: Plan your shaping programs completely so that if
   the subject makes sudden progress you are aware of what to reinforce next.
6. Don’t change trainers in midstream; you can have several trainers per
   trainee but stick to one shaper per behavior.
7. If one shaping procedure is not eliciting progress, find another; there are
   as many ways to get behavior as there are trainers to think them up.
8. Don’t interrupt a training session gratuitously; that constitutes a
   punishment.
9. If behavior deteriorates, “go back to kindergarten”; quickly review the
   whole shaping process with a series of easy reinforcements.
10. End each session on a high note, if possible, but in any case quit while
    you’re ahead.
Teaching requires the educator to
first educate themselves. Only with a thorough understanding of the lesson, and
of how to teach it, can a teacher be effective. Once the “bare bones” basics of
learning theory, operant conditioning, positive reinforcement, and shaping are
understood, a conditioned reinforcer such as a clicker changes from a “trick
that some trainers use” into a useful tool. The final article in this series
will be on “clicker” training.
Copyright © 2001 Ruth Kellogg. All rights
reserved.