Teaching our dogs through the use
of positive reinforcement is the most popular method of operant conditioning.
The biggest reason that it has become so widespread, I feel, is that it enables
a person to teach a dog what is desired in an easy way that creates pleasure
for both. Many of us do not enjoy causing our companions discomfort or teaching
them in a way that creates fear and insecurity.
People who use positive
reinforcement in their dog training and daily life must become aware of the
positive effects it creates. Who among us does not like receiving a “well
done”, a reward or surprise, or the opportunity to enjoy a favorite activity when
we have done a task well? Our dogs are no different
from us. They, too, like the praise, rewards, activities, and recognition that
we strive for in our daily lives.
To review, in positive reinforcement
(of operant conditioning) a subject chooses to change its behavior to receive a
positive effect (such as praise, reward, favorite activity) in its environment.
A reinforcement occurs during or immediately after the
behavior. In positive reinforcement, the subject wants the reward and changes its behavior until it gets
it. When repeated attempts using similar behavior changes
keep yielding rewards, the behavior is changed. This is pure behavior
modification using incentives.
What can be used as
reinforcements? The answer is simple: whatever works! The reinforcement must be
species appropriate (e.g., while a dolphin would work
happily for a fish, a horse wouldn’t) and be something that the subject enjoys.
Attempting to use food as an incentive for a ball-crazy dog is as useless as
trying to use a ball as an incentive for a strongly food-oriented dog.
Changing the reinforcements, even during a training session, adds variety and
increases the subject’s attention and enjoyment.
The size of the reinforcement in a
training session is very important. The premise is to use as small an amount of
the motivator as possible while still getting the subject (e.g., a dog) to do the behavior. Small amounts of the motivator
increase the subject’s interest in getting more of it and
keep its focus on the training session. When using food, use tiny amounts,
as the idea is to get as many behaviors as possible, not to feed a meal during the
training session. If the dog becomes satiated, he won’t have the same desire to
work. When using a toy or activity as the motivation,
keep the sessions very brief; otherwise the dog won’t be interested in working
at all.
There is one exception to using
small amounts of the motivator. Karen Pryor terms this exception a “jackpot”.
A jackpot is much larger than the normal motivator. (For instance, instead of just
giving a tiny piece of a hot dog, a whole hot dog is given.) Jackpots are big
surprises and should not be overused, so that they keep their surprise effect. They are
wonderful for marking a large breakthrough in a training session and are
extremely effective.
Timing of the
reinforcement is extremely important. If a new trainer is having
difficulty teaching his dog, the problem is usually the timing of the
reinforcement. Given too early, it doesn’t reinforce the actual behavior
desired but rather the preceding behavior. Too early a reinforcement is also
bribery, which is highly ineffective. If the reinforcement is given too late, the
opportunity for acknowledging the actual behavior desired has been lost, and
the reinforcement is again ineffective. Correctly timed reinforcements do change behavior. Reinforcements communicate information to the subject about its
behavior and must be given during or immediately after the desired behavior.
There are three types of schedules
that outline how reinforcements should be given to the subject.
The first schedule, constant
reinforcement, is to be used in the learning phase. This means the subject
(dog) receives the positive reinforcement (treat) each time it changes its
behavior on cue (lies down). The reinforcer acts as
information and the dog must learn that when the cue (in this case, “down”) is
given, he must change his behavior (i.e. lie down) in order to receive the
positive reinforcement (treat). Studies have shown
that on this schedule the subject continues to learn at a steady, moderate rate,
but overall its responses gradually slow, with
brief and unpredictable pauses between the reward and the next behavior.
The second type of reinforcement
schedule is the fixed ratio schedule. The subject must give a fixed
number of correct responses before it receives a
reward. The ratio could be as frequent as two responses to one reward or as
infrequent as five responses to one reward. Studies have shown that the
subject responds at a high and steady rate except for a pause immediately after the
reinforcement, before the next behavior is given. This is termed the
post-reinforcement pause. The more responses the subject must make before it is
rewarded, the longer the post-reinforcement pauses become.
The third schedule is the variable
ratio schedule. In this, there is no set ratio of responses required
for rewards. It allows great spontaneity for the trainer in rewarding the dog.
For example, such a schedule might be one response/one reward; three
responses/one reward; one response/one reward; five
responses/one reward; two responses/one reward; and so on. The strength of the variable
ratio schedule is its unpredictability. Studies have shown that the subject
responds at a high, steady rate with a minimal post-reinforcement pause. A
classic example is a slot machine, which pays out after an unpredictable number of plays.
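For readers who like to see the bookkeeping spelled out, the three schedules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the idea of listing a reward decision for each correct response, and the random draw used for the variable ratio schedule are made up for this example and do not come from any published training protocol.

```python
import random


def constant_schedule(num_responses):
    """Reward every correct response (used in the learning phase)."""
    return [True] * num_responses


def fixed_ratio_schedule(num_responses, ratio):
    """Reward every `ratio`-th correct response (e.g. ratio=3: reward the 3rd, 6th, ...)."""
    return [i % ratio == 0 for i in range(1, num_responses + 1)]


def variable_ratio_schedule(num_responses, max_gap, rng=None):
    """Reward after an unpredictable number of correct responses.

    Assumption for this sketch: each gap between rewards is drawn
    uniformly from 1..max_gap, so the dog can never predict the next reward.
    """
    if rng is None:
        rng = random.Random()
    rewards = []
    remaining = rng.randint(1, max_gap)  # responses left until the next reward
    for _ in range(num_responses):
        remaining -= 1
        if remaining == 0:
            rewards.append(True)
            remaining = rng.randint(1, max_gap)
        else:
            rewards.append(False)
    return rewards
```

Each function returns one True/False entry per correct response, where True means "give the reward now". A trainer would not, of course, consult a program mid-session; the point is only to show how the fixed schedule is predictable while the variable one is not.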
Generally speaking, once a subject
(e.g., a dog) has learned a behavior, it should be put on
a variable ratio schedule. There is, however, one exception to this. Any time a
subject must make a choice or solve a puzzle (e.g.,
scent discrimination), the subject must be rewarded
every time (constant reinforcement schedule). This is the only exception and
should be adhered to without question.
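The rule of thumb above can be written as a small decision function. This is a minimal sketch, assuming just two yes/no questions about the exercise; the function name and its two parameters are hypothetical and exist only for this illustration.

```python
def choose_schedule(behavior_learned, requires_choice):
    """Pick a reinforcement schedule following the rule of thumb in the text.

    behavior_learned: True once the dog reliably performs the behavior on cue.
    requires_choice:  True if the dog must make a choice or solve a puzzle
                      (for example, scent discrimination).
    """
    if requires_choice:
        # The one exception: reward every time, even for a learned behavior.
        return "constant"
    if behavior_learned:
        return "variable ratio"
    # Still in the learning phase: reward every correct response.
    return "constant"
```

For example, a well-practiced "down" would return "variable ratio", while a scent discrimination exercise returns "constant" no matter how experienced the dog is.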
Coupling the positive
reinforcement techniques with an attitude of kindness, love, and clarity of
purpose will give the trainer an obedient and educated dog. But this dog,
unlike those trained exclusively with negative reinforcement, will be an
individual who is confident and flexible, has a well-developed sense of
humor, can think and reason, and wants to learn more.
When a trainer has developed a
clear and concise method of communicating to his dog exactly what is desired, the dog’s education will take a quantum leap forward. Such
a method is clicker training, which, when used with shaping techniques, enables
a trainer to teach his dog easily and without force. “Shaping” will be
discussed in the next article.
Copyright © 2001 Ruth
Kellogg. All rights reserved.