George Stanley Reynolds' 1968 book, "A Primer of Operant Conditioning," offers a comprehensive and succinct guide for individuals seeking a deeper understanding of behavioral principles. Those looking for a more detailed explanation will find it to be an excellent reference.
The Intro
Before we continue into some more nuanced and rather sensitive topics, we need to talk about some basic principles. If you recall from our previous posts, our behavior is shaped by our environment. That is, we do what we do because of our history with the environment (i.e., the frequency of the behavior is modified by the consequences of the behavior). Our behavior is constantly being reinforced or punished on some fixed or variable schedule. We are constantly discriminating between stimuli, allocating our behavior toward stimuli that have produced richer reinforcement schedules, and avoiding stimuli that have signaled punishment. We should clarify that these terms mean something a little different (and more precise) in behavior analysis than they do in everyday speech. Our goal in reviewing these terms is to understand how these principles can be used to implement change and promote prosocial behavior.
The Terms
Reinforcement: A neurobehavioral process whereby a stimulus change contingent on the occurrence of a behavior increases the future likelihood of the behavior occurring. These stimuli are called reinforcers.
Positive Reinforcers: Stimuli whose onset or presentation increases the likelihood of the behavior recurring.
Negative Reinforcers: Stimuli whose removal or cessation increases the likelihood of the behavior recurring.
Punishment: A neurobehavioral process whereby a stimulus change contingent on the occurrence of a behavior decreases the future likelihood of the behavior occurring. These stimuli are called punishers.
Positive Punishers: Stimuli whose onset or presentation decreases the likelihood of the behavior recurring.
Negative Punishers: Stimuli whose removal or cessation decreases the likelihood of the behavior recurring.
Fixed-Ratio Schedules: A stimulus (i.e., reinforcer, punisher) is delivered after a behavior is emitted X times by an organism. X refers to a fixed number.
Variable-Ratio Schedules: A stimulus (i.e., reinforcer, punisher) is delivered after a behavior is emitted an average of X times by an organism. X refers to the average; the number of responses actually required varies from one delivery to the next.
Fixed-Interval Schedules: After X time elapses, a stimulus (i.e., reinforcer, punisher) is delivered contingent on a behavior occurring. X refers to a fixed number.
Variable-Interval Schedules: After an average of X time elapses, a stimulus (i.e., reinforcer, punisher) is delivered contingent on a behavior occurring. X refers to the average; the time actually required varies from one delivery to the next.
Extinction: Contingent on a behavior, the consequence is NOT provided. The frequency of the behavior is gradually diminished and eventually extinguished.
Stimulus Control: A stimulus that reliably occasions a particular class of behavior when it is present and does not occasion that behavior when it is absent.
Discrimination: Differential reinforcement of a response with respect to a property of a stimulus.
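The four schedule definitions above can be made concrete with a small simulation. What follows is only a sketch in Python, not something taken from Reynolds' book. The function names, the uniform spread used for the variable schedules (1 to 2X - 1 responses, or 0 to 2X time units, both averaging X), and the choice to time each interval from the previous delivery are all assumptions made for illustration.

```python
import random


def fixed_ratio(x):
    """FR x: the stimulus follows every x-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == x:
            count = 0
            return True
        return False

    return respond


def variable_ratio(x, rng):
    """VR x: the required count changes after each delivery and averages x.

    Assumption: requirements are drawn uniformly from 1 to 2x - 1.
    """
    required = rng.randint(1, 2 * x - 1)
    count = 0

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * x - 1)
            return True
        return False

    return respond


def fixed_interval(x):
    """FI x: the first response at least x time units after the last delivery pays off."""
    last = 0.0

    def respond(now):
        nonlocal last
        if now - last >= x:
            last = now
            return True
        return False

    return respond


def variable_interval(x, rng):
    """VI x: like FI, but the required wait changes after each delivery and averages x.

    Assumption: waits are drawn uniformly from 0 to 2x.
    """
    required = rng.uniform(0, 2 * x)
    last = 0.0

    def respond(now):
        nonlocal last, required
        if now - last >= required:
            last = now
            required = rng.uniform(0, 2 * x)
            return True
        return False

    return respond


fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

Each function returns a `respond` callable that reports whether a given response produced the stimulus. The ratio versions only count responses, while the interval versions only look at the time of the response, which is the distinction the definitions draw.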
The Examples
Reinforcers are often considered “good” things, but we have to remember that “good” and “bad” are subjective. We each have our own preferences, whether that be food, music, or romantic interests. It is also important to remember that our preferences change. I prefer to play soccer over baseball. However, that might change if a) all my friends want to play baseball, b) I haven’t played baseball in a very long time but have played a lot of soccer recently, or c) I can’t find my soccer ball but have access to a baseball and bat.
With regard to which behaviors are reinforced in our soccer example, I will focus on two. First, the behavior of scoring a goal is reinforced. When a soccer player is on the field, they engage in a chain of behaviors to score a goal. They need to navigate through the other players and avoid tripping or colliding with them. This behavior is not based on free will but on a learning history in which attending to the environment has been reinforced. For example, the player might have learned through consequences (reinforcers or punishers) that avoiding collisions increases the likelihood of scoring a goal (the reinforcer) and of avoiding aversive consequences, such as pain or unpleasant social interactions, that might result from a collision. In the past, the player might have experienced pain or embarrassment from colliding with other players, which has punished the behavior that preceded it.
To continue the example, if the player does not know where the goal post is located, running around indiscriminately will eventually be reinforced by finding it, but on a variable schedule. It might take them 30 seconds or up to 5 minutes to find the goal post. Searching will eventually be reinforced, but that is not very efficient. Using environmental cues (e.g., looking for the net or asking a teammate) is an example of discriminating between stimuli that signal reinforcement and those that signal punishment. For example, if the player shoots the ball into the wrong net, chances are they will not do that in the future. Their goal-scoring in the presence of the correct net will be reinforced, but not in the presence of the incorrect net. Stimulus control has been established when the player sees the correct goal post and begins running toward it but does not run toward the incorrect one.
These principles still apply to more interesting and serious behaviors. For example, consider what we might call “bad” behavior. We are always told not to steal, that it’s bad or immoral. (We will come back to morality in future posts. But here is a teaser: morality is a hypothetical construct. No behavior is innately moral or immoral.) We’re not here to condone or explain away reasons why someone might steal; we are here to identify variables that might explain why stealing occurs. In order to do that, we need to examine the individual’s environment. When you ask why I stole the beer, one reason may be that I did not have a beer, but I wanted a beer. Using words like “bad” or “immoral” does nothing to reduce the behavior of stealing. That’s why the science of behavior is so valuable. It removes circular reasoning like this:
Why did he steal? Because he’s a bad person.
How do you know he’s a bad person? Because he steals.
We get nowhere with this type of explanation. Stealing becomes easier to understand (and reduce) if we take a step back and examine the environmental variables that contribute to it. One contributing factor could be that I don’t have a more appropriate means of acquiring the beer (e.g., money). Another valid question might be: why steal if you don’t have the money? Well, similar to our example of Homer learning to play baseball, we should examine the terminal goal (or reinforcer) in the chain. Homer yanked the string attached to the bat signaling…
…the ball should fly into the field signaling…
…he should run to first base signaling…
…he should run to second base signaling…
…he should run to third base signaling…
…he should run home signaling…
…he can FINALLY get his food from the hidden food dispenser off the field.
You’ll notice he did NOT engage in any of these behaviors for the sake of the game. The terminal reinforcer was food. In our stealing example, we likewise have to look at the terminal reinforcer. I stole beer because I didn’t have it and I wanted it. But why did I want it? It could be that I think they are grossly overcharging for the beer (in this case, stealing would occur in the presence of the high price point but not with a very low price point). It might be that it’s the last one in the store and I forgot my wallet at home. Retrieving my wallet means someone might buy it while I’m gone. It might also be that life has spent years poking me in the side asking “had enough yet?” and I don’t have the money to escape these aversive stimuli in my environment in a healthier or more meaningful way. However, I can drink beer, which, at least temporarily, alleviates the aversive circumstances in my life.
One final component to discuss is the likelihood that behavior will contact reinforcement or punishment. These consequences can follow behavior on a consistent or inconsistent schedule. This matters because behavior occurs at different rates depending on how often those consequences occur. Under ideal conditions, every time you call your mom, she answers. This is what we call a fixed-ratio (FR) schedule (specifically, FR 1, because every response is reinforced). If you dial your dad’s phone number but he answers, on average, only one call in every 5, we call this a variable-ratio (VR) schedule (specifically, VR 5, because on average one response in every 5 is reinforced). Just examining these two examples, what happens if both your mom and dad decide to stop answering your calls (i.e., your calling contacts extinction)? Well, it depends on whom you attempt to call. If your mom does not answer, you might a) get worried that something is wrong, b) try to call others who can check on her, and c) repeatedly call her in a panic! Now, this might also happen with your dad, but it would take longer to get to that point. Eventually, maybe after 10 or 12 calls that are not reinforced, you might start asking if anyone has heard from him.
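One way to see why the two phone histories might feel so different once extinction begins is to look at the longest streak of unanswered calls each caller has already lived through. The Python sketch below is an illustration only, not a model from Reynolds. The names `call_history` and `longest_unanswered_run` are hypothetical, and the spread of the VR requirement (uniform from 1 to 2X - 1 calls) is an assumption.

```python
import random


def call_history(mean_ratio, n_calls, rng):
    """Simulate n_calls on a ratio schedule; True means the call was answered.

    Assumption: the number of calls needed for an answer is drawn uniformly
    from 1 to 2 * mean_ratio - 1, so mean_ratio=1 behaves as FR 1 and
    mean_ratio=5 behaves as VR 5.
    """
    outcomes = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    since_last_answer = 0
    for _ in range(n_calls):
        since_last_answer += 1
        if since_last_answer >= needed:
            outcomes.append(True)
            since_last_answer = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
        else:
            outcomes.append(False)
    return outcomes


def longest_unanswered_run(outcomes):
    """Length of the longest streak of consecutive unanswered calls."""
    longest = current = 0
    for answered in outcomes:
        current = 0 if answered else current + 1
        longest = max(longest, current)
    return longest


rng = random.Random(0)
mom = call_history(1, 200, rng)   # FR 1: every call answered
dad = call_history(5, 200, rng)   # VR 5: about one call in five answered
print(longest_unanswered_run(mom))  # 0
print(longest_unanswered_run(dad))  # at most 8 under the assumed spread
```

With mom, a single unanswered call is already unlike anything in the history. With dad, several unanswered calls in a row look the same as an ordinary day, so under this reading it is plausible that calling persists longer before the change is noticed.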
Figure 6.5 in Reynolds’ A Primer of Operant Conditioning (1968) is an excellent demonstration of the different response patterns commonly observed under ratio and interval schedules during reinforcement and extinction.
Interval schedules, however, produce very different levels of responding. The reason is that reinforcer delivery is not dependent on rate of responding. For example, if you are baking a cake and the recipe calls for it to be in the oven for 30 minutes at 350 degrees Fahrenheit, checking on the cake at minutes 3, 7, or 14 will not be reinforced (that is, you will not find a finished cake). However, if you check it very soon after minute 30, you will find a finished cake. This is called a fixed-interval (FI) schedule (specifically, FI 30). It is also possible to find a burned cake at minute 45, which would punish checking on the cake long after the 30 minutes have elapsed. If it is a recipe calling for 30 to 40 minutes at 300 degrees Fahrenheit (and on average, the cake is done by minute 35), checking on the cake will be reinforced on a variable-interval (VI) schedule (specifically, VI 35).
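The claim that reinforcer delivery on an interval schedule does not depend on response rate can be checked with a few lines of arithmetic. The Python sketch below is a hedged illustration: the function names are hypothetical, and it assumes each 30-minute interval is timed from the previous reinforcer, as if a fresh cake went into the oven every time one came out.

```python
def count_fi_reinforcers(check_times, interval):
    """Count reinforced checks on a fixed-interval schedule.

    Assumption: the interval restarts at each reinforced check.
    """
    last = 0
    earned = 0
    for t in check_times:
        if t - last >= interval:
            earned += 1
            last = t
    return earned


def count_fr_reinforcers(n_responses, ratio):
    """Count reinforcers on a fixed-ratio schedule: one per `ratio` responses."""
    return n_responses // ratio


every_minute = range(1, 301)        # 300 checks in 300 minutes
every_ten = range(10, 301, 10)      # 30 checks in 300 minutes

fast = count_fi_reinforcers(every_minute, 30)
slow = count_fi_reinforcers(every_ten, 30)
print(fast, slow)  # 10 10

print(count_fr_reinforcers(300, 5), count_fr_reinforcers(30, 5))  # 60 6
```

Checking ten times as often under FI 30 earns exactly the same ten reinforcers, whereas responding ten times as often under FR 5 earns ten times as many. The cake checks at minutes 3, 7, 14, and 31 from the example earn one reinforcer, at minute 31.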
The Conclusion
So to wrap this up in a neat package, here is a summary: Understanding the basic principles of behavior is essential to promoting prosocial behavior and implementing change. Our behavior is shaped by our environment, and we constantly discriminate between stimuli and allocate our behavior toward stimuli that produce richer reinforcement schedules. Reinforcement and punishment are processes in which stimulus changes increase or decrease the future likelihood of the behavior occurring, respectively. Fixed-ratio, variable-ratio, fixed-interval, and variable-interval schedules describe when the stimulus is delivered, in terms of either responses emitted or time elapsed. Stimulus control refers to a stimulus that reliably occasions behavior, while discrimination is differential reinforcement of a response with respect to a property of a stimulus. These principles can be applied to all behaviors, including those that we might consider “bad” or “immoral,” and understanding them can lead to effective interventions and behavior change.
Pass it on and see you next week.
Mike, it is not moral relativism. The science of behavior is not incompatible with having a soul and believing in good or bad. It just explains the way that people learn to be good or bad through their experiences with the environment and the synaptic self (biological) that is wired by those experiences. The practical application is that we can create an environment where prosocial behavior is the most dominant. The principles of behavior apply whether you know it or not. Usually we leave it up to chance - then we get a lot of bad behavior because the experiences those individuals have had in their developmental environment make bad behavior occur at a higher frequency. Imagine if we could create an environment where acting prosocially and acting to benefit the culture are rewarding to the individual, so that those behaviors occur more frequently. Where we are today is a result of an unguided cultural evolutionary process that now rewards many individuals for their bad behavior, and as a society we are not reinforcing rule following or the behaviors associated with the cement of society - 1. Stable, predictable patterns of behavior. 2. Cooperative behavior. That builds trust and reciprocity - what glues our behavior together in our society/culture. Ask - why are we becoming unglued?
I used it with the undergraduate class for years.
Hank