You Cannot Divorce Operant Conditioning From Emotions
I have quite often seen statements by trainers to the effect that positive punishment is not necessarily aversive, because in its “pure” understanding +P simply refers to adding a stimulus that decreases a behaviour and there is no implication that the stimulus has to be unpleasant. They would say that, in the same way, +R is not necessarily pleasant, but is simply the addition of any stimulus that increases behaviour.
However, what we need to remember is that when Skinner first described his understanding of operant conditioning, he saw behaviour as a simple INPUT-OUTPUT system. The mind of an animal was seen as a black box, which we had no ability to look into. Things have changed drastically since then, and scientific and technological advances mean that the mind of an animal is no longer a “black box” – we do in fact understand quite a lot about how the brain works and we have been able to gain insight into how the minds of animals work by comparing animal neurophysiology with human neurophysiology.
In fact, the mammalian brain (from rats to humans) is surprisingly similar across species: it contains most of the same structures and the same neurotransmitters. Similar experiences cause activation of the same circuits in the brain, implying that we experience stimuli in similar ways. One of the most exciting areas of development has been in the field of Affective Neuroscience – the science of emotions. A pioneer in this field was Jaak Panksepp, whose work helped to map seven common “SYSTEMS” in the mammalian brain involved with the experience of primary emotions.
We now understand that voluntary behaviour is primarily driven by emotion. In short: we do stuff that makes us feel good and we avoid doing stuff that makes us feel bad. Punishment is not a neutral stimulus without any emotional connotation – punishment is AVERSIVE. That is why it works. In the same way reinforcement is not a neutral stimulus with no emotional attachment – reinforcement creates feelings of PLEASURE – that is why it works. You cannot divorce operant conditioning from emotionality.
To explain further: We all learned about homeostasis in school – the brain and body’s constant pull to keep all systems functioning at stable levels. Temperature regulation is a good example of homeostasis: If the brain perceives the body is too cold, you will be prompted to go indoors or put on a sweater. In the same way, if you are too hot, you will be prompted to remove clothing or turn up the aircon.
Emotionality works in a similar way – the brain works to maintain an emotional homeostasis. For example: If your family is away and you are feeling a bit lonely at home in the evening, that loneliness might prompt you to phone a friend or it could prompt you to go on Facebook or watch Netflix. We register discontent on some level and will do something to bring ourselves back to a happier place. On the other hand, we might find ourselves in an uncomfortable social situation where someone is behaving aggressively and causing a scene. We would likely remove ourselves from that situation as quickly as possible, because we feel anxious and slightly fearful. We get home and feel massive relief to be away from something unpleasant.
These situations describe the four directions that basic emotions may move in:
1. Anxiety → fear → terror
2. Relief (from anxiety, fear or terror)
3. Discontent → misery → despair OR frustration → anger → rage
4. Pleasure → elation → ecstasy
These four directions of emotions are each brought on by one of the types of operant conditioning:
• Positive Punishment causes anxiety, fear or terror.
• Negative Reinforcement (which goes hand in hand with positive punishment) leads to feelings of relief.
• Negative Punishment leads to discontent, misery and despair or frustration, anger and rage (depending on personality type and other factors).
• Positive Reinforcement leads to pleasure etc.
Positive Punishment works because the brain will urgently find a way to escape whatever is causing fear or anxiety, and so the behaviour that preceded the unpleasant feelings is less likely to be repeated in the future. That is, if the behaviour is correctly associated with the consequences. More often, the environment or something in the environment (owner, trainer, training field, other dog etc.) is associated with the unpleasant experience through classical conditioning, and that stimulus, rather than the behaviour, is avoided. Escape from and future avoidance of anxiety- or fear-inducing stimuli results in feelings of relief, i.e. Negative Reinforcement.
Positive Reinforcement works because it involves stimuli that create feelings of pleasure. If something feels good, we will repeat the behaviours that brought about those good feelings. Negative punishment involves the removal or denial of an expected reward. When a positive consequence we expected is not forthcoming, we either get frustrated and angry or we become despondent and depressed. (Picture sitting down to watch the latest episode of your favourite series, only to have the power go out at that exact moment!)
Modern, force-free trainers focus on using positive reinforcement: working with dogs in such a way that their behaviour is motivated by attaining feelings of pleasure, through seeking and acquiring food or toys. This type of activity (gaining food or a toy) results in a chemical reward cascade in the brain. Dopamine is an important neurotransmitter involved in reinforcement – besides making us feel good, it also has an effect on neurons that have just fired and essentially makes them “stronger” and more likely to fire again in the same situation. From this you should be able to see that using food in training is not just some bunny-hugging excuse to spoil animals – it has a scientific basis. Positive Reinforcement increases the likelihood of behaviour, because it has a direct effect on the very neurons responsible for carrying out that behaviour.
On the other hand, more traditional training relies on using positive punishment to suppress behaviour by causing anxiety or fear. Positive punishment always goes hand in hand with negative reinforcement, which is the dog learning how to escape or avoid behaviours that might have led to fear-inducing stimuli (including any physically or psychologically painful experience). Negative reinforcement is associated with strong feelings of relief – again dopamine has a primary role in this process, as it not only causes that flood of relief that we feel when we escape danger, but also acts on the neurons that have just fired in order to strengthen them, so that they are more likely to be activated in that same situation in the future.
It is really important to understand that positive punishment itself does not teach any specific behaviour. Positive punishment triggers the survival system (your Freeze, Fiddle About, Fight or Flight system). It causes general inhibition of ALL behaviour and the activation of species-specific survival responses. Punishment and the associated negative emotional state involve stress hormones such as cortisol and adrenaline. When punishment is a daily occurrence, these hormones remain at high levels, creating chronic stress which leads to illness and hampers learning. While negative reinforcement does also involve dopamine and the strengthening of particular neurons and therefore behaviour patterns, negative reinforcement depends on the prior application of punishment and is therefore associated with the same emotional and physiological fallout. Furthermore, negative reinforcement often reinforces a simple survival response (such as freeze or flight) rather than the specific behaviour the trainer was hoping for.
What about negative punishment? Well, every time we deny an expected reward, that is negative punishment. Positive reinforcement-based training therefore invariably involves the deliberate or unconscious use of degrees of negative punishment (unless we reinforce absolutely everything the dog ever does when working with us). Negative punishment that results in only mild frustration can actually motivate a dog to try harder – that is how behavioural shaping works – we start reinforcing for something small and gradually raise our expectations for what we are willing to reinforce until the dog tries harder and offers us something more. So very mild frustration is not a bad thing – it is what motivates all sorts of goal-oriented behaviours like finding food and social contact. However, it is something we need to be careful with: if we deny a reward too often, we will end up with a dog that becomes frustrated to the point of giving up completely (they become demotivated and depressed by training) or angry! We also need to be aware that dogs at different ages, of different types and with varying life experience have different levels of frustration tolerance, and we must adjust our training and expectations accordingly. Generally, we should be aiming to set dogs up to succeed as much as possible, so that we have something to reinforce and the behaviour can strengthen, rather than setting them up to fail.
I hope that it is clear from all of this that you really cannot divorce training or the use of operant conditioning from emotions. Emotions drive behaviour – they are the ultimate reinforcers and punishers. Surely our dogs deserve training methods that make them feel good?