Join me at the intersection of computer science, economics, and psychology to consider the multilayered concept of regret. It’s a term that highlights the murky boundary between logic and emotion, drawing the interest of marketers and strategists of all stripes who deal in the art and science of persuasion. Believe me, you won’t want to skip this post!

In the world of AI, regret appears under the alias “counterfactual regret minimization” (CFR), which describes a class of machine learning algorithms that have earned fame for solving games with incomplete information, like poker. Cepheus, a poker-playing bot from the University of Alberta that wins at heads-up limit Texas hold ’em, demonstrates the power of CFR, which tracks regret for past choices as new information becomes available and uses it to optimize strategy. (I would note that “counterfactual regret minimization” seems redundant: counterfactual thinking is the logical essence of regret, and minimizing it is the obvious reason we give it a name. But computer people will have their TLAs, and we should be grateful they bent to avoid confusion with “CRM.”)

Cepheus’ “Preflop” (first round) bet strategy. Image source: http://poker.srv.ualberta.ca/; Used with permission.

Right off the bat, Cepheus and CFR show us that regret is not just an idle emotional affliction: it plays a key role in learning, in both humans and machines. While all machine learning programs use feedback to learn from mistakes, CFR is distinctive in its focus on applying new information to attach a regret value to each decision made in earlier training rounds. This gives it an edge in situations where information is hidden or wrong. Presumably, a similar selective advantage is behind the development of regret in humans.
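For readers who want to see what “attaching a regret value to each decision” looks like in practice, here is a minimal sketch of regret matching, the update rule that CFR builds on. It is not Cepheus and not full CFR: it plays rock-paper-scissors against a fixed opponent, and the opponent's mix, the function names, and the iteration count are all my own assumptions for illustration.

```python
ACTIONS = ["rock", "paper", "scissors"]

# PAYOFF[a][b] is our payoff for playing action a against opponent action b.
PAYOFF = [
    [0, -1, 1],   # rock:     ties rock, loses to paper, beats scissors
    [1, 0, -1],   # paper:    beats rock, ties paper, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper, ties scissors
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        # No action is regretted yet, so mix uniformly.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def train(opponent, iterations=1000):
    """Return the average strategy after repeated regret-matching updates.

    `opponent` is a fixed probability mix over ACTIONS (an assumption made
    here so that the example is deterministic).
    """
    n = len(ACTIONS)
    regrets = [0.0] * n
    strategy_sum = [0.0] * n
    for _ in range(iterations):
        strategy = strategy_from_regrets(regrets)
        # What each action would have earned on average against this opponent.
        action_values = [
            sum(PAYOFF[a][b] * opponent[b] for b in range(n)) for a in range(n)
        ]
        # What the current mixed strategy actually earns on average.
        expected = sum(strategy[a] * action_values[a] for a in range(n))
        for a in range(n):
            # Regret: how much better action a would have done than what we did.
            regrets[a] += action_values[a] - expected
            strategy_sum[a] += strategy[a]
    total = sum(strategy_sum)
    return [s / total for s in strategy_sum]


if __name__ == "__main__":
    # Hypothetical opponent who over-plays rock.
    average = train([0.5, 0.3, 0.2])
    for name, prob in zip(ACTIONS, average):
        print(f"{name}: {prob:.3f}")
```

Against an opponent who leans on rock, the accumulated regret piles up on paper, and the average strategy drifts almost entirely toward it. Real CFR applies the same bookkeeping at every decision point of a game tree and weights it by how likely the game was to reach that point, which is where the “counterfactual” comes in.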

Note that bluffing, which at first glance seems like a pure psychological element of gamesmanship, is in fact a mathematical element of game theory that can be optimized just like any other variable. This raises the issue of hidden objectives. Game theory suggests that at least the rules must be inviolable. But what if they’re not? Another lesson from Cepheus and CFR is that, in the real world – like, say, Las Vegas – the object of poker-playing machines is not to win the greatest number of games, but to collect the greatest amount of money. This often means throwing games in order to manipulate one’s opponent into regretting they hadn’t bet more. Gregg Giuffria, owner of G2 Game Design, which develops and licenses slot machines to casinos, told Maxim magazine about Cepheus, “Perfect is fine, but there’s no commercial viability for perfect. No one wants to play perfect.” So he designs machines to play a moody game that hooks players and maximizes revenue, not just by using regret to perfect their own play, but by exploiting it in less dispassionate opponents.

(Coincidentally, Giuffria is also a musician and former keyboard player and vocalist with the glam band Angel. His composition Heartache features some heavy regret-laden lyrics:

Was too short and moved too fast, so far ahead I finished last
Now I long for the aging past to bring you home
Every breath, another song, just regrets I did you wrong
A fool’s apology and now you’re gone, and I’m on my own)

Game theory teaches us that any competitive situation that can be modeled with rules and states has one or more Nash Equilibria: optimal strategies that assume all players make the best possible moves based on the information available to them, including an assessment of competitors’ perfect strategies. But human psychology introduces a wildcard into the equation. For an AI, regret is just a number. For people, regret – and the anticipation of regret – can cause behavior to deviate from what an algorithm might predict are a player’s best choices.

This brings us to the realm of behavioral economics, which seeks to understand how innate biases in human decision-making can be incorporated into economic models. Caspar Chorus, a professor at Delft University of Technology in the Netherlands, has explored regret using a model called “random regret minimization” (RRM – yes, behaviorists also like TLAs). RRM is offered as an alternative to “random utility maximization” (RUM), which models the default assumption that consumers make choices by trying to maximize some sort of payoff (the “rational agent” hypothesis). Here’s an in-depth comparison of RUM and RRM. The idea behind RRM is that the desire to avoid negative emotions like regret plays a more dominant role in people’s choices than the desire to maximize some form of utility. This distinction may seem trivial – and studies suggest it often is – but the key insight is that there’s in fact a measurable difference between decisions based on the utility value of a choice (RUM) and decisions based on its anticipated relative value within a limited choice-set (RRM).

The difference can be summarized by what psychologists call “the compromise effect.” The compromise effect holds that, given a set of options ranging in value from low to high across a range of attributes, people tend to choose the middle-of-the-road options, even if those options don’t always maximize value. In fact, the greater the differences in value, the more likely a regret-based model will choose a middling option over the “best” option. This is because regret (as defined in these models) is minimized by the choice that’s closest to all the other choices, since it has the lowest cost of being wrong. The compromise effect is related to “negativity bias” and “loss aversion,” twin psychological observations that people are more strongly affected by bad experiences and emotions than good ones of a similar intensity. Evidence shows these effects are particularly strong when regret takes the form of negative social feedback. Given a limited number of choices, minimizing social regret might motivate people to choose the option least associated with negativity among their peer group, overriding other aspects of judgment.
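To make the compromise effect concrete, here is a small sketch comparing a plain linear utility score with the deterministic part of the regret function as I understand it from Chorus's RRM formulation: each option's regret is the sum, over every rival and every attribute, of ln(1 + exp(weight × (rival's value − own value))). The three options, their attribute values, and the weights are invented for illustration, and the random error terms that put the “random” in both models are left out.

```python
import math


def linear_utilities(options, weights):
    """RUM-style score: weighted sum of each option's own attributes."""
    return {
        name: sum(w * x for w, x in zip(weights, attrs))
        for name, attrs in options.items()
    }


def random_regrets(options, weights):
    """RRM-style score: regret accumulated against every rival option.

    For each rival and each attribute, the term ln(1 + exp(w * (rival - own)))
    is large when the rival is better on that attribute and near zero when
    the rival is worse.
    """
    regrets = {}
    for name, attrs in options.items():
        total = 0.0
        for rival, rival_attrs in options.items():
            if rival == name:
                continue
            for w, own, other in zip(weights, attrs, rival_attrs):
                total += math.log1p(math.exp(w * (other - own)))
        regrets[name] = total
    return regrets


if __name__ == "__main__":
    # Hypothetical products scored on two attributes, e.g. (quality, affordability).
    options = {"A": (2.0, 8.0), "B": (5.0, 5.0), "C": (8.0, 2.0)}
    weights = (1.0, 1.0)

    print("utility:", linear_utilities(options, weights))
    regrets = random_regrets(options, weights)
    print("regret: ", {k: round(v, 3) for k, v in regrets.items()})
    print("least regret:", min(regrets, key=regrets.get))
```

With these made-up numbers, the utility score cannot separate the three options, since each one adds up to the same total. The regret score can: the middle option B is never far behind a rival on any single attribute, so it carries the least regret, while the two extreme options are each penalized heavily for the attribute on which they lose badly.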

This has some clear implications for marketers as they consider how to position their products relative to competitors. But notice the disconnect between the concept of regret we see in CFR and its interpretation in RRM. CFR treats regret as a form of accumulated knowledge, that is, experienced regret (even if the subject is a bot), while RRM considers only the effects of anticipated regret on decision-making. Such a treatment is certainly a reflection of the limitations of economic sampling: it’s not really feasible to capture or model the history of comparable decisions a consumer might have made leading up to a real-world choice. But doesn’t dealing with broad averages obscure important differences in the way individuals in various situations incorporate regret in their decision-making? As we migrate from mass-market thinking to a world of scalable personal communication, focusing on segmented patterns trumps the all-encompassing approach.

Which brings us to psychology, which has produced mountains of literature on the topic of regret. To pick a thread, we can relate regret to another concept known as “locus of control,” conceived by Julian Rotter, an American psychologist known as the father of social learning theory. Locus of control theory considers differences in how strongly people believe they’re in control of their situation, as opposed to attributing outcomes to factors beyond their control, such as luck or other people’s decisions. Locus of control studies (this one has a good bibliography) suggest that individuals with more internal attribution have overall higher satisfaction and lower anxiety levels than externals. But this highlights a key difference in the way psychology and computer science treat regret. Focusing on regret’s emotional connotations, many psychologists conclude internals experience less regret, but a game theorist might reach the opposite conclusion: that the internals’ use of something like counterfactual regret minimization to modify their own behavior results in more successful adaptation. This seems to highlight a distinction between productive and unproductive notions of regret. But might internals also get it wrong – overestimate their degree of control in certain situations and fail to fold a losing hand? In incomplete information games, the party who knows the limits and levers of control wins.

Anticipating regret. (Image licensed from Adobe Stock)

So where does this leave us?

First, we see that regret is an overloaded concept: its rational and emotional connotations can be in conflict. While regret is defined in negative emotional terms, its role in decision theory offers logical benefits.

As a rational matter, we can use regret in both machine and human learning situations where incomplete or unreliable information is the norm. By actively considering how past decisions might have changed had current knowledge been available (counterfactual reasoning), we can develop more effective ways to utilize the information we have, and to anticipate how others might react to the information we release.

We can also use the concept of regret to refine our understanding of motives that predict behavior. Minimizing regret is more than a means to an end: it’s the goal of many decisions, particularly where outcomes are volatile. We can use this to frame decisions in terms of alternatives rather than absolute merits, and recognize the latent tendency toward compromise. This also sheds some light on why people often surprise us when they say one thing and do another.

And as an emotional matter, we can recognize that people vary not so much in their capacity to experience regret, but in the way they respond to it. Individuals for whom regret engenders a feeling of helplessness are unlikely to modify their behavior…and you can recognize them as they repeat losing patterns. Individuals for whom regret inspires confidence that they’ve learned a valuable lesson may overcompensate as they adjust their strategy mix…and you can recognize them too. Tailoring communication based on inferred orientation toward regret may be an effective persuasion strategy.

Finally, regret and counterfactual thinking often seem to stand in opposition to the goal of living “in the moment.” Yet, real-time decisions depend crucially on one’s ability to distinguish between actions that affect outcomes and ones that don’t – things you can change and things you can’t. Counterfactual analysis is the way to develop the wisdom to know the difference. Serenity and courage are not included.

So, that’s our tour. Was it worth it?