Among the varied and interesting topics in digital fraud detection and identity proofing that I discuss with my clients is bot detection and mitigation. And on this topic, I can usually guarantee a lively debate once we get to CAPTCHAs. Clients tend to divide quite cleanly into two camps: those for and those against using CAPTCHAs. Those in the latter camp tend to have a rather visceral reaction to the notion of using a CAPTCHA and are militantly against it.
I find the topic of CAPTCHAs supremely fascinating. Let’s take a quick detour into the stages of their evolution (if you don’t want the history lesson, just skip to the next section).
The evolution of CAPTCHA
- Stage 1. CAPTCHAs first started to be used in the late 1990s, most prominently by the search engine AltaVista, which had a huge spam problem and introduced the concept of the warped letters.
- Stage 2. After this, reCAPTCHA was created, which involved a pair of words – and was the start of using this process to solve AI problems, in this case book scanning. The first word was known to the system, the second word was unidentified to the system and was part of a book scan. Once a number of people had typed in the same answer for the second word, it would be resolved for the book scan. Google acquired reCAPTCHA.
- Stage 3. reCAPTCHA v2 then came along, which involves ticking a box to say you’re not a robot, Google doing checks on your user behaviour, and then a potential secondary check: the now infamous selection of images. This is another good example of CAPTCHAs being used to solve machine vision problems, in this case in the traffic environment, one would assume for self-driving cars. Given the academic and commercial focus on this particular machine vision problem, bad actors have been able to leverage lots of tools to develop systems to defeat these challenges. In fact, given that Google Cloud sells machine-learning systems, it’s entirely likely that some of Google’s servers are creating CAPTCHAs while others are breaking them.
- Stage 4. Google has rolled out reCAPTCHA v3, which drops even the checkbox: it runs in the background and returns a risk score, and it appears to use the same kind of user behaviour analysis that most bot detection vendors do.
- Stage 5. But Google is not the only player in town. CAPTCHAs continue to evolve, with many vendors investing heavily in their own version. Some claim to have exceptionally high solve rates for humans and low solve rates for bots. The approach that some have taken is to focus on using machine vision challenges that are easy for humans to solve and also have no commercial applications, so that unlike the self-driving car applications, bad actors can’t leverage academic, commercial and open source research to build tools to solve these CAPTCHAs.
The case for and against
So back to the point in hand: why do some businesses like to use CAPTCHAs? Well, the logic is straightforward. If your bot detection solution believes the user is a bot, then you can block that user. But no solution is perfect. What if the user is not a bot? Then you’ve just blocked a good user. And so the CAPTCHA represents an opportunity, it represents hope, that if this is actually a good user then they can prove it by solving the CAPTCHA and continue on their way.
So why do some businesses hate CAPTCHAs? Well, the solve rate for (good) humans on CAPTCHAs can sometimes be quite low, resulting in good users not being able to proceed. Plus, for those good users who do solve the CAPTCHA, it’s often an annoying addition to the UX. And the solve rate for bad actors on CAPTCHAs can sometimes be quite high, although this is a nuanced point. In some cases, yes, bots have perhaps been trained to solve the CAPTCHAs, but in other cases the bots hand the session over to a human to solve the CAPTCHA. There is a thriving industry of human-powered CAPTCHA solving services, typically people in low-income environments being paid a pittance per CAPTCHA and solving thousands of them daily.
So should you use a CAPTCHA or not?
I think so. Just don’t use the ‘pick a street sign from this matrix of images’ Google version of a CAPTCHA. There are many far more evolved CAPTCHAs available today from the likes of Arkose Labs, GeeTest or PerimeterX that have their own approaches and nuances but consistently do a better job than the dreaded matrix of traffic images. Given the pressure to reduce false positives on most digital commerce businesses, giving users a chance to prove that they’re human and not just blocking them is worth exploring. In a world where bad actors use humans to augment bots, though, you need a CAPTCHA that’s smart enough to detect when the humans solving it are just a little too fast – perhaps a sign that they spend their day solving these things over and over. When that’s detected, the CAPTCHA should dynamically be made harder to discourage such activity and render it economically nonviable. Using a CAPTCHA also forces interaction with the user in a controlled manner, allowing more telemetry to be obtained about the user and their level of humanity.
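To make the "too fast" idea concrete, here is a minimal sketch in Python of how a challenge could step up its difficulty when recent solve times look suspiciously quick. Everything in it is an assumption for illustration: the class name, the 2-second threshold, the window of five solves and the five difficulty levels are hypothetical, not how any particular vendor does it, and in practice you would keep one tracker per device or solver fingerprint.

```python
from collections import deque
from statistics import median


class AdaptiveDifficulty:
    """Hypothetical tracker: raise the challenge level when solves are too fast."""

    def __init__(self, fast_threshold_s=2.0, window=5, max_level=5):
        # All three defaults are illustrative assumptions, not vendor values.
        self.fast_threshold_s = fast_threshold_s
        self.max_level = max_level
        self.recent = deque(maxlen=window)  # most recent solve times, in seconds
        self.level = 1                      # 1 = easiest challenge

    def record_solve(self, solve_time_s: float) -> int:
        """Record one solve time and return the level for the next challenge."""
        self.recent.append(solve_time_s)
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and median(self.recent) < self.fast_threshold_s:
            # Consistently fast solves: make the next challenge harder,
            # then start a fresh window at the new level.
            self.level = min(self.level + 1, self.max_level)
            self.recent.clear()
        return self.level


tracker = AdaptiveDifficulty(window=3)
for seconds in (1.1, 0.9, 1.3):
    next_level = tracker.record_solve(seconds)
```

Using the median over a short window rather than a single solve time means one lucky quick answer from a genuine user does not trigger the escalation.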
Just don’t use CAPTCHA as a default for all sessions. Rely on your bot detection vendor to spot and block most of the bot traffic, and just deploy CAPTCHAs in those genuine grey-area cases where you’re not sure of humanity – I’d expect that to be less than 5% of all sessions for sure.
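As a rough illustration of that grey-area approach, the sketch below maps a bot-likelihood score to one of three actions. It assumes your bot detection vendor gives you a score between 0 and 1 where higher means more bot-like; the two thresholds and the function names are hypothetical and would need tuning against your own traffic.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    CHALLENGE = "challenge"  # show a CAPTCHA
    BLOCK = "block"


# Hypothetical thresholds; tune them against your own traffic.
ALLOW_BELOW = 0.30
BLOCK_ABOVE = 0.90


def triage(bot_score: float) -> Action:
    """Map a bot-likelihood score in [0, 1] (higher = more bot-like) to an action."""
    if not 0.0 <= bot_score <= 1.0:
        raise ValueError("bot_score must be between 0 and 1")
    if bot_score < ALLOW_BELOW:
        return Action.ALLOW
    if bot_score > BLOCK_ABOVE:
        return Action.BLOCK
    # Everything in between is the grey area that gets a CAPTCHA.
    return Action.CHALLENGE


def challenge_rate(scores: list[float]) -> float:
    """Fraction of sessions that would be shown a CAPTCHA."""
    if not scores:
        return 0.0
    challenged = sum(1 for s in scores if triage(s) is Action.CHALLENGE)
    return challenged / len(scores)
```

Running `challenge_rate` over a sample of real session scores is a quick way to check whether your thresholds keep the challenged share down at the low single-digit level.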
And perform A/B testing: split your traffic, use a CAPTCHA on one segment only, and compare the results. Make decisions based on actual data and metrics, not preconceptions based on outdated approaches such as the traffic image matrix.
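One simple way to compare the two segments is a two-proportion z-test on a metric such as completed checkouts. The sketch below is only that, a sketch: the choice of test, the function name and the session counts in the usage lines are my assumptions for illustration, not results from any real deployment.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both segments need at least one session")
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis that both segments convert equally.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts: segment A has no CAPTCHA, segment B shows one.
z, p = two_proportion_z_test(success_a=900, total_a=1000,
                             success_b=850, total_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Conversion is only half the picture, so you would run the same comparison on your fraud or abuse rate before deciding which segment actually wins.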
I’m always interested to hear about experiences of implementing different CAPTCHAs – reach out and let me know what you experienced!
As an interesting final aside, Amazon filed a patent in 2017 for a new type of CAPTCHA that is easy for machines to solve but presents a visual challenge that humans would typically get wrong. The process is thus subverted: getting the answer wrong is what marks you as human. Human fallibility may in fact be the future when it comes to defeating bots.
The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.
6 Comments
Can’t agree more.
Thank you for sharing. The article provides excellent background information on the evolution of CAPTCHA, and offers an incisive analysis of bot mitigation. Well, bot mitigation is far more than just CAPTCHA service. An all-around bot detection & fraud prevention solution is what businesses need today.
Hi Akif, thanks for such a thoughtful article. I think understanding the history is important to recognize how we got to the place we are with CAPTCHA and bot mitigation strategies.
I also think it’s important to state, however, that for motivated attackers, CAPTCHAs represent a speed bump at most. The people who stand to do the most damage to organizations are also the ones who skip right by these kinds of controls. I’ve even worked as a CAPTCHA solver myself to test the capabilities of the human solver systems, and it’s a strange world. The attacker community has created a robust, distributed system of human labor that takes the Turing test part right out of the picture. This kind of arms race, in which the security industry is constantly trying to adapt to the latest attacker tactics, sometimes leads to dead ends. When I read about a system that is designed to intentionally make people fail at a task, just to prove they’re a human, it makes me think we’ve hit a dead end with CAPTCHAs. There are other ways to mitigate bots that don’t shift the burden to end users—I think that, as an industry, we need to start looking harder at those.
Super
God is great
In order to understand whether it’s needed or not, it’s better to check the history of captcha https://utopia.fans/security/the-history-of-captcha-how-it-appeared-and-changed/ and why it was created in the first place. But from my own experience, I can say that it gets really annoying sometimes. Hidden captcha is much better, as it appears quite rarely.