Tue 02 Dec 2008

Edited by Paul Hales

Published by Incisive Media Investments Ltd.

Captchas are easily hackable

Defcon 2008: Annoys users, not bots

THERE WERE TWO talks on Captchas at Defcon 16, and they both said the same thing: Captchas are pointless and stupid. This was backed up with a lot of science and code, but the end result is the same: if you have a clue and some time, you can break them on a whim.

The easiest way to crack them is to make a table of what the Captchas are: basically, get one of every possibility and look it up. Impossible? Well, PHP-Nuke uses a six-digit numerical Captcha without any injection of randomness, meaning 10^6 possibilities. Since you can refresh the page and get another one, you just hit refresh and scrape the pics. Make a checksum of each image, and voila, this Captcha is a simple database lookup away.
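
A minimal sketch of that lookup-table idea, assuming the same answer always renders to byte-identical image data (which is what "no injection of randomness" implies). The byte strings stand in for scraped image files, and the function names are made up for illustration; the answers would have to be labelled once, by a person or by OCR.

```python
import hashlib

def checksum(image_bytes: bytes) -> str:
    """Return a hex digest that identifies one exact Captcha image."""
    return hashlib.sha256(image_bytes).hexdigest()

def build_table(labelled_images):
    """Map checksum -> answer for every (image_bytes, answer) pair seen."""
    return {checksum(img): answer for img, answer in labelled_images}

def solve(table, image_bytes):
    """Look up a scraped image; returns None if it is not in the table yet."""
    return table.get(checksum(image_bytes))

# Hypothetical stand-ins for scraped image files and their known answers.
samples = [
    (b"fake-image-bytes-for-123456", "123456"),
    (b"fake-image-bytes-for-654321", "654321"),
]
table = build_table(samples)
```

Once an image has been seen and labelled, every later appearance is a dictionary lookup. Any per-image randomness, even a single changed pixel, would change the checksum and defeat this particular trick.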

Picking on an open source project is easy; banks would never do that, right? I mean, PayPal would never... They use a five-digit alphanumeric Captcha, meaning 36^5 possibilities. You can refresh it too for another try. Damn secure there, guys: a decent botnet can rip this apart in a few hours. Luckily they have all sorts of disclaimers in their EULA, so their problems are likely your problems.
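
For a rough feel of the numbers, here is a back-of-the-envelope sketch. It assumes every refresh returns one of the possible Captchas uniformly at random, which the talks may or may not have assumed, and it uses the standard 1 - e^(-k/n) approximation for the share of n values seen after k refreshes. The function names are illustrative only.

```python
import math

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct Captchas for a given alphabet and length."""
    return alphabet_size ** length

def coverage(n: int, refreshes: int) -> float:
    """Approximate fraction of n equally likely Captchas seen so far."""
    return 1.0 - math.exp(-refreshes / n)

def refreshes_for_coverage(n: int, fraction: float) -> int:
    """Approximate refreshes needed to have seen `fraction` of the keyspace."""
    return math.ceil(-n * math.log(1.0 - fraction))

php_nuke = keyspace(10, 6)   # six digits
paypal = keyspace(36, 5)     # five characters, letters plus digits

half_php_nuke = refreshes_for_coverage(php_nuke, 0.5)
half_paypal = refreshes_for_coverage(paypal, 0.5)
```

Under that assumption a scraper does not need the whole table. Seeing half of the keyspace takes roughly 0.69 refreshes per possible Captcha, and half a table already solves about every second challenge.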

Audio Captchas are also vulnerable, this time to what are called Hamming distances. Using tools made for bioinformatics, you can simply look at the .WAVs and pick them apart. Levenshtein distances, basically fuzzy Hamming, make things a lot more accurate.
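
The two distance measures can be sketched as follows. This is an illustration of the metrics only, not the tooling from the talks: the short strings stand in for audio clips that have already been reduced to symbol sequences, and the `closest` helper and its templates are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def levenshtein(a: str, b: str) -> int:
    """Edit distance: fewest insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute x with y
        prev = curr
    return prev[-1]

def closest(sample: str, templates: dict) -> str:
    """Return the label of the template nearest to the sample."""
    return min(templates, key=lambda label: levenshtein(sample, templates[label]))

# Hypothetical symbol sequences for two spoken digits.
templates = {"7": "aabbccdd", "3": "ddccbbaa"}
guess = closest("aabccdd", templates)   # one symbol dropped from "7"
```

Hamming distance cannot compare the seven-symbol sample with the eight-symbol templates at all, whereas edit distance copes with the dropped symbol. That tolerance is the "fuzzy" part.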

But those are the obvious ways. What if the Captcha makers have a clue, not that any seem to at the moment? Some use a set URL for a set Captcha, so you can just look at the return URL in the code to see the result. Duh.

Others have miserable random number generators. With the appropriate mathematical knowledge, you can reconstruct a five-digit Rand()-based Captcha in 12-13 samples, and it is all downhill from there. Random() is a little better, but not enough to make things hard for hackers.
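
To show why a weak generator is fatal, here is a toy sketch. It does not reproduce the 12-13 sample attack from the talks. It assumes a deliberately tiny 16-bit linear congruential generator whose constants are known to the attacker, and simply enumerates every seed that is consistent with the Captchas seen so far. All names and constants are made up for illustration.

```python
MODULUS = 2 ** 16        # toy state size, small enough to enumerate
A, C = 25173, 13849      # arbitrary toy constants, not from any real library

def step(state: int) -> int:
    """Advance the toy linear congruential generator by one step."""
    return (A * state + C) % MODULUS

def captchas(seed: int, count: int):
    """Generate `count` five-digit Captcha strings from the toy generator."""
    state, out = seed, []
    for _ in range(count):
        digits = []
        for _ in range(5):
            state = step(state)
            digits.append(str((state >> 8) % 10))  # only part of the state leaks
        out.append("".join(digits))
    return out

def consistent_seeds(observed):
    """Return every seed whose output starts with the observed Captchas."""
    return [s for s in range(MODULUS) if captchas(s, len(observed)) == observed]

def predict_next(observed):
    """Set of next Captchas predicted by the seeds still in the running."""
    return {captchas(s, len(observed) + 1)[-1] for s in consistent_seeds(observed)}

secret_seed = 12345
seen = captchas(secret_seed, 3)       # what the attacker scrapes
candidates = predict_next(seen)       # what the attacker expects next
```

Each extra sample can only shrink the list of consistent seeds, and the true next Captcha is always among the predictions. A real generator has a much larger state, which is where the mathematics mentioned above comes in.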

The next way to do things is to use off-the-shelf OCR software, and that makes life very easy. The hard part is cleaning up the image and training the OCR, something that can be done quite easily with simple Gimp filters. If the Captcha puts in thin lines or dots, a blur filter removes them easily. Edge finding also does wonders, as do contrast controls. Grids fall prey to algorithmic assault, and in general, a little thinking will break most of them.
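
The blur trick can be sketched without any imaging library. This is a hand-rolled illustration, not what Gimp does internally: a 3x3 box blur over a black-and-white image followed by a threshold, so that a pixel stays ink only if most of its neighbourhood is ink. The tiny test image and the function name are invented for the example.

```python
def blur_and_threshold(image, threshold=5):
    """3x3 box blur, then re-binarise: a pixel is ink only if at least
    `threshold` of the 9 pixels around it (itself included) are ink."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx]
            out[y][x] = 1 if total >= threshold else 0
    return out

noisy = [
    [0, 0, 0, 0, 0, 0, 0, 1],   # isolated noise dot, top right
    [0, 1, 1, 1, 1, 0, 0, 0],   # rows 1-4: a thick blob standing in for a glyph
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],   # one-pixel-wide noise line
    [0, 0, 0, 0, 0, 0, 0, 0],
]
clean = blur_and_threshold(noisy)
```

The dot and the thin line vanish because neither has enough ink around it, while the body of the thick blob survives. The blob does lose its corners, which is the price of the filter and usually something OCR can live with.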

The last bastion is the so-called cultural-knowledge-based Captcha, basically a 'hot or not' as a security scheme. If you are presented with four pictures, three kittens and one puppy, being able to pick out the puppy is hard to do via image recognition. Brute forcing is easy here, though: it is very hard to have an image library big enough to keep a botnet from downloading them all. Any image library you can buy, so can the bad guys.

In the end, Captchas are easy to break. It is just a matter of how hard it is to do, and percentages are everything. If you have a 1 in 10^5 chance of guessing the result, even a botnet isn't going to have much success pounding away. Any technique that ups that to only a 20 per cent success rate will sound stupid to a person, but 10K machines will eat that for lunch. Game over for Captchas. µ
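
The percentages are simple to work through. The sketch below uses the figures quoted above (10K machines, a 1 in 10^5 guess and a 20 per cent solver) and assumes, purely for illustration, that every machine makes one independent attempt per round.

```python
def expected_successes(success_rate: float, machines: int, attempts_each: int = 1) -> float:
    """Mean number of solved Captchas across the whole botnet."""
    return success_rate * machines * attempts_each

def attempts_for_one_success(success_rate: float) -> float:
    """Mean number of independent tries before the first success."""
    return 1.0 / success_rate

MACHINES = 10_000

blind_guessing = expected_successes(1e-5, MACHINES)   # 1 in 10^5 chance
weak_solver = expected_successes(0.20, MACHINES)      # 20 per cent success rate
```

Blind guessing gets the whole botnet about a tenth of a solved Captcha per round, while the 20 per cent solver gets it about two thousand. A single machine with that solver needs five tries on average, which no rate limit aimed at humans is likely to notice.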

Comments

Easy to prevent?

Simple: you limit the number of refreshes of a captcha per IP, surely?

So they have a botnet and you have limited refreshes of the captcha per IP to, say, 10? You can quickly get a map of the botnet's addresses because you'll have lots of captcha requests that fail 10 times.

With such a list you could then block parts of a botnet by IP, releasing the block every 7 days just in case the zombies release their IP to get another one (something that requires losing contact with the zombie until it reconnects to the internet). Some zombies will be connected to cable infrastructures, which require a lot more work to release the IP, meaning that the zombie PC will be blocked, as in the example, for 7 days.
posted by : Two00lbwaster, 10 August 2008
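
The scheme in the comment above can be sketched roughly as follows, using the commenter's figures of 10 failures and a 7-day block. The class and method names are hypothetical, the state lives in memory only, and the example address comes from a range reserved for documentation.

```python
import time

MAX_FAILURES = 10                 # the commenter's example cap
BLOCK_SECONDS = 7 * 24 * 3600     # the commenter's 7-day block

class RefreshLimiter:
    def __init__(self, clock=time.time):
        self.clock = clock
        self.failures = {}        # ip -> count of failed Captcha attempts
        self.blocked_until = {}   # ip -> timestamp when the block expires

    def is_blocked(self, ip: str) -> bool:
        expires = self.blocked_until.get(ip)
        if expires is None:
            return False
        if self.clock() >= expires:
            # The block has run out: forget the address entirely.
            del self.blocked_until[ip]
            self.failures.pop(ip, None)
            return False
        return True

    def record_failure(self, ip: str) -> None:
        self.failures[ip] = self.failures.get(ip, 0) + 1
        if self.failures[ip] >= MAX_FAILURES:
            self.blocked_until[ip] = self.clock() + BLOCK_SECONDS

    def suspected_bots(self):
        """Addresses currently on the block list: the 'map' of the botnet."""
        return sorted(self.blocked_until)

# A fake clock makes the 7-day expiry easy to demonstrate.
now = [0.0]
limiter = RefreshLimiter(clock=lambda: now[0])
for _ in range(MAX_FAILURES):
    limiter.record_failure("203.0.113.5")
```

Passing the clock in as a parameter is only there so the expiry can be exercised without waiting a week.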

Fast question

So why is your captcha still here?
posted by : Zodiac, 10 August 2008

Are you a programmer?

Man, you are just talking rubbish; you are just an average journalist. Don't make yourself look even more stupid than you already are.
posted by : Marco, 10 August 2008

What about ReCaptcha?

Surely with ReCaptcha then the easily OCR'd images would by definition not exist unless the baddies are using fancier algorithms than the people digitising the books (which is entirely feasible).

Hmm, TheINQ uses a Captcha for this comments page.
posted by : Duncan, 10 August 2008

Captchad

The comments are easily hackable.
posted by : Tweeker, 10 August 2008

Not even trying, are we.

yliogjcally not. Why would anyone bother?
posted by : Tweeker, 11 August 2008

O RLY?

Irony...

http://img239.imageshack.us/my.php?image=easilyhackablecf7.jpg
posted by : Kushan, 11 August 2008

And yet...

I still had to use one to comment on this article, which has turned this comment into a kind of self-fulfilling thingamabob.
posted by : Lindsay, 11 August 2008

So why do you still use them?

Go on. Get rid of the captcha on the comments if they don't work :)
posted by : Dick, 11 August 2008

Captchas are yesterday's

Enter the Voight-Kampff test.
(http://en.wikipedia.org/wiki/Voight-Kampff_machine)

(lol, a captcha is needed to submit a comment)
posted by : Boomerang, 11 August 2008

Old News

We figured out how easily CAPTCHA was broken last year for two reasons.

1). Anyone who's ever used Yahoo chat knows their CAPTCHA system, designed to keep out porn bots, worked for maybe a few weeks.

2). Our website commenting system, utilizing what was considered a secure CAPTCHA system, became inundated with viagra bots until we implemented additional measures (Hashcash).

Here at the INQ, your comment system's CAPTCHA is as weak as everyone else's. I hope you're using Hashcash or some other algorithmically taxing safety net. Pretty simple to implement too. www.hashcash.org

Scott
posted by : Scott Piercy, 11 August 2008
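
For readers who have not met the idea, here is a proof-of-work sketch in the spirit of Hashcash. It is a simplification, not the real Hashcash stamp format: the challenge string, the nonce encoding and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Cheap server-side check: one hash, however hard the minting was."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty

def mint(challenge: str, difficulty: int) -> int:
    """Costly client-side search for a nonce that passes verify()."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce

# Hypothetical per-form challenge; 12 bits keeps the demo quick.
nonce = mint("comment-form-42", 12)
```

Each extra bit of difficulty roughly doubles the work for whoever submits the form, while the check stays at a single hash. That asymmetry is what makes bulk posting expensive without asking a human to squint at anything.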

So anyway...

That captcha is useless has been known for a while. What's needed is a semantic replacement for it.

So, you face a form, with the question "which of the following is not a part of a healthy diet

Orange,
Apple,
Poop"

with a request to enter the correct answer in a text box. Vary the question, make it more complex even, the point is to make the test semantic. Computers are shite at semantics. You can't brute-force inherent meaning.
posted by : Graham Dawson, 11 August 2008

It's in the Database, Jim

re: semantics comment.

Whilst you'd like your question to be unanswerable by a computer, the second you re-use any question, it's simply a case of storing questions and answers in a database.

You would have to permanently assign a member of staff to manually generate new unique questions.

You couldn't let your users generate the questions either, since you cannot assume zero malicious users.
posted by : somesoandso, 12 August 2008

Stupid!

Oh yes? Is it as easy as you say?
So why are Yahoo or Google still alive?
posted by : MAB, 27 August 2008

Captcha is a joke

If a human can read, a program can read too.
posted by : none, 11 September 2008