full-disclosure - defeating voice captchas

Message-ID: <43F192BE.7040505@gecadtech.com>
Date: Tue Feb 14 08:25:49 2006
From: stelian.ene at gecadtech.com (Stelian Ene)
Subject: defeating voice captchas

Gadi Evron wrote:

> Therefore, how many times does one have to refresh the page and listen
> to the Captcha to be able to simply learn to identify the Captcha by
> say, an MD5 hash of the audio for each letter?

That is just a bad implementation; done well, audio Captchas are
probably as secure as their visual counterparts.
"Done well" means that, besides the 10 digits (and/or 26 letters)
recorded by the sexy voice and replayed in a random order, the audio is
mixed with multiple sound sources, different for each generated Captcha.
For example, you can use a symphony(*), random white noise, the sound of
the street, or all of these, at a level of 3 or 6 dB above the voice.
The brain can easily distinguish the secret code from all the background
noise, but it's much more difficult for a computer.
While I'm not an audio expert either, I'm sure this problem is a lot
harder than a simple MD5 - just look at how badly state-of-the-art voice
recognition software performs even in almost ideal conditions, i.e. with
no background noise etc.
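
A minimal sketch of the mixing step described above, to show why a
per-letter MD5 lookup stops working once every Captcha gets its own
background. Everything here is an assumption made for illustration: the
440 Hz tone stands in for a recorded digit, white noise is the only
background source, and the names mix_with_noise and md5_of are
hypothetical.

```python
import hashlib
import math
import random
import struct

RATE = 8000  # samples per second (assumed)


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_with_noise(voice, noise_db_above_voice, rng):
    """Add white noise whose RMS sits the given number of dB above the voice RMS."""
    noise = [rng.uniform(-1.0, 1.0) for _ in voice]
    target = rms(voice) * 10 ** (noise_db_above_voice / 20.0)
    gain = target / rms(noise)
    return [v + gain * n for v, n in zip(voice, noise)]


def md5_of(samples):
    """MD5 of the samples after peak-normalising and packing as 16-bit PCM."""
    peak = max(abs(s) for s in samples) or 1.0
    pcm = struct.pack(
        "<%dh" % len(samples),
        *(int(32767 * s / peak) for s in samples)
    )
    return hashlib.md5(pcm).hexdigest()


# Stand-in for one recorded digit: 0.1 s of a 440 Hz tone.
voice = [0.5 * math.sin(2 * math.pi * 440 * t / RATE) for t in range(800)]

rng = random.Random(1)
a = mix_with_noise(voice, 6.0, rng)  # one generated Captcha
b = mix_with_noise(voice, 6.0, rng)  # the same digit, fresh noise

print("clean replay hashes match:", md5_of(voice) == md5_of(list(voice)))
print("mixed replay hashes match:", md5_of(a) == md5_of(b))
```

The clean recording hashes identically on every replay, which is what
makes the lookup attack possible; the two mixed versions of the same
digit do not. This only addresses exact hashing, it says nothing about
how well a real speech recogniser would cope with the noise.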

(*) Of course, it's better to use sound sources that are hard to
identify, and are ideally not available to the attacker; else he could
obtain the same sounds and subtract them from the audio. I think some
random pitch shifting (vibrato) would help against this.
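
The subtraction attack from the footnote, and the effect of wobbling the
pitch of the background, can be sketched as follows. This is a toy model
under stated assumptions: the "symphony" is two sine tones, the voice is
a single 300 Hz tone, the wobble is a fixed 3 Hz sinusoid rather than a
random one, and background and residual_db are hypothetical names. The
residual is measured against the true voice only to illustrate the
effect; a real attacker would not have it.

```python
import math

RATE = 8000   # samples per second (assumed)
N = RATE      # one second of audio


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def background(n, wobble_depth=0.0, wobble_hz=3.0):
    """Two-tone stand-in for music; wobble_depth > 0 modulates its pitch."""
    out = []
    for t in range(n):
        sec = t / RATE
        # Slow phase modulation: the instantaneous pitch drifts up and down.
        shift = wobble_depth * math.sin(2 * math.pi * wobble_hz * sec)
        out.append(
            math.sin(2 * math.pi * 440 * sec + shift)
            + 0.5 * math.sin(2 * math.pi * 660 * sec + 1.5 * shift)
        )
    return out


voice = [0.3 * math.sin(2 * math.pi * 300 * t / RATE) for t in range(N)]
known = background(N)  # the copy the attacker managed to obtain

plain = [v + b for v, b in zip(voice, known)]
warped = [v + b for v, b in zip(voice, background(N, wobble_depth=2.0))]


def residual_db(mixed):
    """Background left after subtracting the known copy, in dB relative to the voice."""
    left = [m - k - v for m, k, v in zip(mixed, known, voice)]
    level = rms(left)
    if level == 0.0:
        return float("-inf")
    return 20 * math.log10(level / rms(voice))


print("unmodified background, residual dB:", residual_db(plain))
print("pitch-wobbled background, residual dB:", residual_db(warped))
```

With the unmodified background, subtraction leaves essentially nothing
but the voice. With the wobbled background, the leftover is louder than
the voice itself, so naive subtraction no longer helps. An attacker who
can estimate the modulation could still undo it, so a generator would
presumably want to randomise the depth and rate for each Captcha.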