Date: Tue, 19 Jun 2007 10:30:30 -0400 (EDT)
From: "Steven Adair" <steven@...urityzone.org>
To: "J. Oquendo" <sil@...iltrated.net>
Cc: full-disclosure@...ts.grok.org.uk
Subject: Re: Squashing supposed hacker profiling

Amazing, you were able to find multiple instances where a script-based
gender guesser was wrong?  This is more profound than the initial research
itself.  I suppose I could post a series of 10 writings where it was
correct, but what would that prove?  Did you try reading this from the
same page:

-----

A few quick notes:

    * The system generates a simple estimate (profiling). While Gender
Guesser may be 60% - 70% accurate, it is not 100% accurate. This is
better than random guessing (50%), but should not be interpreted as
"fact". In particular, men should not be offended if it says you write
like a girl.

    * People write differently in different forums. For example, a single
writing sample may appear MALE for informal writing but test as FEMALE
for formal writing. Be sure to interpret the results based on the
appropriate writing style. (These notes, for example, are more
informal/blog than formal/non-fiction.)

    * Many factors can impact the interpretation from any single person's
writing. The content, knowledge of the material, age of the author,
nationality, experience, occupation, and education level can all
impact writing styles. For example, a woman who has spent 20 years
working in a male-dominated field may write like her co-workers.
Similarly, professional female writers (and experienced hobbyists)
frequently use male writing styles. Gender Guesser does not take any
of these factors into account.

    * Email can blur the lines between formal and informal writing styles.
An informal email from a manager may have traces of formality, and a
formal email from a 12-year-old is likely to be informal compared to a
letter from a 40-year-old. Do not be surprised if email messages sent
to public forums test incorrectly -- when writing for an audience,
people commonly use informal words, phrases, and slang within a formal
writing style.

    * Quotations, block quotes, and included text usually carry the
gender from the initial author. Be sure to remove quoted text from any
pasted content. Also, significant changes from a copy-editor can
result in a different gender analysis. (A male editor may make a
female author's news article appear MALE or as a Weak MALE.)

    * Lyrics, lists, poems, and prose are special writing styles. This
tool is unlikely to classify these texts correctly.

    * The system needs a paragraph or two of text in order to observe word
repetition. A good sample should have 300 words or more. Fewer words
can lead to more variation in accuracy, and a single sentence is
unlikely to generate an accurate result. Pasting the same text
multiple times will not change the results!

    * People tend to write with consistent styles. If the system
misclassifies a particular author, then other writings by the same
author will likely be misclassified the same way.

    * And most importantly: This is an ESTIMATE. Please do not email me
about instances where it made the wrong determination. (I've seen it
generate incorrect results lots of times already.)

----

I can't tell if you're trolling or you have actually taken the bait.  You
do realize the person that you were responding to in earlier posts is not
actually Neal Krawetz, right?


> All female authors...  Your so-called gender guessing mechanism is
> flawed either way you want to cut it. You could try fuzzy math based on
> theories to profile anyone on this list, but unless you have something
> feasible and PROVEN beyond reasonable doubt, it's all a guessing game,
> bottom line. Anyhow, back to security; sociolinguistics is not meant
> for this list.
>
> According to Dr. Krawetz's Gender Guesser...
> (http://www.hackerfactor.com/GenderGuesser.html#Analyze)
> http://girlygeekdom.blogspot.com/
> Genre: Informal
>   Female = 104
>   Male   = 602
>   Difference = 498; 85.26%
>   Verdict: MALE
> Genre: Formal
>   Female = 116
>   Male   = 239
>   Difference = 123; 67.32%
>   Verdict: MALE
>
> REALITY: WRONG
>
> http://www.darkreading.com/blog.asp?blog_sectionid=342&WT.svl=blogger1_5
> Genre: Informal
>   Female = 442
>   Male   = 555
>   Difference = 113; 55.66%
>   Verdict: Weak MALE
> Genre: Formal
>   Female = 364
>   Male   = 570
>   Difference = 206; 61.02%
>   Verdict: MALE
>
> REALITY: WRONG
>
> http://invisiblethings.org/papers/joanna-talk_description-CCC04.txt
> Genre: Informal
>   Female = 218
>   Male   = 1186
>   Difference = 968; 84.47%
>   Verdict: MALE
> Genre: Formal
>   Female = 414
>   Male   = 576
>   Difference = 162; 58.18%
>   Verdict: Weak MALE
>
> REALITY: WRONG
>
> http://www.techsploitation.com/2007/05/31/what-the-hell-was-i-thinking-about-green-libertarians/
> (text by Sue Lange)
> Genre: Informal
>   Female = 210
>   Male   = 481
>   Difference = 271; 69.6%
>   Verdict: MALE
> Genre: Formal
>   Female = 260
>   Male   = 408
>   Difference = 148; 61.07%
>   Verdict: MALE
>
> REALITY: WRONG
>
> http://thelizardqueen.wordpress.com/2005/06/08/a-thoroughly-and-utterly-girly-blog-post-sorry-4/
> Genre: Informal
>   Female = 415
>   Male   = 559
>   Difference = 144; 57.39%
>   Verdict: Weak MALE
> Genre: Formal
>   Female = 180
>   Male   = 312
>   Difference = 132; 63.41%
>   Verdict: MALE
>
> REALITY: WRONG
>
>
> To be fair I had to go to the most feminine place I could think of, even
> then it was iffy.
>
> http://groups.ivillage.com/motherdaughter/
> Genre: Informal
>   Female = 226
>   Male   = 337
>   Difference = 111; 59.85%
>   Verdict: Weak MALE
> Genre: Formal
>   Female = 326
>   Male   = 314
>   Difference = -12; 49.06%
>   Verdict: Weak FEMALE
>
> REALITY: MAYBE THE AUTHOR HERE WAS FLAMINGLY GAY
>
> --
> ====================================================
> J. Oquendo
> http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x1383A743
> echo infiltrated.net|sed 's/^/sil@/g'
>
> "Wise men talk because they have something to say;
> fools, because they have to say something." -- Plato
>
>
> _______________________________________________
> Full-Disclosure - We believe in it.
> Charter: http://lists.grok.org.uk/full-disclosure-charter.html
> Hosted and sponsored by Secunia - http://secunia.com/
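
For anyone checking the quoted numbers: the "Difference" and percentage lines
appear to follow directly from the two scores (Male minus Female, and Male as
a share of the total, truncated to two decimals). Below is a minimal sketch of
that arithmetic. The function name and the 60%/40% verdict cutoffs are
assumptions inferred from the figures quoted above, not taken from the Gender
Guesser source.

```python
def summarize(female: int, male: int) -> tuple[int, float, str]:
    """Recompute the Difference, percentage, and verdict lines from two scores.

    Assumption: the percentage is Male / (Female + Male), truncated (not
    rounded) to two decimals, which matches the quoted output.
    """
    difference = male - female
    total = female + male
    # Integer arithmetic avoids floating-point surprises when truncating.
    percent_male = (male * 10000 // total) / 100

    # Hypothetical cutoffs: the quoted results show "Weak MALE" up to 59.85%
    # and "MALE" from 61.02%, so 60% is a guess; 40% mirrors it for FEMALE.
    if percent_male > 60:
        verdict = "MALE"
    elif percent_male >= 50:
        verdict = "Weak MALE"
    elif percent_male >= 40:
        verdict = "Weak FEMALE"
    else:
        verdict = "FEMALE"
    return difference, percent_male, verdict


if __name__ == "__main__":
    # Scores taken from the first and last quoted samples.
    print(summarize(104, 602))
    print(summarize(326, 314))
```

Under these assumptions the sketch reproduces each quoted Difference,
percentage, and verdict line; it says nothing about how the underlying
Female and Male scores are computed from the text.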

