phc-discussions - Re: [PHC] data-dependent branching (Re: [PHC] A little nit which bothers me...)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <86090.1396448717@critter.freebsd.dk>
Date: Wed, 02 Apr 2014 14:25:17 +0000
From: "Poul-Henning Kamp" <phk@....freebsd.dk>
To: discussions@...sword-hashing.net, Solar Designer <solar@...nwall.com>
cc: Brandon Enright <bmenrigh@...ndonenright.net>
Subject: Re: [PHC] data-dependent branching (Re: [PHC] A little nit which bothers me...)

In message <20140402133904.GA26931@...nwall.com>, Solar Designer writes:

>> >Data-dependent branching has not been used as much yet, [...]
>> 
>> I never saw the point really.
>
>I thought you advocated data-dependent branching or something like it
>(switching between multiple crypto primitives) in your md5crypt EOL
>announcement a couple of years ago. 

I wasn't really trying to be very detailed about such matters in that
missive (http://phk.freebsd.dk/sagas/md5crypt_eol.html) and I admit
that the bit you are (probably) refering to is more ambiguous than
it needed to be:

	The algorithm should be based on repeated data-dependent
	iterations of several different complex one-way hash functions
	(MD5, SHA1, SHA2, BLOWFISH, you name it, use them all) in
	order to "soak up area" in hardware based attack implementations.

>There is a point, and it is in causing attackers with massively parallel
>hardware to waste some of their resources on some of:

I agree, but I tend to find that there is a much better solution
to most if not all of those troubles:  Make a strictly serial
algorithm.

>> The way to solve that is to only condition data-dependent
>> branching on "dead-end bits" which don't lead back to our entropy:
>> 
>> 	if (CONDITION(HASH(entropy || "Lorem ipsum dolor sit..."))) {
>> 		entropy = FUNC1(entropy);
>> 		discard = FUNC2(entropy);
>> 	} else {
>> 		discard = FUNC1(entropy);
>> 		entropy = FUNC2(entropy);
>> 	}
>
>How does this not lead back to our entropy?

The only thing the result of the HASH is used for is to pick the
branch, so if HASH() is something decent (MD5+), even if you can
tell which branch was taken, no bits of the entropy that is passed
to either FUNC1 or FUNC2 will be revealed.

If you don't run the entropy through that HASH, determining
which branch tells you something about the bits of that entropy.

(Forget the "Lorem ipsum" (google it!), that was just meant to
indicate that you probably want to stuff more than the entropy
into the hash function.)

>> We cannot lower our standards and allow timing attacks against
>> "dead-end" bits:
>> 
>> 	if (CONDITION(HASH(entropy || "Lorem ipsum dolor sit..."))) {
>> 		entropy = FUNC1(entropy);
>> 	} else {
>> 		entropy = FUNC2(entropy);
>> 	}
>> 
>> That would give a 50% discount on a brute-force attack with timing
>> information.
>
>I don't understand what you mean here.

If the attacker did a timing recording, and found that his victims
password takes the first branch, he can abandon any brute force
attemt if an attempt takes the second branch, eliminating 50% of
the subsequent work.

>> In brute-force attack without timing information, the "discard"
>> bits could obviously be optimised out, that's also no good, so lets
>> not discard those bits anyway:
>> 
>> 	if (CONDITION(HASH(entropy || "Lorem ipsum dolor sit..."))) {
>> 		entropy[first_half] = FUNC1(entropy);
>> 		entropy[second_half] = FUNC2(entropy);
>> 	} else {
>> 		entropy[second_half] = FUNC1(entropy);
>> 		entropy[first_half] = FUNC2(entropy);
>> 	}
>> 
>> And then it follows trivially that this is the same as:
>> 
>> 	i = CONDITION(HASH(entropy || "Lorem ipsum dolor sit..."))
>> 	entropy = FUNC12(entropy);
>> 	if (i)
>> 		swap_first_and_second_half(entropy)
>> 
>> Ohh well...
>
>Looks like you were going after a holy grail while writing that e-mail,
>but didn't quite reach it. ;-)

No, I think I reached it, but I may (again) have been to terse in
my explanations.

The problem with data dependent branches is that they reduce to
not being branches after all, unless you are willing to compromise
on the resistance for one or more attack scenarios.

If other strengths are sufficient, that may be a valid tradeoff,
but as I'm not really seing any major advantage of data-dependent
branches that cannot be havested other ways, I don't see why
you'd want to take the tradeoff.

Poul-Henning

PS: If you havn't read Feynmans Lectures on Computation, I will
recommend it:  There are some very interesting thoughts about
"reversible computing" which have inspired my own thinking about
how code you don't want to leak information through power, speed
or timing should be designed.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@...eBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.