linux-kernel - Re: [BUG/PATCH] kernel RNG and its secrets


Date:	Wed, 18 Mar 2015 20:53:34 -0300
From:	Cesar Eduardo Barros <cesarb@...arb.eti.br>
To:	mancha <mancha1@...o.com>, Stephan Mueller <smueller@...onox.de>
CC:	Hannes Frederic Sowa <hannes@...essinduktion.org>,
	Daniel Borkmann <daniel@...earbox.net>, tytso@....edu,
	linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org,
	herbert@...dor.apana.org.au, dborkman@...hat.com
Subject: Re: [BUG/PATCH] kernel RNG and its secrets

On 18-03-2015 14:14, mancha wrote:
> On Wed, Mar 18, 2015 at 05:02:01PM +0100, Stephan Mueller wrote:
>> On Wednesday, 18 March 2015, 16:09:34, Hannes Frederic Sowa wrote:
>>> Seems like just using barrier() is the best and easiest option.
>
> However, if the idea is to use barrier() instead of OPTIMIZER_HIDE_VAR()
> in crypto_memneq() as well, then patch 0002 is the one to use. Please
> review and keep in mind my analysis was limited to memzero_explicit().
>
> Cesar, were there reasons you didn't use the gcc version of barrier()
> for crypto_memneq()?

Yes. Two reasons.

Take a look at how barrier() is defined:

#define barrier() __asm__ __volatile__("": : :"memory")

It tells gcc that the dummy assembly "instruction" touches memory (so 
the compiler can't assume anything about the memory), and that nothing 
should be moved from before to after the barrier and vice versa.
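
To make the memory side of this concrete, here is a minimal sketch of a clear-then-barrier helper. The name zero_then_barrier() is made up for illustration and is not the kernel's memzero_explicit(); the only thing taken from the text above is the barrier() definition. Whether barrier() alone is enough to keep the memset() for every buffer is an assumption here, not something this sketch proves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* barrier() as quoted above (gcc/clang extended asm). */
#define barrier() __asm__ __volatile__("" : : : "memory")

/*
 * Hypothetical helper, not the kernel's memzero_explicit(): clear a
 * buffer, then issue barrier(). The "memory" clobber tells the compiler
 * the asm may read or write any memory, so it cannot carry assumptions
 * about memory contents across this point. Nothing is said about values
 * that live only in registers.
 */
static void zero_then_barrier(void *s, size_t count)
{
	assert(s != NULL);
	memset(s, 0, count);
	barrier();
}
```

A caller would use it like memset(s, 0, count), typically on a buffer holding key material just before it goes out of scope.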

It mentions nothing about registers. Therefore, as far as I know gcc can 
assume that the dummy "instruction" touches no integer registers (or 
restores their values). I can imagine a sufficiently perverse compiler 
using that fact to introduce timing-dependent computations. For 
instance, it could load the values using more than one register and 
combine them at the end, after the barriers; there, it could exit early 
in case one of the registers is all-ones. My definition of 
OPTIMIZER_HIDE_VAR introduces a data dependency to prevent that:

#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

The second reason is that barrier() is too strong. For crypto_memneq, 
only the or-chain is critical; the order or width of the loads makes no 
difference. The compiler could, if it wishes, do all the loads and xors 
first and do the or-chain at the end, or whenever it can see a pipeline 
bubble; it doesn't matter as long as it does *all* the "or" operations, 
in sequence.
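
As a rough illustration of the or-chain, here is a byte-wise sketch. memneq_sketch() is a hypothetical name and a simplified stand-in, not the actual crypto_memneq(); only the OPTIMIZER_HIDE_VAR definition is taken from above.

```c
#include <assert.h>
#include <stddef.h>

/* OPTIMIZER_HIDE_VAR as quoted above (gcc/clang extended asm). */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Hypothetical byte-wise comparison, not the kernel's crypto_memneq().
 * Returns 0 if the buffers are equal, nonzero otherwise.
 */
static unsigned long memneq_sketch(const void *a, const void *b, size_t size)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;
	unsigned long neq = 0;

	assert(a != NULL && b != NULL);
	while (size > 0) {
		/* The loads and the xor carry no ordering constraint. */
		neq |= (unsigned long)(*pa ^ *pb);
		/*
		 * neq goes into the empty asm and comes back out as an
		 * opaque value, so each "or" depends on the previous one.
		 */
		OPTIMIZER_HIDE_VAR(neq);
		pa++;
		pb++;
		size--;
	}
	return neq;
}
```

The asm here is neither volatile nor a memory clobber, which is the point: the loads stay free to move, and only the accumulator is pinned.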

I would be comfortable with a stronger OPTIMIZER_HIDE_VAR (adding 
"memory" or volatile), even though it could limit optimization 
opportunities, but I wouldn't be comfortable with a weaker 
OPTIMIZER_HIDE_VAR (removing the data dependency), unless the gcc and 
clang guys promise that our definition of barrier() will always prevent 
undesired optimization of register-only operations.
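
For reference, a stronger variant of the kind I mean could look like the sketch below. OPTIMIZER_HIDE_VAR_STRONG and or_fold_strong() are names invented here for illustration; nothing by that name exists in the tree as far as this message is concerned. It keeps the "=r"/"0" data dependency and adds both volatile and the "memory" clobber.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stronger variant: same data dependency on var as
 * OPTIMIZER_HIDE_VAR, plus volatile and a "memory" clobber as in
 * barrier(). The name is an assumption made for this example.
 */
#define OPTIMIZER_HIDE_VAR_STRONG(var) \
	__asm__ __volatile__("" : "=r" (var) : "0" (var) : "memory")

/* Hypothetical or-fold over an array of words, for illustration only. */
static unsigned long or_fold_strong(const unsigned long *words, size_t n)
{
	unsigned long acc = 0;
	size_t i;

	assert(words != NULL);
	for (i = 0; i < n; i++) {
		acc |= words[i];
		/* Pins acc and also orders memory accesses around it. */
		OPTIMIZER_HIDE_VAR_STRONG(acc);
	}
	return acc;
}
```

The cost is that the loads can no longer be hoisted or batched across the asm, which is the lost optimization opportunity mentioned above.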

There was a third reason for the exact definition of OPTIMIZER_HIDE_VAR: 
it was copied from RELOC_HIDE, which is a longstanding "hide this 
variable from gcc" operation, and thus known to work as expected.

-- 
Cesar Eduardo Barros
cesarb@...arb.eti.br