linux-kernel - RE: [PATCH] x86/entry/64: randomize kernel stack offset upon syscall

Message-ID: <2236FBA76BA1254E88B949DDB74E612BA4C66B18@IRSMSX102.ger.corp.intel.com>
Date:   Mon, 29 Apr 2019 08:04:50 +0000
From:   "Reshetova, Elena" <elena.reshetova@...el.com>
To:     Eric Biggers <ebiggers@...nel.org>, Theodore Ts'o <tytso@....edu>,
        "herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
        David Laight <David.Laight@...lab.com>,
        Ingo Molnar <mingo@...nel.org>,
        'Peter Zijlstra' <peterz@...radead.org>,
        "keescook@...omium.org" <keescook@...omium.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        "luto@...nel.org" <luto@...nel.org>,
        "luto@...capital.net" <luto@...capital.net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "jpoimboe@...hat.com" <jpoimboe@...hat.com>,
        "jannh@...gle.com" <jannh@...gle.com>,
        "Perla, Enrico" <enrico.perla@...el.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Subject: RE: [PATCH] x86/entry/64: randomize kernel stack offset upon syscall

> On Fri, Apr 26, 2019 at 10:01:02AM -0400, Theodore Ts'o wrote:
> > On Fri, Apr 26, 2019 at 11:33:09AM +0000, Reshetova, Elena wrote:
> > > Adding Eric and Herbert to continue discussion for the chacha part.
> > > So, as a short summary I am trying to find out a fast (fast enough to be
> > > used per syscall invocation) source of random bits with good enough
> > > security properties.
> > > I started to look into chacha kernel implementation and while it seems
> > > that it is designed to work with any number of rounds, it does not expose
> > > less than 12 rounds primitive.
> > > I guess this is done for security sake, since 12 is probably the lowest
> > > bound we want people to use for the purpose of encryption/decryption, but
> > > if we are to build an efficient RNG, chacha8 probably is a good tradeoff
> > > between security and speed.
> > >
> > > What are people's opinions/perceptions on this? Has it been considered
> > > before to create a kernel RNG based on chacha?
> >
> > Well, sure.  The get_random_bytes() kernel interface and the
> > getrandom(2) system call uses a CRNG based on chacha20.  See
> > extract_crng() and crng_reseed() in drivers/char/random.c.
> >
> > It *is* possible to use an arbitrary number of rounds if you use the
> > low level interface exposed as chacha_block(), which is an
> > EXPORT_SYMBOL interface so even modules can use it.  "Does not expose
> > less than 12 rounds" applies only if you are using the high-level
> > crypto interface.
> 
> chacha_block() actually WARNs if the round count isn't 12 or 20, because I
> didn't want people to sneak in uses of other variants without discussion :-)
> 
> (Possibly I should have made chacha_block() 'static' and only exported
> chacha12_block() and chacha20_block().  But the 'nrounds' parameter is
> convenient for crypto/chacha_generic.c.)
> 
> >
> > We have used cut down crypto algorithms for performance critical
> > applications before; at one point, we were using a cut down MD4(!) for
> > initial TCP sequence number generation.  But that was getting rekeyed
> > every five minutes, and the goal was to make it just hard enough that
> > there were other easier ways of DOS attacking a server.
> >
> > I'm not a cryptographer, so I'd really like us to hear from multiple
> > experts about the security level of, say, ChaCha8 so we understand
> > exactly what kind of security we'd be offering.  And I'd want that
> > interface to be named so that it's clear it's only intended for a very
> > specific use case, since it will be tempting for other kernel developers
> > to use it in other contexts, without due consideration.
> >
> >       	    	      	   	 - Ted
> 
> The best attack on ChaCha is against 7 rounds and has time complexity 2^235.  So
> while there's no publicly known attack on ChaCha8, its security margin is too
> small for it to be recommended for typical cryptographic use.  I wouldn't be
> surprised to see an attack published on ChaCha8 in the not-too-distant future.
> (*Probably* not a practical one, but the crypto will be technically "broken"
> regardless.)

Yes, that is also my understanding of ChaCha's official security strength.
Our use case and requirements are somewhat different, though, and the primitive
can be upgraded later if needed. Still, let's first try the per-CPU buffer
solution that Andy is proposing and see whether it performs well enough with
ChaCha20, which will probably stay secure for much longer unless the whole
construction is fully broken.
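
To make the round-count tradeoff above concrete, here is a rough user-space
sketch of a ChaCha block function with a configurable number of rounds. It is
not the kernel's chacha_block(); the names chacha_block_sketch, rotl32 and QR
are made up for illustration, and the 16-word state layout is assumed to follow
RFC 7539 (4 constant words, 8 key words, 1 counter word, 3 nonce words).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* ChaCha quarter round on four words of the working state. */
#define QR(a, b, c, d) do {			\
	a += b; d ^= a; d = rotl32(d, 16);	\
	c += d; b ^= c; b = rotl32(b, 12);	\
	a += b; d ^= a; d = rotl32(d, 8);	\
	c += d; b ^= c; b = rotl32(b, 7);	\
} while (0)

/*
 * Hypothetical helper: produce one 16-word keystream block from the
 * 16-word input state using nrounds rounds (8, 12 and 20 are the
 * variants discussed in this thread).  nrounds must be even because
 * each loop iteration does one column round plus one diagonal round.
 */
void chacha_block_sketch(const uint32_t in[16], uint32_t out[16], int nrounds)
{
	uint32_t x[16];
	int i;

	assert(nrounds > 0 && nrounds % 2 == 0);
	memcpy(x, in, sizeof(x));

	for (i = 0; i < nrounds; i += 2) {
		/* Column round. */
		QR(x[0], x[4], x[8],  x[12]);
		QR(x[1], x[5], x[9],  x[13]);
		QR(x[2], x[6], x[10], x[14]);
		QR(x[3], x[7], x[11], x[15]);
		/* Diagonal round. */
		QR(x[0], x[5], x[10], x[15]);
		QR(x[1], x[6], x[11], x[12]);
		QR(x[2], x[7], x[8],  x[13]);
		QR(x[3], x[4], x[9],  x[14]);
	}

	/* Feed-forward: add the original input state to the result. */
	for (i = 0; i < 16; i++)
		out[i] = x[i] + in[i];
}
```

The loop makes the cost model visible: going from 20 rounds to 8 removes 12 of
the 20 rounds but nothing else, so the saving is bounded by the share of time
spent in the round loop. The sketch is scalar only and says nothing about how
a SIMD implementation would compare.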

> 
> I don't think it's completely out of the question for this specific use case,
> since apparently you only need random numbers that are used temporarily for
> runtime memory layout.  Thus the algorithm can be upgraded at any time without
> spending decades deprecating it from network protocols and on-disk formats.
> 
> But if you actually need cryptographically secure random numbers, it would be
> much better to start with something with a higher security margin like ChaCha20,
> optimizing it, and only going lower if you actually need to.

Yes, agreed, so that is what I am going to try.

> Would it be possible to call ChaCha20 through the actual crypto API so that SIMD
> instructions (e.g. AVX-2) could be used?  That would make it *much* faster.

I can try measuring both ways, given that we ask for enough random bits at a
time, as Andy suggested: a couple of pages or so, if that helps with the
overhead. Also, I hope none of these specific instructions (including AES-NI)
can block, as was pointed out to me about RDRAND; otherwise I guess we have a
problem.
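
The batching idea can be sketched roughly as follows, in user-space C. This is
only an illustration under stated assumptions: struct rnd_pool, pool_init,
pool_refill and pool_get_byte are hypothetical names, the one-page pool size is
a guess, and the refill routine is a non-cryptographic placeholder standing in
for whatever primitive (ChaCha20, AES-CTR) ends up being measured. A kernel
version would keep one such pool per CPU and handle preemption, which is not
shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BYTES 4096	/* assumption: refill one page at a time */

struct rnd_pool {
	uint8_t buf[POOL_BYTES];
	size_t pos;		/* next unread byte; POOL_BYTES means empty */
	uint64_t refills;	/* how many times the expensive path ran */
	uint64_t state;		/* state of the placeholder generator */
};

/* Start with an empty pool so that the first request triggers a refill. */
void pool_init(struct rnd_pool *p, uint64_t seed)
{
	p->state = seed;
	p->pos = POOL_BYTES;
	p->refills = 0;
}

/*
 * Expensive path: regenerate the whole pool in one go.
 * PLACEHOLDER: a splitmix64-style mixer, NOT cryptographically secure.
 * A real implementation would call the chosen stream cipher here.
 */
void pool_refill(struct rnd_pool *p)
{
	size_t i;
	int j;

	for (i = 0; i < POOL_BYTES; i += 8) {
		uint64_t z = (p->state += 0x9e3779b97f4a7c15ULL);

		z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
		z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
		z ^= z >> 31;
		for (j = 0; j < 8; j++)
			p->buf[i + j] = (uint8_t)(z >> (8 * j));
	}
	p->pos = 0;
	p->refills++;
}

/* Cheap path: hand out one byte, refilling only when the pool is empty. */
uint8_t pool_get_byte(struct rnd_pool *p)
{
	assert(p != NULL);
	if (p->pos >= POOL_BYTES)
		pool_refill(p);
	return p->buf[p->pos++];
}
```

With one byte consumed per syscall, the expensive refill runs once every
POOL_BYTES calls, so its cost is amortized over 4096 syscalls in this sketch.
Whether that is enough depends on the measured cost of the real generator.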

> Also consider AES-CTR with AES-NI instructions.

Yes, I guess based on these numbers AES-CTR with AES-NI is roughly on par with
chacha8 (depending on the CPU):
https://bench.cr.yp.to/results-stream.html
When choosing a primitive here, I guess we also need to compare the cases where
no special instructions are available.

Best Regards,
Elena.