linux-kernel - Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20170131133148.r4xwwnpee7cdotlw@intel.com>
Date:   Tue, 31 Jan 2017 15:31:48 +0200
From:   Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>
To:     James Bottomley <James.Bottomley@...senPartnership.com>
Cc:     tpmdd-devel@...ts.sourceforge.net,
        linux-security-module@...r.kernel.org,
        Ken Goldman <kgoldman@...ibm.com>, linux-kernel@...r.kernel.org
Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session
 exhaustion

On Mon, Jan 30, 2017 at 02:13:08PM -0800, James Bottomley wrote:
> On Mon, 2017-01-30 at 23:58 +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> > > On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > > > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > > > 
> > > > > > Beware the nasty corner case:
> > > > > > 
> > > > > > - Application asks for a session and gets 02000000
> > > > > > 
> > > > > > - Time elapses and 02000000 gets forcibly flushed
> > > > > > 
> > > > > > - Later, app comes back, asks for a second session and again
> > > > > > gets
> > > > > > 02000000.
> > > > > > 
> > > > > > - App gets very confused.
> > > > > > 
> > > > > > May it be better to close the connection completely, which
> > > > > > the
> > > > > > application can detect, than flush a session and give this
> > > > > > corner
> > > > > > case?
> > > > > 
> > > > > if I look at the code I've written, I don't know what the
> > > > > session
> > > > > number is, I just save sessionHandle in a variable for later
> > > > > use 
> > > > > (lets say to v1).  If I got the same session number returned at
> > > > > a 
> > > > > later time and placed it in v2, all I'd notice is that an 
> > > > > authorization using v1 would fail.  I'm not averse to killing
> > > > > the 
> > > > > entire connection but, assuming you have fallback, it might be 
> > > > > kinder simply to ensure that the operations with the reclaimed 
> > > > > session fail (which is what the code currently does).
> > > > 
> > > > My worry is that this session failure cannot be detected by the 
> > > > application.  An HMAC failure could cause the app to tell a user
> > > > that
> > > > they entered the wrong password.  Misleading.  On the TPM, it
> > > > could 
> > > > trigger the dictionary attack lockout.  For a PIN index, it could
> > > > consume a failure count.  Killing a policy session that has e.g.,
> > > > a 
> > > > policy signed term could cause the application to go back to some
> > > > external entity for another authorization signature.
> > > > 
> > > > Let's go up to the stack.  What's the attack?
> > > > 
> > > > If we're worried about many simultaneous applications (wouldn't
> > > > that 
> > > > be wonderful), why not just let startauthsession fail?  The 
> > > > application can just retry periodically.
> > > 
> > > How in that scenario do we ensure that a session becomes available?
> > >  Once that's established, there's no real difference between
> > > retrying
> > > the startauthsession in the kernel when we know the session is
> > > available and forcing userspace to do the retry except that the
> > > former
> > > has a far greater chance of success (and it's only about 6 lines of
> > > code).
> > > 
> > > >   Just allocate them in triples so there's no deadlock.
> > > 
> > > Is this the application or the kernel?  If it's the kernel, that
> > > adds a
> > > lot of complexity.
> > > 
> > > > If we're worried about a DoS attack, killing a session just helps
> > > > the
> > > > attacker.  The attacker can create a few connections and spin on 
> > > > startauthsession, locking everyone out anyway.
> > > 
> > > There are two considerations here: firstly we'd need to introduce a
> > > mechanism to "kill" the connection.  Probably we'd simply error
> > > every
> > > command on the space until it was closed.  The second is which
> > > scenario
> > > is more reasonable: Say the application simply forgot to flush the
> > > session and will never use it again.  Simply reclaiming the session
> > > would produce no effect at all on the application in this scenario.
> > >  However, I have no data to say what's likely.
> > > 
> > > > ~~
> > > > 
> > > > Also, let's remember that this is a rare application.  Sessions
> > > > are 
> > > > only needed for remote access (requiring encryption, HMAC or
> > > > salt), 
> > > > or policy sessions.
> > > 
> > > This depends what your threat model is.  For ssh keys, you worry
> > > that
> > > someone might be watching, so you use HMAC authority even for a
> > > local
> > > TPM.  In the cloud, you don't quite know where the TPM is, so again
> > > you'd use HMAC sessions ... however, in both use cases the sessions
> > > should be very short lived.
> > > 
> > > > ~~
> > > > 
> > > > Should the code also reserve a session for the kernel?  Mark it
> > > > not 
> > > > kill'able?
> > > 
> > > At the moment, the kernel doesn't use sessions, so let's worry
> > > about
> > > that problem at the point it arises (if it ever arises).
> > > 
> > > James
> > 
> > It does. My trusted keys implementation actually uses sessions.
> 
> But as I read the code, I can't find where the kernel creates a
> session.  It looks like the session and hmac are passed in as option
> arguments, aren't they?

Yes. Sorry, I mixed up things.

> > I'm kind dilating to an opinion that we would leave this commit out 
> > from the first kernel release that will contain the resource manager 
> > with similar rationale as Jason gave me for whitelisting: get the 
> > basic stuff in and once it is used with some workloads whitelisting 
> > and exhaustion will take eventually the right form.
> > 
> > How would you feel about this?
> 
> As long as we get patch 1/2 then applications using sessions will
> actually work with spaces, so taking more time with 2/2 is fine by me.
> 
> James

1/2 contains code that with a few iterations it is in the form that I'm
able to merge it.

With 2/2 I'm not saying it is wrong approach but I cannot yet say that
I'm confident that it would be the best approach.

I think that the transient object and infrastructure stuff that is
already in the patch set and 1/2 session is the subset of commits where
we can be fairly confident that we are doing the right thing.

I'll start preparing a patch set with this content without RFC tag.

/Jarkko