Message-ID: <20190313132353.GC4261@linux.intel.com>
Date:   Wed, 13 Mar 2019 15:23:53 +0200
From:   Jarkko Sakkinen <jarkko.sakkinen@...ux.intel.com>
To:     Mimi Zohar <zohar@...ux.ibm.com>
Cc:     Calvin Owens <calvinowens@...com>, Peter Huewe <peterhuewe@....de>,
        Jason Gunthorpe <jgg@...pe.ca>, Arnd Bergmann <arnd@...db.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-integrity@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...com
Subject: Re: [PATCH] tpm: Make timeout logic simpler and more robust

On Wed, Mar 13, 2019 at 03:22:32PM +0200, Jarkko Sakkinen wrote:
> On Tue, Mar 12, 2019 at 01:04:58PM -0400, Mimi Zohar wrote:
> > On Mon, 2019-03-11 at 16:54 -0700, Calvin Owens wrote:
> > > We're having lots of problems with TPM commands timing out, and we're
> > > seeing these problems across lots of different hardware (both v1/v2).
> > > 
> > > I instrumented the driver to collect latency data, but I wasn't able to
> > > find any specific timeout to fix: it seems like many of them are too
> > > aggressive. So I tried replacing all the timeout logic with a single
> > > universal long timeout, and found that makes our TPMs 100% reliable.
> > > 
> > > Given that this timeout logic is very complex, problematic, and appears
> > > to serve no real purpose, I propose simply deleting all of it.
> > 
> > Normally before sending such a massive change like this, included in
> > the bug report or patch description, there would be some indication as
> > to which kernel introduced a regression.  Has this always been a
> > problem?  Is this something new?  How new?
> 
> Also: is the problem in timeouts, durations or both? It doesn't make
> sense to fix something that isn't broken...

And maybe the fix is too big a hammer. We could possibly just decrease
the granularity but not fully take it away.

/Jarkko
