Message-ID: <6df9f7af232bbe10a570e426c2bef0e673ab63fe.camel@HansenPartnership.com>
Date: Wed, 18 May 2022 16:21:39 -0400
From: James Bottomley <James.Bottomley@...senPartnership.com>
To: Nayna <nayna@...ux.vnet.ibm.com>,
Jarkko Sakkinen <jarkko@...nel.org>,
Johannes Holland <johannes.holland@...ineon.com>
Cc: Mimi Zohar <zohar@...ux.ibm.com>, linux-integrity@...r.kernel.org,
linux-kernel@...r.kernel.org, peterhuewe@....de, jgg@...pe.ca
Subject: Re: [PATCH] tpm: sleep at least <...> ms in tpm_msleep()
On Wed, 2022-05-18 at 15:26 -0400, Nayna wrote:
> On 5/16/22 13:57, Jarkko Sakkinen wrote:
> > On Thu, May 12, 2022 at 08:32:55AM -0400, James Bottomley wrote:
> > > On Thu, 2022-05-12 at 08:21 -0400, Mimi Zohar wrote:
[...]
> > > > This patch reverts commit 5ef924d9e2e8 ("tpm: use tpm_msleep()
> > > > value as max delay"). Are you experiencing TPM issues that
> > > > require it?
> > > I am:
> > >
> > > https://lore.kernel.org/linux-integrity/1531328689.3260.8.camel@HansenPartnership.com/
> > >
> > > I'm about 24h into a soak test of the patch with no TPM failure
> > > so far. I think it probably needs to run another 24h just to be
> > > sure, but it does seem the theory is sound (my TPM gets annoyed
> > > by being poked too soon) so reverting 5ef924d9e2e8 looks to be
> > > the correct action. The only other ways I've found to fix this
> > > are either revert the usleep_range patch altogether or increase
> > > the timings:
> > >
> > > https://lore.kernel.org/linux-integrity/1531329074.3260.9.camel@HansenPartnership.com/
> > >
> > > Which obviously pushes the min past whatever issue my TPM is
> > > having even with 5ef924d9e2e8 applied.
> > >
> > > Given that even the commit message for 5ef924d9e2e8 admits it
> > > only shaves about 12% off the TPM response time, that would
> > > appear to be an optimization too far if it's going to cause some
> > > TPMs to fail.
> > >
> > > James
> > > What if the TPM started with the timings as they are now and used
> > > the "reverted" timings if coming up too early? The question here,
> > > though, is whether such complexity is worth anything or whether we
> > > should just revert and do nothing else.
>
> The TCG specification (TCG PC Client Device Driver Design Principles,
> Section 10) says: general control timeouts, denoted as TIMEOUT_A
> (A), TIMEOUT_B (B), TIMEOUT_C (C) and TIMEOUT_D (D), are the maximum
> waiting time from a certain control operation from the DD until the
> TPM shows the expected status change.
Actually, this has nothing to do with TIMEOUT_A through TIMEOUT_D: those
are the maximum times before a command should complete. The issue here
is the minimum time we should wait between pokes of the TPM to see if
it is ready.
Usually the use case is:

    while (read device status gives not ready)
        tpm_msleep(something)

tpm_msleep() gives up the CPU (to prevent huge amounts of busy
waiting), but before 424eaf910c32 ("tpm: reduce polling time to usecs
for even finer granularity") we would sleep for an entire tick (the time
taken to make the process runnable) before the next poll, and since
most TPM commands don't return immediately, that was a gate on how fast
you could do simple TPM operations (like PCR extend).
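
To make the min-versus-max distinction concrete, here is a rough
userspace sketch (not the driver code) of the sleep window that would be
handed to a usleep_range(min, max) style call under each reading of the
requested delay. The names sleep_window, window_delay_as_max,
window_delay_as_min and SLACK_US are hypothetical, and the 300 usec
slack is an assumed value chosen for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Assumed slack between the two sleep bounds; illustrative value only. */
#define SLACK_US 300u

/* Bounds that would be passed to a usleep_range(min, max) style sleep. */
struct sleep_window {
	unsigned int min_us;
	unsigned int max_us;
};

/*
 * Requested delay treated as the upper bound ("value as max delay"):
 * the sleeper may be woken up to SLACK_US before the requested time.
 */
struct sleep_window window_delay_as_max(unsigned int delay_msec)
{
	struct sleep_window w;
	unsigned int us;

	/* Guard against overflow in the msec -> usec conversion. */
	assert(delay_msec <= (UINT_MAX - SLACK_US) / 1000u);
	us = delay_msec * 1000u;
	w.min_us = us > SLACK_US ? us - SLACK_US : 0u;
	w.max_us = us;
	return w;
}

/*
 * Requested delay treated as the lower bound ("sleep at least"):
 * the sleeper is never woken before the requested time has passed.
 */
struct sleep_window window_delay_as_min(unsigned int delay_msec)
{
	struct sleep_window w;
	unsigned int us;

	/* Guard against overflow in the msec -> usec conversion. */
	assert(delay_msec <= (UINT_MAX - SLACK_US) / 1000u);
	us = delay_msec * 1000u;
	w.min_us = us;
	w.max_us = us + SLACK_US;
	return w;
}
```

Under these assumptions a 1 ms request yields a window of 700 to 1000
usec in the first case and 1000 to 1300 usec in the second, so only the
second guarantees that the next poke never arrives sooner than the
caller asked for.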
As far as I know, no TCG spec gives any details of the minimum wait
time between poll cycles, so this is really something the manufacturer
has to tell us.

Just for completeness, my soak test did run to completion, but my TPM
has since failed and dropped off the bus, so simply reverting this
patch (5ef924d9e2e8) isn't sufficient to fully fix my problem.

James