Message-ID: <Y7hF5vG8rWjbCLyL@zx2c4.com>
Date: Fri, 6 Jan 2023 17:01:42 +0100
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Thorsten Leemhuis <regressions@...mhuis.info>,
James Bottomley <James.Bottomley@...senpartnership.com>,
Peter Huewe <peterhuewe@....de>,
Jarkko Sakkinen <jarkko@...nel.org>,
Jason Gunthorpe <jgg@...pe.ca>, Jan Dabros <jsd@...ihalf.com>,
regressions@...ts.linux.dev, LKML <linux-kernel@...r.kernel.org>,
linux-integrity@...r.kernel.org,
Dominik Brodowski <linux@...inikbrodowski.net>,
Herbert Xu <herbert@...dor.apana.org.au>,
Johannes Altmanninger <aclopte@...il.com>,
stable@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>, tbroch@...omium.org,
semenzato@...omium.org, dbasehore@...omium.org,
keescook@...omium.org
Subject: Re: [PATCH v2] tpm: Allow system suspend to continue when TPM
suspend fails

Hi Todd & ChromeOS folks,

On Fri, Jan 06, 2023 at 04:01:56AM +0100, Jason A. Donenfeld wrote:
> TPM 1 is sometimes broken across system suspends, due to races or
> locking issues or something else that haven't been diagnosed or fixed
> yet, most likely having to do with concurrent reads from the TPM's
> hardware random number generator driver. These issues prevent the system
> from actually suspending, with errors like:
>
> tpm tpm0: A TPM error (28) occurred continue selftest
> ...
> tpm tpm0: A TPM error (28) occurred attempting get random
> ...
> tpm tpm0: Error (28) sending savestate before suspend
> tpm_tis 00:08: PM: __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x80 returns 28
> tpm_tis 00:08: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 28
> tpm_tis 00:08: PM: failed to suspend: error 28
> PM: Some devices failed to suspend, or early wake event detected
>
> This issue was partially fixed by 23393c646142 ("char: tpm: Protect
> tpm_pm_suspend with locks"), in a last minute 6.1 commit that Linus took
> directly because the TPM maintainers weren't available. However, it
> seems like this just addresses the most common cases of the bug, rather
> than addressing it entirely. So there are more things to fix still,
> apparently.
>
> In lieu of actually fixing the underlying bug, just allow system suspend
> to continue, so that laptops still go to sleep fine. Later, this can be
> reverted when the real bug is fixed.
>
> Link: https://lore.kernel.org/lkml/7cbe96cf-e0b5-ba63-d1b4-f63d2e826efa@suse.cz/
> Cc: stable@...r.kernel.org # 6.1+
> Reported-by: Vlastimil Babka <vbabka@...e.cz>
> Suggested-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Signed-off-by: Jason A. Donenfeld <Jason@...c4.com>
> ---
> This is basically untested and I haven't worked out if there are any
> awful implications of letting the system sleep when TPM suspend fails.
> Maybe some PCRs get cleared and that will make everything explode on
> resume? Maybe it doesn't matter? Somebody well versed in TPMology should
> probably [n]ack this approach.

While idly scrolling on my phone to try to see what the
implications of skipping TPM_ORD_SAVESTATE could be, I stumbled across
some ChromeOS commits related to it, and realized that, ah-hah, finally
there's an obvious group of stakeholders who make heavy use of the TPM
and have likely amassed some expertise on it.

So I was wondering if you'd take a look at this patch briefly to make
sure it won't break ChromeOS laptops.

Jason