Message-Id: <CS68AWILHXS4.3M36M1EKZLUMS@suppilovahvero>
Date: Wed, 26 Apr 2023 02:34:25 +0300
From: "Jarkko Sakkinen" <jarkko@...nel.org>
To: "Jarkko Sakkinen" <jarkko@...nel.org>,
"Jason A. Donenfeld" <Jason@...c4.com>
Cc: "Thorsten Leemhuis" <regressions@...mhuis.info>,
"James Bottomley" <James.Bottomley@...senpartnership.com>,
"Vlastimil Babka" <vbabka@...e.cz>,
"Peter Huewe" <peterhuewe@....de>,
"Jason Gunthorpe" <jgg@...pe.ca>, "Jan Dabros" <jsd@...ihalf.com>,
<regressions@...ts.linux.dev>,
"LKML" <linux-kernel@...r.kernel.org>,
<linux-integrity@...r.kernel.org>,
"Dominik Brodowski" <linux@...inikbrodowski.net>,
"Herbert Xu" <herbert@...dor.apana.org.au>,
"Linus Torvalds" <torvalds@...ux-foundation.org>,
"Johannes Altmanninger" <aclopte@...il.com>
Subject: Re: [REGRESSION] suspend to ram fails in 6.2-rc1 due to tpm errors
On Sun Apr 23, 2023 at 6:34 PM EEST, Jarkko Sakkinen wrote:
> On Fri Apr 21, 2023 at 9:27 PM EEST, Jason A. Donenfeld wrote:
> > Did you use the patch I sent you and suspend and resume according to
> > the instructions I gave you? If not, I don't have much to add.
>
> Finally, I got it reproduced at my side with TPM 1.2:
>
> [ 0.379677] tpm_tis 00:00: 1.2 TPM (device-id 0x1, rev-id 1)
> [ 32.453447] tpm tpm0: tpm_transmit: tpm_recv: error -5
> [ 33.450601] tpm tpm0: Unable to read header
> [ 33.450607] tpm tpm0: tpm_transmit: tpm_recv: error -62
>
> I'll look at this further after I've sent v6.3 PR.
OK, so this gives the exact tpm_transmit call where it fails:
$ sudo bpftrace -e 'kprobe:tpm_transmit { @[kstack] = count(); }'
[sudo] password for jarkko:
Attaching 1 probe...
^C
@[
tpm_transmit+1
tpm1_pcr_read+177
tpm1_do_selftest+287
tpm_tis_resume+443
pnp_bus_resume+102
dpm_run_callback+81
device_resume+173
dpm_resume+238
dpm_resume_end+17
suspend_devices_and_enter+473
enter_state+563
pm_suspend+68
state_store+43
kobj_attr_store+15
sysfs_kf_write+59
kernfs_fop_write_iter+304
vfs_write+590
ksys_write+115
__x64_sys_write+25
do_syscall_64+88
entry_SYSCALL_64_after_hwframe+114
]: 1
@[
tpm_transmit+1
tpm1_do_selftest+179
tpm_tis_resume+443
pnp_bus_resume+102
dpm_run_callback+81
device_resume+173
dpm_resume+238
dpm_resume_end+17
suspend_devices_and_enter+473
enter_state+563
pm_suspend+68
state_store+43
kobj_attr_store+15
sysfs_kf_write+59
kernfs_fop_write_iter+304
vfs_write+590
ksys_write+115
__x64_sys_write+25
do_syscall_64+88
entry_SYSCALL_64_after_hwframe+114
]: 1
@[
tpm_transmit+1
tpm1_pm_suspend+203
tpm_pm_suspend+131
__pnp_bus_suspend+65
pnp_bus_suspend+19
dpm_run_callback+81
__device_suspend+329
dpm_suspend+432
dpm_suspend_start+155
suspend_devices_and_enter+370
enter_state+563
pm_suspend+68
state_store+43
kobj_attr_store+15
sysfs_kf_write+59
kernfs_fop_write_iter+304
vfs_write+590
ksys_write+115
__x64_sys_write+25
do_syscall_64+88
entry_SYSCALL_64_after_hwframe+114
]: 1
@[
tpm_transmit+1
tpm1_get_random+206
tpm_get_random+70
tpm_hwrng_read+21
hwrng_fillfn+234
kthread+230
ret_from_fork+41
]: 75897
So it is the very first PCR read in tpm1_do_selftest.
There is a bug in plain sight in tpm_tis_resume(): around the
tpm1_do_selftest() call, it only requests and relinquishes
locality. This is not sufficient: it should also disable the
clkrun protocol.
tpm1_do_selftest() is also called during driver initialization, where
it succeeds, the difference being that the clkrun protocol is disabled there.
I'm now compiling a kernel with a test fix that calls tpm_chip_start()
and tpm_chip_stop() as a substitute for request/relinquish locality.
These should be used anyway instead of ad-hoc code.
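To illustrate the ordering problem, here is a userspace toy model, not kernel code. Every name in it (struct toy_chip, the toy_* functions, TOY_EIO) is made up for illustration, and it assumes that a transmit needs both a held locality and a disabled clkrun, and that a start/stop pair takes care of both:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for chip state; NOT the kernel's struct tpm_chip. */
struct toy_chip {
	bool locality_held;
	bool clkrun_enabled;	/* true: the bus clock may stop under us */
};

#define TOY_EIO 5	/* hypothetical error value for the model */

/* Assumption: a command needs locality AND clkrun disabled. */
static int toy_transmit(const struct toy_chip *chip)
{
	assert(chip != NULL);
	if (!chip->locality_held || chip->clkrun_enabled)
		return -TOY_EIO;
	return 0;
}

/* Models the current resume path: locality only, clkrun untouched. */
static int toy_resume_locality_only(struct toy_chip *chip)
{
	int ret;

	chip->locality_held = true;
	ret = toy_transmit(chip);	/* stands in for the selftest */
	chip->locality_held = false;
	return ret;
}

/* Models a start helper: disable clkrun, then take locality. */
static void toy_chip_start(struct toy_chip *chip)
{
	chip->clkrun_enabled = false;
	chip->locality_held = true;
}

/* Models a stop helper: drop locality, then re-enable clkrun. */
static void toy_chip_stop(struct toy_chip *chip)
{
	chip->locality_held = false;
	chip->clkrun_enabled = true;
}

/* Models the test fix: wrap the selftest in start/stop. */
static int toy_resume_start_stop(struct toy_chip *chip)
{
	int ret;

	toy_chip_start(chip);
	ret = toy_transmit(chip);	/* stands in for the selftest */
	toy_chip_stop(chip);
	return ret;
}
```

In this model, starting from a chip with clkrun enabled, toy_resume_locality_only() returns -TOY_EIO while toy_resume_start_stop() returns 0. Whether the real hardware behaves the same way is what the test kernel should show.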
BR, Jarkko