[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e8037b6dd77c3105ac94e0fe68aad39e9f3b9656.camel@HansenPartnership.com>
Date: Sun, 22 Dec 2024 10:00:59 -0500
From: James Bottomley <James.Bottomley@...senPartnership.com>
To: Jarkko Sakkinen <jarkko.sakkinen@....fi>, Ard Biesheuvel
<ardb@...nel.org>, Jarkko Sakkinen <jarkko@...nel.org>
Cc: linux-integrity@...r.kernel.org, Peter Huewe <peterhuewe@....de>, Jason
Gunthorpe <jgg@...pe.ca>, Colin Ian King <colin.i.king@...il.com>, Joe
Hattori <joe@...is.s.u-tokyo.ac.jp>, Stefan Berger <stefanb@...ux.ibm.com>,
Roberto Sassu <roberto.sassu@...wei.com>, Al Viro
<viro@...iv.linux.org.uk>, Andy Liang <andy.liang@....com>, Matthew Garrett
<mjg59@...f.ucam.org>, Mimi Zohar <zohar@...ux.ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] tpm: Map the ACPI provided event log
On Sat, 2024-12-21 at 22:11 +0200, Jarkko Sakkinen wrote:
> On Sat Dec 21, 2024 at 7:16 PM EET, James Bottomley wrote:
> > On Sat, 2024-12-21 at 17:04 +0100, Ard Biesheuvel wrote:
> > > On Sat, 21 Dec 2024 at 12:33, Jarkko Sakkinen <jarkko@...nel.org>
> > > wrote:
> > > >
> > > > The following failure was reported:
> > > >
> > > > [ 10.693310][ T1] tpm_tis STM0925:00: 2.0 TPM (device-id
> > > > 0x3,
> > > > rev-id 0)
> > > > [ 10.848132][ T1] ------------[ cut here ]------------
> > > > [ 10.853559][ T1] WARNING: CPU: 59 PID: 1 at
> > > > mm/page_alloc.c:4727 __alloc_pages_noprof+0x2ca/0x330
> > > > [ 10.862827][ T1] Modules linked in:
> > > > [ 10.866671][ T1] CPU: 59 UID: 0 PID: 1 Comm: swapper/0
> > > > Not
> > > > tainted 6.12.0-lp155.2.g52785e2-default #1 openSUSE Tumbleweed
> > > > (unreleased) 588cd98293a7c9eba9013378d807364c088c9375
> > > > [ 10.882741][ T1] Hardware name: HPE ProLiant DL320
> > > > Gen12/ProLiant DL320 Gen12, BIOS 1.20 10/28/2024
> > > > [ 10.892170][ T1] RIP:
> > > > 0010:__alloc_pages_noprof+0x2ca/0x330
> > > > [ 10.898103][ T1] Code: 24 08 e9 4a fe ff ff e8 34 36 fa
> > > > ff e9
> > > > 88 fe ff ff 83 fe 0a 0f 86 b3 fd ff ff 80 3d 01 e7 ce 01 00 75
> > > > 09
> > > > c6 05 f8 e6 ce 01 01 <0f> 0b 45 31 ff e9 e5 fe ff ff f7 c2 00
> > > > 00 08
> > > > 00 75 42 89 d9 80 e1
> > > > [ 10.917750][ T1] RSP: 0000:ffffb7cf40077980 EFLAGS:
> > > > 00010246
> > > > [ 10.923777][ T1] RAX: 0000000000000000 RBX:
> > > > 0000000000040cc0
> > > > RCX: 0000000000000000
> > > > [ 10.931727][ T1] RDX: 0000000000000000 RSI:
> > > > 000000000000000c
> > > > RDI: 0000000000040cc0
> > > >
> > > > Above shows that ACPI pointed a 16 MiB buffer for the log
> > > > events
> > > > because RSI maps to the 'order' parameter of
> > > > __alloc_pages_noprof(). Address the bug by mapping the region
> > > > when
> > > > needed instead of copying.
> > > >
> > > > Reported-by: Andy Liang <andy.liang@....com>
> > > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219495
> > > > Suggested-by: Matthew Garrett <mjg59@...f.ucam.org>
> > > > Signed-off-by: Jarkko Sakkinen <jarkko@...nel.org>
> > >
> > > This is a very intrusive fix - care to provide some more context
> > > on
> > > why all these changes are needed?
> >
> > Since the bug reports never found an actual log over a few tens of
> > kilobytes this is caused by the BIOS implementation allocating a
> > huge
> > buffer that is mostly unused.
> >
> > There are two other possibilities for fixing this, which were both
> > part
> > of the original suggestions. One would be to work out the size of
> > the
> > log and then allocate an exact size. This would require
> > implementing
> > tpm1 and tpm2 parsers for log size. However, since we can never go
> > over KMALLOC_MAX_SIZE without an error even with this calculated
> > size,
> > the simplest straight line fix would be to cap the copy at
> > KMALLOC_MAX_SIZE if it's over. That would be a simple one liner.
>
> All I'm saying is this.
>
> I've got bunch of complains of this from mainly SUSE, and now I'm
> here with a response to that feedback. So I don't care. You decide.
>
> I'm 100% sure that the fix that Stefan proposed is not a sustainable
> path in long-term, so I guess this was more like more long-term but
> intrusive fix.
If event logs grow to greater than KMALLOC_MAX_SIZE then absolutely it
makes sense to map them instead of copying them. But we'd have to do
that for all event log locators: ACPI, EFI and OF, because event log
size should be independent of the mechanism used to locate it. So,
even as a long term fix (assuming we think there's a possibility of
logs expanding by 50x), this patch doesn't do the right thing because
it only maps ACPI logs.
If you're determined to do the mapping approach for all logs, I don't
see why we shouldn't keep the log permanently mapped which would really
simplify the patch set. The main reason why ACPI memory is mapped and
unmapped is because it's usually I/O regions of devices for which we
have to use the device mapping primitives. However, for machine main
memory, which is where we know the log is, the mapping eventually boils
down to a nop.
Regards,
James
> Ya, and also please test the changes, especially anything that can
> reach of OF eventlogs would be welcome feedback.
>
> BR, Jarkko
>
Powered by blists - more mailing lists