lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240122160227.00002d83@Huawei.com>
Date: Mon, 22 Jan 2024 16:02:27 +0000
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
CC: "Rafael J. Wysocki" <rafael@...nel.org>, <linux-pm@...r.kernel.org>,
	<loongarch@...ts.linux.dev>, <linux-acpi@...r.kernel.org>,
	<linux-arch@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-riscv@...ts.infradead.org>,
	<kvmarm@...ts.linux.dev>, <x86@...nel.org>,
	<acpica-devel@...ts.linuxfoundation.org>, <linux-csky@...r.kernel.org>,
	<linux-doc@...r.kernel.org>, <linux-ia64@...r.kernel.org>,
	<linux-parisc@...r.kernel.org>, Salil Mehta <salil.mehta@...wei.com>,
	Jean-Philippe Brucker <jean-philippe@...aro.org>, <jianyong.wu@....com>,
	<justin.he@....com>, James Morse <james.morse@....com>
Subject: Re: [PATCH RFC v3 03/21] ACPI: processor: Register CPUs that are
 online, but not described in the DSDT

On Mon, 15 Jan 2024 11:06:29 +0000
"Russell King (Oracle)" <linux@...linux.org.uk> wrote:

> On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
> > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@...linux.org.uk> wrote:  
> > >
> > > From: James Morse <james.morse@....com>
> > >
> > > ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> > > in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> > > says "Each processor in the system must be declared in the ACPI
> > > namespace"). Having two descriptions allows firmware authors to get
> > > this wrong.
> > >
> > > If CPUs are described in the MADT/APIC, they will be brought online
> > > early during boot. Once the register_cpu() calls are moved to ACPI,
> > > they will be based on the DSDT description of the CPUs. When CPUs are
> > > missing from the DSDT description, they will end up online, but not
> > > registered.
> > >
> > > Add a helper that runs after acpi_init() has completed to register
> > > CPUs that are online, but weren't found in the DSDT. Any CPU that
> > > is registered by this code triggers a firmware-bug warning and kernel
> > > taint.
> > >
> > > Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> > > is configured.  
> > 
> > So why is this a kernel problem?  
> 
> So what are you proposing should be the behaviour here? What this
> statement seems to be saying is that QEMU as it exists today only
> describes the first CPU in DSDT.

This confuses me somewhat, because I'm far from sure which machines this
is true for in QEMU.  I'm guessing it's a legacy thing with
some old distro version of QEMU - so we'll have to paper over it anyway
but for current QEMU I'm not sure it's true.

Helpfully there are a bunch of ACPI table tests so I've been checking
through all the multi CPU cases.

CPU hotplug not enabled.
pc/DSDT.dimmpxm  - 4x Processor entries.  -smp 4
pc/DSDT.acpihmat - 2x Processor entries.  -smp 2
q35/DSDT.acpihmat - 2x Processor entries. -smp 2
virt/DSDT.acpihmatvirt - 4x ACPI0007 entries -smp 4
q35/DSDT.acpihmat-noinitiator - 4 x Processor () entries -smp 4 
virt/DSDT.topology - 8x ACPI0007 entries

I've also looked at the code and we have various types of
CPU hotplug on x86 but they all build appropriate numbers of
Processor() entries in DSDT.
Arm likewise seems to build the right number of ACPI0007 entries
(and doesn't yet have CPU HP support).

If anyone can add a reference on why this is needed that would be very
helpful.

> 
> As this patch series changes when arch_register_cpu() gets called (as
> described in the paragraph above) we obviously need to preserve the
> _existing_ behaviour to avoid causing regressions. So, if changing the
> kernel causes user visible regressions (e.g. sysfs entries to
> disappear) then it obviously _is_ a kernel problem that needs to be
> solved.
> 
> We can't say "well fix QEMU then" without invoking the wrath of Linus.

Overall I'm fine with the defensive nature of this patch as there
'might' be firmware out there with this problem - I just can't establish
that there is!  If anyone else recalls the history of this then give
a shout.  I vaguely wondered if this was an ia64 thing but nope, QEMU
never generated tables for ia64 before dropping support back in QEMU 2.11


> 
> > > Signed-off-by: James Morse <james.morse@....com>
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
> > > Reviewed-by: Gavin Shan <gshan@...hat.com>
> > > Tested-by: Miguel Luis <miguel.luis@...cle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@...amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@....com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@...linux.org.uk>
> > > ---
> > >  drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 6a542e0ce396..0511f2bc10bc 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -791,6 +791,25 @@ void __init acpi_processor_init(void)
> > >         acpi_pcc_cpufreq_init();
> > >  }
> > >
> > > +static int __init acpi_processor_register_missing_cpus(void)
> > > +{
> > > +       int cpu;
> > > +
> > > +       if (acpi_disabled)
> > > +               return 0;
> > > +
> > > +       for_each_online_cpu(cpu) {
> > > +               if (!get_cpu_device(cpu)) {
> > > +                       pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> > > +                       add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> > > +                       arch_register_cpu(cpu);  
> > 
> > Which part of this code is related to ACPI?  
> 
> That's a good question, and I suspect it would be more suited to being
> placed in drivers/base/cpu.c except for the problem that the error
> message refers to ACPI.
> 
> As long as we keep the acpi_disabled test, I guess that's fine.
> cpu_dev_register_generic() there already tests acpi_disabled.
> 
Moving it seems fine to me.

Jonathan


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ