Message-ID: <ac85fe73b3d7f46ce96d5033f8cf58a1b6b001f3.camel@intel.com>
Date: Tue, 15 Oct 2024 03:23:27 +0000
From: "Zhang, Rui" <rui.zhang@...el.com>
To: "jmattson@...gle.com" <jmattson@...gle.com>
CC: "ajorgens@...gle.com" <ajorgens@...gle.com>, "myrade@...gle.com"
	<myrade@...gle.com>, "bp@...en8.de" <bp@...en8.de>, "x86@...nel.org"
	<x86@...nel.org>, "peterz@...radead.org" <peterz@...radead.org>, "Tang, Feng"
	<feng.tang@...el.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"Wysocki, Rafael J" <rafael.j.wysocki@...el.com>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>, "jay.chen@....com"
	<jay.chen@....com>, "vladteodor@...gle.com" <vladteodor@...gle.com>,
	"jon.grimm@....com" <jon.grimm@....com>
Subject: Re: [RFC PATCH] x86/acpi: Ignore invalid x2APIC entries

On Mon, 2024-10-14 at 11:00 -0700, Jim Mattson wrote:
> On Mon, Oct 14, 2024 at 6:05 AM Zhang, Rui <rui.zhang@...el.com>
> wrote:
> > 
> > > > > 
> > > > > TBH, I'm not sure that there is actually anything wrong with
> > > > > the
> > > > > new
> > > > > numbering scheme.
> > > > >  The topology is reported correctly (e.g. in
> > > > > /sys/devices/system/cpu/cpu0/topology/thread_siblings_list).
> > > > > Yet,
> > > > > the
> > > > > new enumeration does seem to contradict user expectations.
> > > > > 
> > > > 
> > > > Well, we can say this is a violation of the ACPI spec.
> > > > "OSPM should initialize processors in the order that they
> > > > appear in
> > > > the
> > > > MADT." even for interleaved LAPIC and X2APIC entries.
> > > 
> > > Ah. Thanks. I didn't know that.
> > > 
> > > > Maybe we need two steps for LAPIC/X2APIC parsing.
> > > > 1. check if there is valid LAPIC entry by going through all
> > > > LAPIC
> > > > entries first
> > > > 2. parse LAPIC/X2APIC strictly following the order in MADT.
> > > > (like
> > > > we do
> > > > before)
> > > 
> > > That makes sense to me.
> > > 
> > > Thanks,
> > > 
> > > --jim
> > 
> > Hi, Jim,
> > 
> > Please check if below patch restores the CPU IDs or not.
> > 
> > thanks,
> > rui
> > 
> > From ec786dfe693cad2810b54b0d8afbfc7e4c4b3f8a Mon Sep 17 00:00:00
> > 2001
> > From: Zhang Rui <rui.zhang@...el.com>
> > Date: Mon, 14 Oct 2024 13:26:55 +0800
> > Subject: [PATCH] x86/acpi: Fix LAPIC/x2APIC parsing order
> > 
> > On some systems, the same CPU (with same APIC ID) is assigned with
> > a
> > different logical CPU id after commit ec9aedb2aa1a ("x86/acpi:
> > Ignore
> > invalid x2APIC entries").
> > 
> > This means Linux enumerates the CPUs in a different order and it is
> > a
> > violation of
> > https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#madt-processor-local-apic-sapic-structure-entry-order
> > ,
> > 
> >   "OSPM should initialize processors in the order that they appear
> > in
> >    the MADT"
> > 
> > The offending commit wants to ignore x2APIC entries with APIC ID <
> > 255
> > when valid LAPIC entries exist, so it parses all LAPIC entries
> > before
> > parsing any x2APIC entries. This breaks the CPU enumeration order
> > for
> > systems that have x2APIC entries listed before LAPIC entries in
> > MADT.
> > 
> > Fix the problem by checking the valid LAPIC entries separately,
> > before
> > parsing any LAPIC/x2APIC entries.
> > 
> > Cc: stable@...r.kernel.org
> > Reported-by: Jim Mattson <jmattson@...gle.com>
> > Closes:
> > https://lore.kernel.org/all/20241010213136.668672-1-jmattson@google.com/
> > Fixes: ec9aedb2aa1a ("x86/acpi: Ignore invalid x2APIC entries")
> > Signed-off-by: Zhang Rui <rui.zhang@...el.com>
> > ---
> >  arch/x86/kernel/acpi/boot.c | 50
> > +++++++++++++++++++++++++++++++++----
> >  1 file changed, 45 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/acpi/boot.c
> > b/arch/x86/kernel/acpi/boot.c
> > index 4efecac49863..c70b86f1f295 100644
> > --- a/arch/x86/kernel/acpi/boot.c
> > +++ b/arch/x86/kernel/acpi/boot.c
> > @@ -226,6 +226,28 @@ acpi_parse_x2apic(union acpi_subtable_headers
> > *header, const unsigned long end)
> >         return 0;
> >  }
> > 
> > +static int __init
> > +acpi_check_lapic(union acpi_subtable_headers *header, const
> > unsigned long end)
> > +{
> > +       struct acpi_madt_local_apic *processor = NULL;
> > +
> > +       processor = (struct acpi_madt_local_apic *)header;
> > +
> > +       if (BAD_MADT_ENTRY(processor, end))
> > +               return -EINVAL;
> > +
> > +       /* Ignore invalid ID */
> > +       if (processor->id == 0xff)
> > +               return 0;
> > +
> > +       /* Ignore processors that can not be onlined */
> > +       if (!acpi_is_processor_usable(processor->lapic_flags))
> > +               return 0;
> > +
> > +       has_lapic_cpus = true;
> > +       return 0;
> > +}
> > +
> >  static int __init
> >  acpi_parse_lapic(union acpi_subtable_headers * header, const
> > unsigned long end)
> >  {
> > @@ -257,7 +279,6 @@ acpi_parse_lapic(union acpi_subtable_headers *
> > header, const unsigned long end)
> >                                processor->processor_id, /* ACPI ID
> > */
> >                                processor->lapic_flags &
> > ACPI_MADT_ENABLED);
> > 
> > -       has_lapic_cpus = true;
> >         return 0;
> >  }
> > 
> > @@ -1029,6 +1050,8 @@ static int __init
> > early_acpi_parse_madt_lapic_addr_ovr(void)
> >  static int __init acpi_parse_madt_lapic_entries(void)
> >  {
> >         int count, x2count = 0;
> > +       struct acpi_subtable_proc madt_proc[2];
> > +       int ret;
> > 
> >         if (!boot_cpu_has(X86_FEATURE_APIC))
> >                 return -ENODEV;
> > @@ -1037,10 +1060,27 @@ static int __init
> > acpi_parse_madt_lapic_entries(void)
> >                                       acpi_parse_sapic,
> > MAX_LOCAL_APIC);
> > 
> >         if (!count) {
> > -               count =
> > acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_APIC,
> > -                                       acpi_parse_lapic,
> > MAX_LOCAL_APIC);
> > -               x2count =
> > acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_X2APIC,
> > -                                       acpi_parse_x2apic,
> > MAX_LOCAL_APIC);
> 
> The point is moot now, but I don't think the previous code did the
> right thing when acpi_table_parse_madt() returned a negative value
> (for errors).

Both the previous and the current code check for a negative value later,
after parsing both the LAPIC and x2APIC entries.
So what is the problem you're referring to?
Do you mean we should error out immediately when parsing LAPIC fails?

> 
> > +               /* Check if there are valid LAPIC entries */
> > +               acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_APIC,
> > acpi_check_lapic, MAX_LOCAL_APIC);
> 
> Two comments:
> 
> 1) Should we check for a return value < 0 here, or just wait for one
> of the later walks to error out?

I'm okay with either.

> 2) It seems unfortunate to walk the entire table when the first entry
> may give you the answer, but perhaps modern systems have only X2APIC
> entries, so we will typically have to walk the entire table anyway.

Yeah. There are systems that list invalid LAPIC entries first, and
acpi_parse_entries_array() doesn't support graceful early termination,
so we have to check all the entries.
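
For illustration only, here is a rough userspace sketch of the two-step
idea: first walk every entry to learn whether a valid LAPIC entry exists,
then enumerate LAPIC/x2APIC entries strictly in table order. This is not
the kernel code. struct madt_entry, has_valid_lapic() and
enumerate_cpus() are hypothetical names made up for the sketch, and the
"usable" field merely stands in for acpi_is_processor_usable(). The first
pass deliberately walks the whole table without breaking out early, to
mirror the constraint described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for MADT subtable entries. */
enum entry_type { ENTRY_LAPIC, ENTRY_X2APIC };

struct madt_entry {
	enum entry_type type;
	uint32_t apic_id;
	bool usable;	/* stands in for acpi_is_processor_usable() */
};

/* Step 1: walk every entry and note whether a valid LAPIC entry exists. */
static bool has_valid_lapic(const struct madt_entry *e, size_t n)
{
	bool found = false;

	for (size_t i = 0; i < n; i++) {
		if (e[i].type != ENTRY_LAPIC)
			continue;
		if (e[i].apic_id == 0xff)	/* invalid LAPIC ID */
			continue;
		if (!e[i].usable)		/* can not be onlined */
			continue;
		found = true;			/* no early exit, see above */
	}
	return found;
}

/*
 * Step 2: enumerate CPUs strictly in table order. x2APIC entries with
 * APIC ID < 0xff are ignored when valid LAPIC entries exist.
 * Returns the number of APIC IDs stored in out[].
 */
static size_t enumerate_cpus(const struct madt_entry *e, size_t n,
			     uint32_t *out, size_t out_cap)
{
	bool lapic_present = has_valid_lapic(e, n);
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!e[i].usable)
			continue;
		if (e[i].type == ENTRY_LAPIC) {
			if (e[i].apic_id == 0xff)
				continue;
		} else {
			if (e[i].apic_id == 0xffffffffu)
				continue;
			if (lapic_present && e[i].apic_id < 0xff)
				continue;
		}
		if (count < out_cap)
			out[count++] = e[i].apic_id;
	}
	return count;
}
```

With a table that lists x2APIC 0x100 before LAPIC 0, the sketch reports
0x100 first and 0 second, i.e. the MADT order is preserved, while an
x2APIC entry with ID 1 in the same table is skipped.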

> 
> Reviewed-and-tested-by: Jim Mattson <jmattson@...gle.com>

Thanks. I will submit the current version to keep your tags.

-rui
