lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFNjLiWK-h_OSP8Q9JyoLfsoDXo=ih27AFbVVXqzzwNkPVQKTg@mail.gmail.com>
Date:   Fri, 1 Dec 2023 15:32:31 -0800
From:   John Sperbeck <jsperbeck@...gle.com>
To:     "Zhang, Rui" <rui.zhang@...el.com>
Cc:     "tip-bot2@...utronix.de" <tip-bot2@...utronix.de>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-tip-commits@...r.kernel.org" 
        <linux-tip-commits@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [tip: x86/urgent] x86/acpi: Ignore invalid x2APIC entries

On Fri, Dec 1, 2023 at 12:32 AM Zhang, Rui <rui.zhang@...el.com> wrote:
>
> On Thu, 2023-11-23 at 20:50 +0800, Exchange Online wrote:
> > Hi, John,
> >
> > Thanks for catching this issue.
> >
> > On Wed, 2023-11-22 at 22:19 +0000, John Sperbeck wrote:
> > > I have a platform with both LOCAL_APIC and LOCAL_X2APIC entries for
> > > each CPU.  However, the ids for the LOCAL_APIC entries are all
> > > invalid ids of 255, so they have always been skipped in
> > > acpi_parse_lapic()
> > > by this code from f3bf1dbe64b6 ("x86/acpi: Prevent LAPIC id 0xff
> > > from
> > > being
> > > accounted"):
> > >
> > >     /* Ignore invalid ID */
> > >     if (processor->id == 0xff)
> > >             return 0;
> > >
> > > With the change in this thread, the return value of 0 means that
> > > the
> > > 'count' variable in acpi_parse_entries_array() is incremented.  The
> > > positive return value means that 'has_lapic_cpus' is set, even
> > > though
> > > no entries were actually matched.
> >
> > So in acpi_parse_madt_lapic_entries, without this patch,
> > madt_proc[0].count is a positive value on this platform, right?
> >
> > This sounds like a potential issue because the following checks to
> > fall
> > back to MPS mode can also break. (If all LOCAL_APIC entries have
> > apic_id 0xff and all LOCAL_X2APIC entries have apic_id 0xffffffff)
> >
> > >   Then, when the MADT is iterated
> > > with acpi_parse_x2apic(), the x2apic entries with ids less than 255
> > > are skipped and most of my CPUs aren't recognized.
> > >
> > > I think the original version of this change was okay for this case
> > > in
> > > https://lore.kernel.org/lkml/87pm4bp54z.ffs@tglx/T/
> >
> > Yeah.
> >
> > But if we want to fix the potential issue above, we need to do
> > something more.
> >
> > Say we can still use acpi_table_parse_entries_array() and convert
> > acpi_parse_lapic()/acpi_parse_x2apic() to
> > acpi_subtable_proc.handler_arg and save the real valid entries via
> > the
> > parameter.
> >
> > or can we just use num_processors & disabled_cpus to check if there
> > is
> > any CPU probed when parsing LOCAL_APIC/LOCAL_X2APIC entires?
> >
>
> Hi, John,
>
> As a quick fix, I'm not going to fix the "potential issue" describes
> above because we have not seen a real problem caused by this yet.
>
> Can you please try the below patch to confirm if the problem is gone on
> your system?
> This patch falls back to the previous way as sent at
> https://lore.kernel.org/lkml/87pm4bp54z.ffs@tglx/T/
>
> thanks,
> rui
>
> From bdb45e241b4fea8a12b958e490979e96b064e43d Mon Sep 17 00:00:00 2001
> From: Zhang Rui <rui.zhang@...el.com>
> Date: Fri, 1 Dec 2023 15:06:34 +0800
> Subject: [PATCH] x86/acpi: Do strict X2APIC ID check only when an enabled CPU
>  is enumerated via LAPIC
>
> Commit 8e9c42d776d6 ("x86/acpi: Ignore invalid x2APIC entries") does
> strict X2APIC ID check if LAPIC contains valid CPUs by checking the
> acpi_table_parse_madt() return value.
>
> This is wrong because acpi_table_parse_madt() return value only
> represents the number of legal entries parsed. For example, LAPIC entry
> with LAPIC ID 0xff is counted as a legal entry, but it doesn't describe
> a valid CPU.
>
> This causes issues on a system which has 0xff LAPIC ID in all LAPIC
> entries. Because the code does strict X2APIC IDs check and ignores most
> of the CPUs in X2APIC entries.
>
> Fix the problem by doing strict X2APIC ID check less aggressively, say
> only when an enabled CPU is enumerated via LAPIC.
>
> Fixes: 8e9c42d776d6 ("x86/acpi: Ignore invalid x2APIC entries")
> Link: https://lore.kernel.org/all/20231122221947.781812-1-jsperbeck@google.com/
> Reported-by: John Sperbeck <jsperbeck@...gle.com>
> Signed-off-by: Zhang Rui <rui.zhang@...el.com>
> ---
>  arch/x86/kernel/acpi/boot.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 1a0dd80d81ac..8cc566ce486a 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -266,6 +266,7 @@ static int __init
>  acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
>  {
>         struct acpi_madt_local_apic *processor = NULL;
> +       int cpu;
>
>         processor = (struct acpi_madt_local_apic *)header;
>
> @@ -289,9 +290,13 @@ acpi_parse_lapic(union acpi_subtable_headers * header, const unsigned long end)
>          * to not preallocating memory for all NR_CPUS
>          * when we use CPU hotplug.
>          */
> -       acpi_register_lapic(processor->id,      /* APIC ID */
> -                           processor->processor_id, /* ACPI ID */
> -                           processor->lapic_flags & ACPI_MADT_ENABLED);
> +       cpu = acpi_register_lapic(processor->id,        /* APIC ID */
> +                                 processor->processor_id, /* ACPI ID */
> +                                 processor->lapic_flags & ACPI_MADT_ENABLED);
> +
> +       /* Do strict X2APIC ID check only when an enabled CPU is enumerated via LAPIC */
> +       if (cpu >= 0 )
> +               has_lapic_cpus = true;
>
>         return 0;
>  }
> @@ -1134,7 +1139,6 @@ static int __init acpi_parse_madt_lapic_entries(void)
>         if (!count) {
>                 count = acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_APIC,
>                                         acpi_parse_lapic, MAX_LOCAL_APIC);
> -               has_lapic_cpus = count > 0;
>                 x2count = acpi_table_parse_madt(ACPI_MADT_TYPE_LOCAL_X2APIC,
>                                         acpi_parse_x2apic, MAX_LOCAL_APIC);
>         }
> --
> 2.34.1
>

Yes, with that patch, the problem is gone on my system.  All of the
CPUs are recognized and active.

Before the patch (only one CPU active):

    # cat /proc/cpuinfo | grep '^processor' | wc -l
    1

With the patch (all CPUs active):

    oxco8:~# cat /proc/cpuinfo | grep '^processor' | wc -l
    112

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ