Message-ID: <87zgz1pmx4.mognet@arm.com>
Date:   Wed, 17 Mar 2021 20:04:07 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
Cc:     "Peter Zijlstra \(Intel\)" <peterz@...radead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "linux-ia64\@vger.kernel.org" <linux-ia64@...r.kernel.org>,
        Sergei Trofimovich <slyfox@...too.org>,
        debian-ia64 <debian-ia64@...ts.debian.org>
Subject: Re: [PATCH 0/1] sched/topology: NUMA distance deduplication

On 17/03/21 20:47, John Paul Adrian Glaubitz wrote:
> Hello Valentin!
>
> On 3/17/21 8:36 PM, Valentin Schneider wrote:
>> I see ACPI in your boot logs, so I'm guessing you have a bogus SLIT table
>> (the ACPI table with node distances). You should be able to double check
>> this with something like:
>>
>> $ acpidump > acpi.dump
>> $ acpixtract -a acpi.dump
>> $ iasl -d *.dat
>> $ cat slit.dsl
>
> There does not seem to be a SLIT table in my firmware:
>
> root@...ndronach:~# acpidump > acpi.dump
> root@...ndronach:~# acpixtract -a acpi.dump
>
> Intel ACPI Component Architecture
> ACPI Binary Table Extraction Utility version 20200925
> Copyright (c) 2000 - 2020 Intel Corporation
>
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e91
>   SSDT -    3768 bytes written (0x00000EB8) - ssdt1.dat
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e00
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e91
>   SPCR -      80 bytes written (0x00000050) - spcr.dat
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e00
> acpixtract(31194): unaligned access to 0x60000fffff9b3925, ip=0x4000000000003e91
>   APIC -     200 bytes written (0x000000C8) - apic.dat
>   SSDT -    1110 bytes written (0x00000456) - ssdt2.dat
>   SSDT -     316 bytes written (0x0000013C) - ssdt3.dat
>   SPMI -      80 bytes written (0x00000050) - spmi.dat
>   DSDT -   58726 bytes written (0x0000E566) - dsdt.dat
>   SSDT -     312 bytes written (0x00000138) - ssdt4.dat
>   SSDT -    2150 bytes written (0x00000866) - ssdt5.dat
>   SSDT -     316 bytes written (0x0000013C) - ssdt6.dat
>   SSDT -    3768 bytes written (0x00000EB8) - ssdt7.dat
>   FACP -     244 bytes written (0x000000F4) - facp.dat
>   SSDT -    1203 bytes written (0x000004B3) - ssdt8.dat
>   CPEP -      52 bytes written (0x00000034) - cpep.dat
>   SSDT -     316 bytes written (0x0000013C) - ssdt9.dat
>   DBGP -      52 bytes written (0x00000034) - dbgp.dat
>   SSDT -    3768 bytes written (0x00000EB8) - ssdt10.dat
>   FACS -      64 bytes written (0x00000040) - facs.dat
> root@...ndronach:~#
>
> root@...ndronach:~# ls *.dsl *.dat
> apic.dat  cpep.dsl  dsdt.dat  facp.dsl  spcr.dat  spmi.dsl    ssdt1.dat  ssdt2.dsl  ssdt4.dat  ssdt5.dsl  ssdt7.dat  ssdt8.dsl
> apic.dsl  dbgp.dat  dsdt.dsl  facs.dat  spcr.dsl  ssdt10.dat  ssdt1.dsl  ssdt3.dat  ssdt4.dsl  ssdt6.dat  ssdt7.dsl  ssdt9.dat
> cpep.dat  dbgp.dsl  facp.dat  facs.dsl  spmi.dat  ssdt10.dsl  ssdt2.dat  ssdt3.dsl  ssdt5.dat  ssdt6.dsl  ssdt8.dat  ssdt9.dsl
> root@...ndronach:~#
>

Huh, then this might be some initialization failure that leaves nr_node_ids at
MAX_NUMNODES, which must be 256 in your case (NODES_SHIFT==8). Devicetree
can provide node distances, but something tells me you're not using that :-)
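
To spell out the arithmetic, here is a rough user-space sketch (not kernel
code). It assumes MAX_NUMNODES is (1 << NODES_SHIFT), as in the kernel
headers; the helper names and the "empty mask means setup never ran"
fallback are made up for illustration, and only model one way nr_node_ids
could be left at its maximum:

```c
#include <stdint.h>

/* Assumption: mirrors the kernel's MAX_NUMNODES = (1 << NODES_SHIFT). */
static unsigned int max_numnodes(unsigned int nodes_shift)
{
	return 1u << nodes_shift;
}

/*
 * Hypothetical helper: derive a node-id count from a bitmask of possible
 * nodes (bit n set == node n possible). The result is the highest set
 * bit plus one. An empty mask stands in for "setup never ran", in which
 * case the compile-time maximum is returned unchanged.
 */
static unsigned int nr_node_ids_from_mask(uint64_t possible,
					  unsigned int nodes_shift)
{
	unsigned int highest = 0;

	if (possible == 0)
		return max_numnodes(nodes_shift);

	/* Shift right until the mask is empty, counting the steps. */
	while (possible >>= 1)
		highest++;

	return highest + 1;
}
```

With NODES_SHIFT==8, max_numnodes(8) gives 256, while a mask with only
node 0 set gives 1, which is what I'd expect on a single-node box.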

>> a) Complain to your hardware vendor to have them fix the table and ship a
>>    firmware fix
>
> The hardware is probably too old for the vendor to care about fixing it.
>

Indeed, I only realized that after googling your machine.

>> b) Fix the ACPI table yourself - I've been told it's doable for *some* of
>>    them, but I've never done that myself
>> c) Compile your kernel with CONFIG_NUMA=n, as AFAICT you only actually have
>>    a single node
>> d) Ignore the warning
>>
>>
>> c) is clearly not ideal if you want to use a somewhat generic kernel image
>> on a wide host of machines; d) is also a bit yucky...
>
> Shouldn't the kernel be able to cope with quirky hardware? From what I remember in the past,
> ACPI tables used to be broken quite a lot and the kernel contained workarounds for such cases,
> didn't it?
>

Technically it *is* coping with it, it's just dumping the entire NUMA
distance matrix in the process... Let me see if I can't figure out why your
system doesn't end up with nr_node_ids=1.

> Adrian
>
> --
>  .''`.  John Paul Adrian Glaubitz
> : :' :  Debian Developer - glaubitz@...ian.org
> `. `'   Freie Universitaet Berlin - glaubitz@...sik.fu-berlin.de
>   `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913
