lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200703105944.GS18446@dhcp22.suse.cz>
Date:   Fri, 3 Jul 2020 12:59:44 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Michal Suchánek <msuchanek@...e.de>
Cc:     David Hildenbrand <david@...hat.com>,
        Gautham R Shenoy <ego@...ux.vnet.ibm.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Satheesh Rajendran <sathnaga@...ux.vnet.ibm.com>,
        Mel Gorman <mgorman@...e.de>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linuxppc-dev@...ts.ozlabs.org, Christopher Lameter <cl@...ux.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0
 offline

On Fri 03-07-20 11:24:17, Michal Hocko wrote:
> [Cc Andi]
> 
> On Fri 03-07-20 11:10:01, Michal Suchanek wrote:
> > On Wed, Jul 01, 2020 at 02:21:10PM +0200, Michal Hocko wrote:
> > > On Wed 01-07-20 13:30:57, David Hildenbrand wrote:
> [...]
> > > > Yep, looks like it.
> > > > 
> > > > [    0.009726] SRAT: PXM 1 -> APIC 0x00 -> Node 0
> > > > [    0.009727] SRAT: PXM 1 -> APIC 0x01 -> Node 0
> > > > [    0.009727] SRAT: PXM 1 -> APIC 0x02 -> Node 0
> > > > [    0.009728] SRAT: PXM 1 -> APIC 0x03 -> Node 0
> > > > [    0.009731] ACPI: SRAT: Node 0 PXM 1 [mem 0x00000000-0x0009ffff]
> > > > [    0.009732] ACPI: SRAT: Node 0 PXM 1 [mem 0x00100000-0xbfffffff]
> > > > [    0.009733] ACPI: SRAT: Node 0 PXM 1 [mem 0x100000000-0x13fffffff]
> > > 
> > > This begs a question whether ppc can do the same thing?
> > Or x86 stop doing it so that you can see on what node you are running?
> > 
> > What's the point of this indirection other than another way of avoiding
> > empty node 0?
> 
> Honestly, I do not have any idea. I've traced it down to
> Author: Andi Kleen <ak@...e.de>
> Date:   Tue Jan 11 15:35:48 2005 -0800
> 
>     [PATCH] x86_64: Fix ACPI SRAT NUMA parsing
> 
>     Fix fallout from the recent nodemask_t changes. The node ids assigned
>     in the SRAT parser were off by one.
> 
>     I added a new first_unset_node() function to nodemask.h to allocate
>     IDs sanely.
> 
>     Signed-off-by: Andi Kleen <ak@...e.de>
>     Signed-off-by: Linus Torvalds <torvalds@...l.org>
> 
> which doesn't really tell all that much. The historical baggage and a
> long term behavior which is not really trivial to fix I suspect.

Thinking about this some more, this logic makes some sense afterall.
Especially in the world without memory hotplug which was very likely the
case back then. It is much better to have compact node mask rather than
sparse one. After all node numbers shouldn't really matter as long as
you have a clear mapping to the HW. I am not sure we export that
information (except for the kernel ring buffer) though.

The memory hotplug changes that somehow because you can hotremove numa
nodes and therefore make the nodemask sparse but that is not a common
case. I am not sure what would happen if a completely new node was added
and its corresponding node was already used by the renumbered one
though. It would likely conflate the two I am afraid. But I am not sure
this is really possible with x86 and a lack of a bug report would
suggest that nobody is doing that at least.

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ