Date:   Tue, 15 Feb 2022 16:16:28 +1100
From:   Alistair Popple <apopple@...dia.com>
To:     Anshuman Khandual <anshuman.khandual@....com>
Cc:     akpm@...ux-foundation.org, jhubbard@...dia.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, ziy@...dia.com
Subject: Re: [PATCH] mm/pages_alloc.c: Don't create ZONE_MOVABLE beyond the
 end of a node

Anshuman Khandual <anshuman.khandual@....com> writes:

> Hi Alistair,
>
> On 2/15/22 8:28 AM, Alistair Popple wrote:
>> ZONE_MOVABLE uses the remaining memory in each node. Its starting pfn
>> is also aligned to MAX_ORDER_NR_PAGES. It is possible for the remaining
>> memory in a node to be less than MAX_ORDER_NR_PAGES, meaning there is
>> not enough room for ZONE_MOVABLE on that node.
>
> How plausible is this scenario on normal systems ?

Probably not very. I happened to run into this on my development/test x86 VM
which has 8GB and was booted with `numa=fake=4 kernelcore=60%` but in theory I
guess any system that has a node with less than MAX_ORDER_NR_PAGES left over for
ZONE_MOVABLE may be susceptible.

This was the RAM map:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007ffddfff] usable
[    0.000000] BIOS-e820: [mem 0x000000007ffde000-0x000000007fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000027fffffff] usable

[...]

[    0.065897] Early memory node ranges
[    0.065898]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.065900]   node   0: [mem 0x0000000000100000-0x000000007ffddfff]
[    0.065902]   node   1: [mem 0x0000000100000000-0x000000017fffffff]
[    0.065904]   node   2: [mem 0x0000000180000000-0x00000001ffffffff]
[    0.065906]   node   3: [mem 0x0000000200000000-0x000000027fffffff]

Note the reserved range from 0x000000007ffde000 to 0x000000007fffffff resulting
in node-0 ending at 0x000000007ffddfff.
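
To make the arithmetic concrete, here is a small sketch of why that end address matters. It is a stand-alone illustration in Python, not kernel code. It assumes 4 KiB pages and MAX_ORDER_NR_PAGES = 1024 (both are configuration dependent), and the candidate start pfn is a made-up value chosen only to fall in the unaligned tail of node 0.

```python
PAGE_SHIFT = 12              # assumption: 4 KiB pages
MAX_ORDER_NR_PAGES = 1024    # assumption: 1 << (MAX_ORDER - 1) with MAX_ORDER = 11


def roundup(x, align):
    """Round x up to the next multiple of align."""
    return ((x + align - 1) // align) * align


# Last byte of node 0, taken from the "Early memory node ranges" output.
node0_last_byte = 0x7ffddfff
node0_end_pfn = (node0_last_byte + 1) >> PAGE_SHIFT   # exclusive end pfn

# Pages between the last MAX_ORDER_NR_PAGES boundary and the end of the node.
tail_pages = node0_end_pfn % MAX_ORDER_NR_PAGES
last_boundary = node0_end_pfn - tail_pages

# Hypothetical unaligned ZONE_MOVABLE start just past that boundary.
candidate_start = last_boundary + 1
aligned_start = roundup(candidate_start, MAX_ORDER_NR_PAGES)

print(hex(node0_end_pfn), tail_pages, hex(aligned_start))
```

Under these assumptions node 0 ends 990 pages past an aligned boundary, so any start pfn inside that tail rounds up to a pfn beyond the end of the node.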

> Should not the node always contain MAX_ORDER_NR_PAGES aligned pages ? Also all
> zones which get created from that node should also be MAX_ORDER_NR_PAGES
> aligned ?

I'm not sure why that would be the case given page size and MAX_ORDER_NR_PAGES can
be set via a kernel configuration parameter. Obviously it wasn't the case here
or this situation would not arise. That said I don't know this code well, and
this was where I decided to stop shaving this yak so it's possible there is an
even deeper underlying issue.

Either way I don't *think* the fix should introduce any problems as it shouldn't
do anything unless you were going to hit this issue anyway (which took some time
to track down as the cause wasn't obvious).
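
The shape of such a check can be sketched as follows. This is an illustration in Python rather than the patch itself: `clamp_movable_start` and its arguments are hypothetical names, MAX_ORDER_NR_PAGES = 1024 is an assumption, and returning `None` is just this sketch's way of saying "no ZONE_MOVABLE on this node".

```python
MAX_ORDER_NR_PAGES = 1024    # assumption: configuration dependent


def clamp_movable_start(movable_start_pfn, node_end_pfn):
    """Align a candidate ZONE_MOVABLE start pfn (hypothetical helper).

    Returns the aligned start pfn, or None when alignment pushes the
    start to or beyond the (exclusive) end pfn of the node.
    """
    # Ceiling division, then scale back up to a multiple of the alignment.
    aligned = -(-movable_start_pfn // MAX_ORDER_NR_PAGES) * MAX_ORDER_NR_PAGES
    if aligned >= node_end_pfn:
        return None          # not enough room left on this node
    return aligned


# A start well inside the node is only aligned, never dropped.
print(clamp_movable_start(0x40001, 0x7ffde))
# A start in the unaligned tail of the node is dropped.
print(clamp_movable_start(0x7fc01, 0x7ffde))
```

In this sketch the guard only changes the result for nodes where the aligned start would otherwise land outside the node.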

> I am just curious how a node could end up being like this.

> - Anshuman
