Message-ID: <370f7851-98b9-5812-7e3d-fea8053fb82c@arm.com>
Date:   Wed, 16 Feb 2022 10:54:21 +0530
From:   Anshuman Khandual <anshuman.khandual@....com>
To:     Alistair Popple <apopple@...dia.com>
Cc:     akpm@...ux-foundation.org, jhubbard@...dia.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, ziy@...dia.com
Subject: Re: [PATCH] mm/pages_alloc.c: Don't create ZONE_MOVABLE beyond the
 end of a node



On 2/15/22 10:46 AM, Alistair Popple wrote:
> Anshuman Khandual <anshuman.khandual@....com> writes:
> 
>> Hi Alistair,
>>
>> On 2/15/22 8:28 AM, Alistair Popple wrote:
>>> ZONE_MOVABLE uses the remaining memory in each node. Its starting pfn
>>> is also aligned to MAX_ORDER_NR_PAGES. It is possible for the remaining
>>> memory in a node to be less than MAX_ORDER_NR_PAGES, meaning there is
>>> not enough room for ZONE_MOVABLE on that node.
>>
>> How plausible is this scenario on normal systems ?
> 
> Probably not very. I happened to run into this on my development/test x86 VM
> which has 8GB and was booted with `numa=fake=4 kernelcore=60%` but in theory I
> guess any system that has a node with less than MAX_ORDER_NR_PAGES left over for
> ZONE_MOVABLE may be susceptible.
> 
> This was the RAM map:
> 
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007ffddfff] usable
> [    0.000000] BIOS-e820: [mem 0x000000007ffde000-0x000000007fffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000027fffffff] usable
> 
> [...]
> 
> [    0.065897] Early memory node ranges
> [    0.065898]   node   0: [mem 0x0000000000001000-0x000000000009efff]
> [    0.065900]   node   0: [mem 0x0000000000100000-0x000000007ffddfff]
> [    0.065902]   node   1: [mem 0x0000000100000000-0x000000017fffffff]
> [    0.065904]   node   2: [mem 0x0000000180000000-0x00000001ffffffff]
> [    0.065906]   node   3: [mem 0x0000000200000000-0x000000027fffffff]
> 
> Note the reserved range from 0x000000007ffde000 to 0x000000007fffffff resulting
> in node-0 ending at 0x000000007ffddfff.
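
To illustrate the arithmetic behind that observation, here is a minimal sketch of why node 0 cannot hold an aligned ZONE_MOVABLE start near its end. It assumes 4 KiB pages and MAX_ORDER_NR_PAGES = 1024; neither value is stated in this thread, and `roundup` plus the candidate pfn are hypothetical stand-ins, not the kernel's code.

```python
PAGE_SHIFT = 12               # assumption: 4 KiB pages
MAX_ORDER_NR_PAGES = 1 << 10  # assumption: 1024 pages per MAX_ORDER block


def roundup(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


# Last byte of node 0, taken from the "Early memory node ranges" log above.
node0_last_byte = 0x7FFDDFFF

# Exclusive end pfn of node 0.
node0_end_pfn = (node0_last_byte + 1) >> PAGE_SHIFT

# Pages between the last MAX_ORDER_NR_PAGES boundary and the node end.
tail_pages = node0_end_pfn % MAX_ORDER_NR_PAGES
last_aligned_pfn = node0_end_pfn - tail_pages

# Hypothetical candidate: any unaligned start pfn that falls in that tail.
candidate_start_pfn = last_aligned_pfn + 1
aligned_start_pfn = roundup(candidate_start_pfn, MAX_ORDER_NR_PAGES)

print(f"node 0 end pfn:       {node0_end_pfn:#x}")
print(f"tail pages:           {tail_pages}")
print(f"aligned movable start {aligned_start_pfn:#x}")
print("beyond node end:     ", aligned_start_pfn > node0_end_pfn)
```

Under these assumptions the node ends partway through a MAX_ORDER block, so rounding a start pfn in that tail up to the alignment lands past the end of the node.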
> 
>> Should not the node always contain MAX_ORDER_NR_PAGES aligned pages ? Also all
>> zones which get created from that node should also be MAX_ORDER_NR_PAGES
>> aligned ?
> 
> I'm not sure why that would be the case given page size and MAX_ORDER_NR_PAGES can
> be set via a kernel configuration parameter. Obviously it wasn't the case here

I assumed that in general that would be the case.

> or this situation would not arise. That said I don't know this code well, and
> this was where I decided to stop shaving this yak so it's possible there is an
> even deeper underlying issue.
> 
> Either way I don't *think* the fix should introduce any problems as it shouldn't
> do anything unless you were going to hit this issue anyway (which took some time
> to track down as the cause wasn't obvious).

Fair enough.

> 
>> I am just curious how a node could end up being like this.
> 
> - Anshuman
> 
