Message-ID: <0f038010-ed83-55bb-70a5-24f5c6d68666@gmail.com>
Date: Wed, 12 Oct 2022 16:57:53 -0700
From: Doug Berger <opendmb@...il.com>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Jonathan Corbet <corbet@....net>, Mike Rapoport <rppt@...nel.org>,
Borislav Petkov <bp@...e.de>,
"Paul E. McKenney" <paulmck@...nel.org>,
Neeraj Upadhyay <quic_neeraju@...cinc.com>,
Randy Dunlap <rdunlap@...radead.org>,
Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
Muchun Song <songmuchun@...edance.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Mel Gorman <mgorman@...e.de>,
Mike Kravetz <mike.kravetz@...cle.com>,
Florian Fainelli <f.fainelli@...il.com>,
Oscar Salvador <osalvador@...e.de>,
Michal Hocko <mhocko@...e.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v2 2/9] mm/vmstat: show start_pfn when zone spans pages
On 10/5/2022 11:09 AM, David Hildenbrand wrote:
> On 01.10.22 03:28, Doug Berger wrote:
>> On 9/29/2022 1:15 AM, David Hildenbrand wrote:
>>> On 29.09.22 00:32, Doug Berger wrote:
>>>> A zone that overlaps with another zone may span a range of pages
>>>> that are not present. In this case, displaying the start_pfn of
>>>> the zone allows the zone page range to be identified.
>>>>
>>>
>>> I don't understand the intention here.
>>>
>>> "/* If unpopulated, no other information is useful */"
>>>
>>> Why would the start pfn be of any use here?
>>>
>>> What is the user visible impact without that change?
>> Yes, this is very subtle. I only caught it while testing some
>> pathological cases.
>>
>> If you take the example system:
>> The 7278 device has four ARMv8 CPU cores in an SMP cluster and two
>> memory controllers (MEMCs). Each MEMC is capable of controlling up to
>> 8GB of DRAM. An example 7278 system might have 1GB on each controller,
>> so an arm64 kernel might see 1GB on MEMC0 at 0x40000000-0x7FFFFFFF and
>> 1GB on MEMC1 at 0x300000000-0x33FFFFFFF.
>>
>
> Okay, thanks. You should make it clearer in the patch description --
> especially how this relates to DMB. Having that said, I still have to
> digest your examples:
>
>> Placing a DMB on MEMC0 with 'movablecore=256M@...0000000' will lead to
>> the ZONE_MOVABLE zone spanning from 0x70000000-0x33fffffff and the
>> ZONE_NORMAL zone spanning from 0x300000000-0x33fffffff.
>
> Why is ZONE_MOVABLE spanning more than 256M? It should span
>
> 0x70000000-0x80000000
>
> Or what am I missing?
I was working from the notion that the classic 'movablecore'
implementation keeps ZONE_MOVABLE as the last zone on System RAM, so it
always spans the last page on the node (i.e. 0x33ffff000). My
implementation moves the start of ZONE_MOVABLE to the lowest page of
any DMB defined on the node.
I see that memory hotplug does not behave this way, and its behavior is
probably more intuitive (though less consistent with the classic zone
layout). I could attempt to change this in a v3 if desired.
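
To make the two layouts concrete, here is a minimal sketch (plain Python,
not kernel code) of the span arithmetic described above. The function
names and the NODE_END constant are hypothetical and only illustrate the
example 7278 system from this thread. The first function models the
behavior described for this series, the second the tighter span David
expected.

```python
# Illustrative model only; names are hypothetical, not kernel symbols.
MiB = 1 << 20
NODE_END = 0x33fffffff  # last byte of System RAM in the example system


def classic_style_span(dmbs, node_end=NODE_END):
    """ZONE_MOVABLE runs from the lowest DMB base to the end of the node."""
    start = min(base for base, _size in dmbs)
    return start, node_end


def tight_span(dmbs):
    """ZONE_MOVABLE covers only the DMBs (lowest base to highest end)."""
    start = min(base for base, _size in dmbs)
    end = max(base + size - 1 for base, size in dmbs)
    return start, end


# A single 256M DMB on MEMC0 at 0x70000000, as in the first example.
dmbs = [(0x70000000, 256 * MiB)]
classic = classic_style_span(dmbs)
tight = tight_span(dmbs)
print(hex(classic[0]), hex(classic[1]))
print(hex(tight[0]), hex(tight[1]))
```

Both ends are inclusive here, so the tight span ends at 0x7fffffff, which
is the 0x70000000-0x80000000 range with an exclusive end.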
>
>>
>> If instead you specified 'movablecore=256M@...0000000,512M' you would
>> get the same ZONE_MOVABLE span, but the ZONE_NORMAL would now span
>> 0x300000000-0x32fffffff. The requested 512M of movablecore would be
>> divided into a 256MB DMB at 0x70000000 and a 256MB "classic" movable
>> zone start would be displayed in the bootlog as:
>> [ 0.000000] Movable zone start for each node
>> [ 0.000000] Node 0: 0x000000330000000
>
>
> Okay, so that's the movable zone range excluding DMB.
>
>>
>> Finally, if you specified the pathological
>> 'movablecore=256M@...0000000,1G@...' you would still have the same
>> ZONE_MOVABLE span, and the ZONE_NORMAL span would go back to
>> 0x300000000-0x33fffffff. However, because the second DMB (1G@12G)
>> completely overlaps the ZONE_NORMAL there would be no pages present in
>> ZONE_NORMAL and /proc/zoneinfo would report ZONE_NORMAL 'spanned
>> 262144', but not where those pages are. This commit adds the 'start_pfn'
>> back to the /proc/zoneinfo for ZONE_NORMAL so the span has context.
>
> ... but why? If there are no pages present, there is no ZONE_NORMAL we
> care about. The zone span should be 0. Does this maybe rather indicate
> that there is a zone span processing issue in your DMB implementation?
My implementation uses the zones created by the classic 'movablecore'
behavior and relocates the pages within DMBs. In this case ZONE_NORMAL
still has a span, which gets output, but no present pages, so without
this patch the output didn't show where the zone was. Keeping the span
is a convenience that avoids adding zone resizing and destruction logic
outside of memory hotplug support, but I could attempt to add that code
in a v3 if desired.
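
The numbers in the pathological case can be checked with a small sketch
(plain Python, not kernel code). It assumes 4 KiB pages, which matches
the 'spanned 262144' figure for a 1GB zone, and it assumes the DMB
ranges do not overlap each other. The helper names are hypothetical, and
"present" is modeled simply as spanned pages minus pages claimed by a
DMB.

```python
# Illustrative model only; names are hypothetical, not kernel symbols.
PAGE_SHIFT = 12  # assumption: 4 KiB pages
MiB = 1 << 20
GiB = 1 << 30


def pfn(addr):
    """Page frame number containing the given physical address."""
    return addr >> PAGE_SHIFT


def spanned_pages(start, end_inclusive):
    """Number of pages covered by the zone span, holes included."""
    return pfn(end_inclusive) - pfn(start) + 1


def present_pages(zone, dmbs):
    """Spanned pages minus pages claimed by (non-overlapping) DMBs."""
    zone_start, zone_end = pfn(zone[0]), pfn(zone[1]) + 1
    claimed = 0
    for base, size in dmbs:
        dmb_start, dmb_end = pfn(base), pfn(base + size)
        claimed += max(0, min(zone_end, dmb_end) - max(zone_start, dmb_start))
    return (zone_end - zone_start) - claimed


zone_normal = (0x300000000, 0x33fffffff)
dmbs = [(0x70000000, 256 * MiB), (0x300000000, 1 * GiB)]  # second is 1G@12G

spanned = spanned_pages(*zone_normal)
present = present_pages(zone_normal, dmbs)
start_pfn = pfn(zone_normal[0])
print(spanned, present, hex(start_pfn))
```

In this model the zone spans 262144 pages with none present, so
start_pfn is the only value that says where the span lies.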
>
> Special-casing zones based on DMBs feels wrong. But most probably I am
> missing something important :)
>
Thanks for making me aware of your confusion so I can attempt to make it
clearer.
-Doug