Message-ID: <e9061a5c-bbef-e818-94f7-95e21a73a948@intel.com>
Date: Mon, 24 May 2021 08:52:02 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: Linux-MM <linux-mm@...ck.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...nel.org>,
Nicholas Piggin <npiggin@...il.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/6] mm/page_alloc: Adjust pcp->high after CPU hotplug events

On 5/24/21 2:07 AM, Mel Gorman wrote:
> On Fri, May 21, 2021 at 03:13:35PM -0700, Dave Hansen wrote:
>> On 5/21/21 3:28 AM, Mel Gorman wrote:
>>> The PCP high watermark is based on the number of online CPUs so the
>>> watermarks must be adjusted during CPU hotplug. At the time of
>>> hot-remove, the number of online CPUs is already adjusted but during
>>> hot-add, a delta needs to be applied to update PCP to the correct
>>> value. After this patch is applied, the high watermarks are adjusted
>>> correctly.
>>>
>>> # grep high: /proc/zoneinfo | tail -1
>>> high: 649
>>> # echo 0 > /sys/devices/system/cpu/cpu4/online
>>> # grep high: /proc/zoneinfo | tail -1
>>> high: 664
>>> # echo 1 > /sys/devices/system/cpu/cpu4/online
>>> # grep high: /proc/zoneinfo | tail -1
>>> high: 649
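
For anyone skimming the thread, my mental model of the change is
something like the below. This is only a sketch; the callback name and
the call to zone_pcp_update() are guesses on my part, not lifted from
the patch:

static int page_alloc_cpu_online(unsigned int cpu)
{
	struct zone *zone;

	/* Recompute pcp->high/batch now that another CPU is online */
	for_each_populated_zone(zone)
		zone_pcp_update(zone);

	return 0;
}
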
>> This is actually a comment more about the previous patch, but it doesn't
>> really become apparent until the example above.
>>
>> In your example, you mentioned increased exit() performance by using
>> "vm.percpu_pagelist_fraction to increase the pcp->high value". That's
>> presumably because of the increased batching effects and fewer lock
>> acquisitions.
>>
> Yes
>
>> But, logically, doesn't that mean that, the more CPUs you have in a
>> node, the *higher* you want pcp->high to be? If we took this to the
>> extreme and had an absurd number of CPUs in a node, we could end up with
>> a too-small pcp->high value.
>>
> I see your point but I don't think increasing pcp->high for larger
> numbers of CPUs is the right answer because then reclaim can be
> triggered simply because too many PCPs have pages.
>
> To address your point requires much deeper surgery.
...
> There is value to doing something like this but it's beyond what this
> series is trying to do and doing the work without introducing regressions
> would be very difficult.

Agreed, such a solution is outside the scope of what this set is
trying to do.
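
To put a rough number on why blindly scaling pcp->high with CPU count
is unattractive, here is a back-of-the-envelope helper. The numbers in
the comment are invented for illustration, not measurements from this
series:

static unsigned long pcp_idle_pages_worst_case(unsigned int online_cpus,
					       unsigned long pcp_high)
{
	/* Pages that can sit idle in one zone's per-cpu lists */
	return (unsigned long)online_cpus * pcp_high;
}

/* e.g. 128 CPUs * 649 pages/CPU = 83072 pages, roughly 324MB of 4K pages */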

It would be nice to touch on this counter-intuitive property in the
changelog, and *maybe* add a WARN_ON_ONCE() if we hit an edge case,
for example if pcp->high ends up below pcp->batch*SOMETHING.
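
As a sketch of what I mean (the helper name and the factor of 4 are
placeholders for SOMETHING, not a proposal for actual values):

static void pcp_high_sanity_check(struct per_cpu_pages *pcp)
{
	/* Flag configurations where pcp->high leaves little room for batching */
	WARN_ON_ONCE(pcp->high < pcp->batch * 4);
}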