Message-ID: <87y30l5jdo.fsf@yhuang-dev.intel.com>
Date: Fri, 26 Jul 2019 15:45:39 +0800
From: "Huang\, Ying" <ying.huang@...el.com>
To: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, <linux-mm@...ck.org>,
<linux-kernel@...r.kernel.org>, Rik van Riel <riel@...hat.com>,
Mel Gorman <mgorman@...e.de>, <jhladky@...hat.com>,
<lvenanci@...hat.com>, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH RESEND] autonuma: Fix scan period updating

Hi, Srikar,

Srikar Dronamraju <srikar@...ux.vnet.ibm.com> writes:

> * Huang, Ying <ying.huang@...el.com> [2019-07-25 16:01:24]:
>
>> From: Huang Ying <ying.huang@...el.com>
>>
>> From the commit log and comments of commit 37ec97deb3a8 ("sched/numa:
>> Slow down scan rate if shared faults dominate"), the autonuma scan
>> period should be increased (scanning is slowed down) if the majority
>> of the page accesses are shared with other processes. But in current
>> code, the scan period will be decreased (scanning is speeded up) in
>> that situation.
>>
>> The commit log and comments make more sense. So this patch fixes the
>> code to make it match the commit log and comments. And this has been
>> verified via tracing the scan period changing and /proc/vmstat
>> numa_pte_updates counter when running a multi-threaded memory
>> accessing program (most memory areas are accessed by multiple
>> threads).
>>
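
To make the inversion concrete, here is a minimal user-space sketch of
the old decision logic.  The fault counts are made up for illustration;
the constants mirror NUMA_PERIOD_SLOTS and NUMA_PERIOD_THRESHOLD in
kernel/sched/fair.c:

    #include <stdio.h>

    #define NUMA_PERIOD_SLOTS     10
    #define NUMA_PERIOD_THRESHOLD  7

    int main(void)
    {
            /* Illustrative fault counts: accesses are mostly shared. */
            int local = 50, remote = 50, private = 10, shared = 90;
            int lr_ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);     /* 5 */
            int ps_ratio = (private * NUMA_PERIOD_SLOTS) / (private + shared); /* 1 */

            if (ps_ratio >= NUMA_PERIOD_THRESHOLD)
                    printf("slow down scanning\n");
            else if (lr_ratio >= NUMA_PERIOD_THRESHOLD)
                    printf("slow down scanning\n");
            else
                    /* Taken here: the old code speeds scanning up even
                     * though shared accesses dominate. */
                    printf("speed up scanning\n");
            return 0;
    }
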
>
> Let's split this into 4 modes.
>
> More Local and Private Page Accesses:
> We definitely want to scan slowly, i.e. increase the scan window.
>
> More Local and Shared Page Accesses:
> We still want to scan slowly: we have already consolidated, and there is
> no point in scanning faster. So scan slowly, i.e. increase the scan
> window. (Remember that an access on any active node counts as local!)
>
> More Remote + Private Page Accesses:
> Most likely, the private accesses are going to be local accesses.
>
> In the unlikely event of the private accesses not being local, we should
> scan faster so that the memory and the task consolidate.
>
> More Remote + Shared Page Accesses:
> This means the workload has not consolidated, so we need to scan faster.
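
Condensing those four modes into a sketch (desired_scan_change() is a
hypothetical helper; mostly_local/mostly_private stand for "the
corresponding ratio is at or above NUMA_PERIOD_THRESHOLD"):

    /*
     * Desired response per mode: +1 means lengthen the scan period
     * (scan slower), -1 means shorten it (scan faster).
     */
    static int desired_scan_change(int mostly_local, int mostly_private)
    {
            if (mostly_local)
                    return +1;      /* local+private, local+shared: slow down */
            if (mostly_private)
                    return -1;      /* remote+private: consolidate faster */
            return -1;              /* remote+shared: definitely scan faster */
    }
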
This sounds reasonable.  But

    lr_ratio < NUMA_PERIOD_THRESHOLD

doesn't necessarily indicate more remote accesses: it also holds when
local equals remote.  And if shared accesses dominate in that case, we
should slow the scanning down.  So the logic could be:
    if (lr_ratio >= NUMA_PERIOD_THRESHOLD)
            slow down scanning
    else if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
            if (NUMA_PERIOD_SLOTS - lr_ratio >= NUMA_PERIOD_THRESHOLD)
                    speed up scanning
            else
                    slow down scanning
    } else
            speed up scanning
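
Spelled out against the existing slot/diff arithmetic in
update_task_scan_period(), that would look roughly like the sketch
below; the exact diff formulas in the speed-up branches are one
plausible choice, not something I have measured:

    lr_ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);
    sp_ratio = (shared * NUMA_PERIOD_SLOTS) / (private + shared);

    if (lr_ratio >= NUMA_PERIOD_THRESHOLD) {
            /* Mostly local: no need to scan fast. */
            int slot = lr_ratio - NUMA_PERIOD_THRESHOLD;

            if (!slot)
                    slot = 1;
            diff = slot * period_slot;
    } else if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
            if (NUMA_PERIOD_SLOTS - lr_ratio >= NUMA_PERIOD_THRESHOLD) {
                    /* Mostly remote and mostly shared: not yet
                     * consolidated, so scan faster. */
                    diff = -(NUMA_PERIOD_THRESHOLD - lr_ratio) * period_slot;
            } else {
                    /* Local ~= remote but mostly shared: already
                     * consolidated, so scan slower. */
                    int slot = sp_ratio - NUMA_PERIOD_THRESHOLD;

                    if (!slot)
                            slot = 1;
                    diff = slot * period_slot;
            }
    } else {
            /* Mostly private but not yet mostly local: scan faster
             * so that the task and its memory consolidate. */
            diff = -(NUMA_PERIOD_THRESHOLD - lr_ratio) * period_slot;
    }
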
Does this follow your idea better?

Best Regards,
Huang, Ying

> So I would think we should go back to the code before commit 37ec97deb3a8.
>
> i.e.:
>
>	int slot = lr_ratio - NUMA_PERIOD_THRESHOLD;
>
>	if (!slot)
>		slot = 1;
>	diff = slot * period_slot;
>
>
> No?
>
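
For anyone reading along without the source at hand: a positive diff
lengthens p->numa_scan_period (slower scanning) and a negative one
shortens it; update_task_scan_period() applies it roughly as:

    p->numa_scan_period = clamp(p->numa_scan_period + diff,
                                task_scan_min(p), task_scan_max(p));
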
>> Fixes: 37ec97deb3a8 ("sched/numa: Slow down scan rate if shared faults dominate")
>> Signed-off-by: "Huang, Ying" <ying.huang@...el.com>
>> Cc: Rik van Riel <riel@...hat.com>
>> Cc: Peter Zijlstra (Intel) <peterz@...radead.org>
>> Cc: Mel Gorman <mgorman@...e.de>
>> Cc: jhladky@...hat.com
>> Cc: lvenanci@...hat.com
>> Cc: Ingo Molnar <mingo@...nel.org>
>> Cc: Andrew Morton <akpm@...ux-foundation.org>
>> ---
>> kernel/sched/fair.c | 20 ++++++++++----------
>> 1 file changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 036be95a87e9..468a1c5038b2 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -1940,7 +1940,7 @@ static void update_task_scan_period(struct task_struct *p,
>> unsigned long shared, unsigned long private)
>> {
>> unsigned int period_slot;
>> - int lr_ratio, ps_ratio;
>> + int lr_ratio, sp_ratio;
>> int diff;
>>
>> unsigned long remote = p->numa_faults_locality[0];
>> @@ -1971,22 +1971,22 @@ static void update_task_scan_period(struct task_struct *p,
>> */
>> period_slot = DIV_ROUND_UP(p->numa_scan_period, NUMA_PERIOD_SLOTS);
>> lr_ratio = (local * NUMA_PERIOD_SLOTS) / (local + remote);
>> - ps_ratio = (private * NUMA_PERIOD_SLOTS) / (private + shared);
>> + sp_ratio = (shared * NUMA_PERIOD_SLOTS) / (private + shared);
>>
>> - if (ps_ratio >= NUMA_PERIOD_THRESHOLD) {
>> + if (sp_ratio >= NUMA_PERIOD_THRESHOLD) {
>> /*
>> - * Most memory accesses are local. There is no need to
>> - * do fast NUMA scanning, since memory is already local.
>> + * Most memory accesses are shared with other tasks.
>> + * There is no point in continuing fast NUMA scanning,
>> + * since other tasks may just move the memory elsewhere.
>
> With this change, I would expect consolidation to take a hit for
> shared page accesses.
>
>> */
>> - int slot = ps_ratio - NUMA_PERIOD_THRESHOLD;
>> + int slot = sp_ratio - NUMA_PERIOD_THRESHOLD;
>> if (!slot)
>> slot = 1;
>> diff = slot * period_slot;
>> } else if (lr_ratio >= NUMA_PERIOD_THRESHOLD) {
>> /*
>> - * Most memory accesses are shared with other tasks.
>> - * There is no point in continuing fast NUMA scanning,
>> - * since other tasks may just move the memory elsewhere.
>> + * Most memory accesses are local. There is no need to
>> + * do fast NUMA scanning, since memory is already local.
>
> Comment-wise this makes sense.
>
>> */
>> int slot = lr_ratio - NUMA_PERIOD_THRESHOLD;
>> if (!slot)
>> @@ -1998,7 +1998,7 @@ static void update_task_scan_period(struct task_struct *p,
>> * yet they are not on the local NUMA node. Speed up
>> * NUMA scanning to get the memory moved over.
>> */
>> - int ratio = max(lr_ratio, ps_ratio);
>> + int ratio = max(lr_ratio, sp_ratio);
>> diff = -(NUMA_PERIOD_THRESHOLD - ratio) * period_slot;
>> }
>>
>> --
>> 2.20.1
>>