Message-ID: <ec69adf2-4eb5-4e38-804f-804d1dde0e84@oracle.com>
Date: Thu, 24 Apr 2025 00:46:34 -0700
From: Libo Chen <libo.chen@...cle.com>
To: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>, akpm@...ux-foundation.org,
rostedt@...dmis.org, peterz@...radead.org, mgorman@...e.de,
mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
tj@...nel.org, llong@...hat.com
Cc: sraithal@....com, kprateek.nayak@....com, raghavendra.kt@....com,
yu.c.chen@...el.com, tim.c.chen@...el.com, vineethr@...ux.ibm.com,
chris.hyser@...cle.com, daniel.m.jordan@...cle.com,
lorenzo.stoakes@...cle.com, mkoutny@...e.com, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/2] sched/numa: Skip VMA scanning on memory pinned to
one NUMA node via cpuset.mems
On 4/24/25 00:05, Venkat Rao Bagalkote wrote:
>
> On 24/04/25 8:15 am, Libo Chen wrote:
>> v1->v2:
>> 1. add perf improvement numbers in the commit log. Yet to find a perf diff on
>> will-it-scale, so not included here. Plan to run more workloads.
>> 2. add tracepoint.
>> 3. To peterz's comment: this will make it impossible to attract tasks to
>> that memory, just like other VMA skippings. This is the current
>> implementation; I think we can improve it in the future, but at the
>> moment it's probably better to keep it consistent.
>>
>> v2->v3:
>> 1. add a cpusets_enabled() check based on Mel's suggestion, but again I
>> think it's redundant.
>> 2. print out nodemask with %*p.. format in the tracepoint.
>>
>> v3->v4:
>> 1. fix an unsafe dereference of a pointer to content not on the ring buffer,
>> namely mem_allowed_ptr in the tracepoint.
>>
>> v4->v5:
>> 1. add BUILD_BUG_ON() in TP_fast_assign() to guard against future
>> changes to nodemask_t (particularly in size); a sketch of the resulting
>> tracepoint follows the diffstat below.
>>
>> Libo Chen (2):
>> sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
>> cpuset.mems
>> sched/numa: Add tracepoint that tracks the skipping of numa balancing
>> due to cpuset memory pinning
>>
>> include/trace/events/sched.h | 33 +++++++++++++++++++++++++++++++++
>> kernel/sched/fair.c | 9 +++++++++
>> 2 files changed, 42 insertions(+)
>>
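For reference, the tracepoint these changelog items describe might look roughly
like the sketch below (it lives in include/trace/events/sched.h per the diffstat).
This is a minimal sketch, not the patch itself: the event name, field set, and
output layout are assumptions based on the cover letter; %*pbl is the kernel's
bitmap-list format specifier alluded to by the "%*p.." item above.

    TRACE_EVENT(sched_skip_cpuset_numa,

            TP_PROTO(struct task_struct *tsk, nodemask_t *mem_allowed_ptr),

            TP_ARGS(tsk, mem_allowed_ptr),

            TP_STRUCT__entry(
                    __array(char,          comm,        TASK_COMM_LEN)
                    __field(pid_t,         pid)
                    __array(unsigned long, mem_allowed, BITS_TO_LONGS(MAX_NUMNODES))
            ),

            TP_fast_assign(
                    memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
                    __entry->pid = task_pid_nr(tsk);

                    /*
                     * Copy the nodemask onto the ring buffer instead of
                     * recording mem_allowed_ptr itself (the v4 fix); the
                     * BUILD_BUG_ON() is the v5 guard against nodemask_t
                     * changing size in the future.
                     */
                    BUILD_BUG_ON(sizeof(nodemask_t) !=
                                 BITS_TO_LONGS(MAX_NUMNODES) * sizeof(unsigned long));
                    memcpy(__entry->mem_allowed, mem_allowed_ptr,
                           sizeof(nodemask_t));
            ),

            TP_printk("comm=%s pid=%d mem_nodes=%*pbl",
                      __entry->comm, __entry->pid,
                      MAX_NUMNODES, __entry->mem_allowed)
    );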
> Hello Libo,
>
>
> For some reason I am not able to apply this patch. I am trying to test the boot warning[1].
>
> I am trying to apply it on top of next-20250423. Below is the error. Am I missing anything?
>
> [1]: https://lore.kernel.org/all/20250422205740.02c4893a@canb.auug.org.au/
> Error:
>
> git am -i v5_20250423_libo_chen_sched_numa_skip_vma_scanning_on_memory_pinned_to_one_numa_node_via_cpuset_mems.mbx
> Commit Body is:
> --------------------------
> sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
>
> When the memory of the current task is pinned to one NUMA node by cgroup,
> there is no point in continuing the rest of VMA scanning and hinting page
> faults, as they will just be overhead. With this change, there will be no
> more unnecessary PTE updates or page faults in this scenario (see the
> fair.c sketch after the quoted output below).
>
> We have seen up to a 6x improvement on a typical Java workload running on
> VMs with memory and CPU pinned to one NUMA node via cpuset in a two-socket
> AARCH64 system. With the same pinning, on an 18-cores-per-socket Intel
> platform, we have seen a 20% improvement in a microbenchmark that creates a
> 30-vCPU selftest KVM guest with 4GB memory, where each vCPU reads 4KB
> pages in a fixed number of loops.
>
> Signed-off-by: Libo Chen <libo.chen@...cle.com>
> Tested-by: Chen Yu <yu.c.chen@...el.com>
> Tested-by: K Prateek Nayak <kprateek.nayak@....com>
> --------------------------
> Apply? [y]es/[n]o/[e]dit/[v]iew patch/[a]ccept all: a
> Applying: sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
> error: patch failed: kernel/sched/fair.c:3329
> error: kernel/sched/fair.c: patch does not apply
> Patch failed at 0001 sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
>
>
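As an aside for readers following the thread: the skip described in the quoted
commit message amounts to an early return near the top of task_numa_work() in
kernel/sched/fair.c. A minimal sketch, assuming the guard combines the
cpusets_enabled() check mentioned in the v3 changelog with nodes_weight(), and
reusing the (assumed) tracepoint name from the sketch earlier on this page:

    /*
     * If the task's memory is pinned to a single NUMA node via
     * cpuset.mems, none of its pages can migrate anywhere else,
     * so VMA scanning and NUMA hinting faults are pure overhead.
     */
    if (cpusets_enabled() &&
        nodes_weight(cpuset_current_mems_allowed) == 1) {
            trace_sched_skip_cpuset_numa(current,
                                         &cpuset_current_mems_allowed);
            return;
    }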
Hi Venkat,
I just did git am -i t.mbox on top of next-20250423. Not sure why, but the second
patch was ahead of the first in apply order. Have you made sure the second patch
was not applied before the first one?
- Libo
> Regards,
>
> Venkat.
>
>
>