Message-ID: <c6afc71d-3d8e-44d7-b4be-28f6a8c80f5d@linux.ibm.com>
Date: Thu, 24 Apr 2025 15:17:12 +0530
From: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
To: Libo Chen <libo.chen@...cle.com>, akpm@...ux-foundation.org,
rostedt@...dmis.org, peterz@...radead.org, mgorman@...e.de,
mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
tj@...nel.org, llong@...hat.com
Cc: sraithal@....com, kprateek.nayak@....com, raghavendra.kt@....com,
yu.c.chen@...el.com, tim.c.chen@...el.com, vineethr@...ux.ibm.com,
chris.hyser@...cle.com, daniel.m.jordan@...cle.com,
lorenzo.stoakes@...cle.com, mkoutny@...e.com, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/2] sched/numa: Skip VMA scanning on memory pinned to
one NUMA node via cpuset.mems
On 24/04/25 1:16 pm, Libo Chen wrote:
>
> On 4/24/25 00:05, Venkat Rao Bagalkote wrote:
>> On 24/04/25 8:15 am, Libo Chen wrote:
>>> v1->v2:
>>> 1. add perf improvement numbers in the commit log. Yet to find a perf diff on
>>> will-it-scale, so not included here. Plan to run more workloads.
>>> 2. add tracepoint.
>>> 3. To peterz's comment, this will make it impossible to attract tasks to
>>> that memory, just like other VMA skippings. This is the current
>>> implementation; I think we can improve that in the future, but at the
>>> moment it's probably better to keep it consistent.
>>>
>>> v2->v3:
>>> 1. add enable_cpuset() based on Mel's suggestion, though again I think it's
>>> redundant.
>>> 2. print out nodemask with %*p.. format in the tracepoint.
>>>
>>> v3->v4:
>>> 1. fix an unsafe dereference of a pointer to content not on the ring buffer,
>>> namely mem_allowed_ptr, in the tracepoint.
>>>
>>> v4->v5:
>>> 1. add BUILD_BUG_ON() in TP_fast_assign() to guard against future
>>> changes (particularly in size) to nodemask_t.
>>>
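A rough, illustrative sketch of what such a tracepoint could look like is
below. It is not the actual patch: the event name and fields are assumptions.
It only shows the nodemask being copied by value into the ring buffer, a
BUILD_BUG_ON() guarding against changes in the size of nodemask_t, and the
mask being printed with the kernel's %*pbl bitmap format.

/*
 * Illustrative sketch only, not the actual patch. Per the diffstat this
 * would live in include/trace/events/sched.h; the event name and fields
 * here are assumptions.
 */
TRACE_EVENT(sched_skip_cpuset_numa,

	TP_PROTO(struct task_struct *tsk, nodemask_t *mem_allowed_ptr),

	TP_ARGS(tsk, mem_allowed_ptr),

	TP_STRUCT__entry(
		__array(char,          comm, TASK_COMM_LEN)
		__field(pid_t,         pid)
		__array(unsigned long, mem_allowed, BITS_TO_LONGS(MAX_NUMNODES))
	),

	TP_fast_assign(
		memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
		__entry->pid = task_pid_nr(tsk);
		/* Catch any future change in the size of nodemask_t. */
		BUILD_BUG_ON(sizeof(nodemask_t) !=
			     BITS_TO_LONGS(MAX_NUMNODES) * sizeof(unsigned long));
		/*
		 * Copy the mask by value: the ring buffer must not keep a
		 * pointer to data that may be gone by the time it is read.
		 */
		memcpy(__entry->mem_allowed, mem_allowed_ptr->bits,
		       sizeof(__entry->mem_allowed));
	),

	TP_printk("comm=%s pid=%d mem_nodes_allowed=%*pbl",
		  __entry->comm, __entry->pid,
		  MAX_NUMNODES, __entry->mem_allowed)
);
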
>>> Libo Chen (2):
>>> sched/numa: Skip VMA scanning on memory pinned to one NUMA node via
>>> cpuset.mems
>>> sched/numa: Add tracepoint that tracks the skipping of numa balancing
>>> due to cpuset memory pinning
>>>
>>>  include/trace/events/sched.h | 33 +++++++++++++++++++++++++++++++++
>>>  kernel/sched/fair.c          |  9 +++++++++
>>>  2 files changed, 42 insertions(+)
>>>
>> Hello Libo,
>>
>>
>> For some reason I am not able to apply this patch. I am trying to test the boot warning[1].
>>
>> I am trying to apply it on top of next-20250423. Below is the error. Am I missing anything?
>>
>> [1]: https://lore.kernel.org/all/20250422205740.02c4893a@canb.auug.org.au/
>> Error:
>>
>> git am -i v5_20250423_libo_chen_sched_numa_skip_vma_scanning_on_memory_pinned_to_one_numa_node_via_cpuset_mems.mbx
>> Commit Body is:
>> --------------------------
>> sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
>>
>> When the memory of the current task is pinned to one NUMA node by cgroup,
>> there is no point in continuing the rest of VMA scanning and hinting page
>> faults as they will just be overhead. With this change, there will be no
>> more unnecessary PTE updates or page faults in this scenario.
>>
>> We have seen up to a 6x improvement on a typical Java workload running on
>> VMs with memory and CPU pinned to one NUMA node via cpuset in a two-socket
>> AARCH64 system. With the same pinning, on an 18-cores-per-socket Intel
>> platform, we have seen a 20% improvement in a microbenchmark that creates a
>> 30-vCPU selftest KVM guest with 4GB memory, where each vCPU reads 4KB
>> pages in a fixed number of loops.
>>
>> Signed-off-by: Libo Chen <libo.chen@...cle.com>
>> Tested-by: Chen Yu <yu.c.chen@...el.com>
>> Tested-by: K Prateek Nayak <kprateek.nayak@....com>
>> --------------------------
>> Apply? [y]es/[n]o/[e]dit/[v]iew patch/[a]ccept all: a
>> Applying: sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
>> error: patch failed: kernel/sched/fair.c:3329
>> error: kernel/sched/fair.c: patch does not apply
>> Patch failed at 0001 sched/numa: Skip VMA scanning on memory pinned to one NUMA node via cpuset.mems
>>
>>
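Setting the apply problem aside for a moment, the commit message quoted above
explains the idea, and below is a minimal sketch of the kind of early-return
check it describes. The exact placement in task_numa_work() and the tracepoint
call are assumptions for illustration; the real 9-line hunk in
kernel/sched/fair.c may differ.

/* Illustrative sketch only, not the actual hunk from this series. */
static void task_numa_work(struct callback_head *work)
{
	/*
	 * When cpusets are in use and the task's memory is allowed on
	 * exactly one NUMA node, hinting faults cannot migrate pages
	 * anywhere else, so the whole VMA scan would be pure overhead.
	 */
	if (cpusets_enabled() &&
	    nodes_weight(cpuset_current_mems_allowed) == 1) {
		/* hypothetical tracepoint matching the sketch up-thread */
		trace_sched_skip_cpuset_numa(current, &cpuset_current_mems_allowed);
		return;
	}

	/* ... the existing VMA scanning and PTE update logic follows ... */
}
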
> Hi Venkat,
>
> I just did git am -i t.mbox on top of next-20250423. Not sure why, but the
> second patch was ahead of the first in the apply order. Have you made sure the
> second patch was not applied before the first one?
>
> - Libo
Hi Libo,

Apologies! I did a fresh clone and tried again, and it worked this time. So,
please ignore my earlier mail.
Regards,
Venkat.
>> Regards,
>>
>> Venkat.
>>
>>
>>
>