Message-ID: <0bd0b744-3d97-b4c3-a4fb-6040f8f8024a@bytedance.com>
Date: Wed, 16 Nov 2022 19:28:10 +0800
From: Zhongkun He <hezhongkun.hzk@...edance.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, corbet@....net,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-api@...r.kernel.org, linux-doc@...r.kernel.org
Subject: Re: [External] Re: [PATCH v2] mm: add new syscall pidfd_set_mempolicy().
Hi Michal,

I've done the performance testing; please take a look.

>> Yes this is all understood but the level of the overhead is not really
>> clear. So the question is whether this will induce a visible overhead.
>> Because from the maintainability point of view it is much less costly to
>> have a clear life time model. Right now we have a mix of reference
>> counting and per-task requirements which is rather subtle and easy to
>> get wrong. In an ideal world we would have get_vma_policy always
>> returning a reference counted policy or NULL. If we really need to
>> optimize for cache line bouncing we can go with per cpu reference
>> counters (something that was not available at the time the mempolicy
>> code has been introduced).
>>
>> So I am not saying that the task_work based solution is not possible I
>> just think that this looks like a good opportunity to get away from the
>> existing subtle model.
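
(Aside on the per-CPU refcount idea above: today this could be built on
the kernel's percpu_ref API. A rough sketch of what a per-CPU-refcounted
mempolicy might look like, purely illustrative and not part of the change
measured below; all "example_" names are made up:)

	#include <linux/percpu-refcount.h>
	#include <linux/slab.h>

	struct example_mempolicy {
		struct percpu_ref refcnt;	/* per-CPU on the get/put fast path */
		/* ... mode, flags, nodemask ... */
	};

	/* Called once the last reference is gone (after percpu_ref_kill()). */
	static void example_mpol_release(struct percpu_ref *ref)
	{
		struct example_mempolicy *pol =
			container_of(ref, struct example_mempolicy, refcnt);

		kfree(pol);	/* the real code would use its kmem_cache */
	}

	/*
	 * percpu_ref_init(&pol->refcnt, example_mpol_release, 0, GFP_KERNEL)
	 * runs at allocation time; get/put then touch only per-CPU counters,
	 * so there is no shared cache line to bounce, until percpu_ref_kill()
	 * switches the ref to atomic mode for teardown.
	 */
	static inline void example_mpol_get(struct example_mempolicy *pol)
	{
		percpu_ref_get(&pol->refcnt);
	}

	static inline void example_mpol_put(struct example_mempolicy *pol)
	{
		percpu_ref_put(&pol->refcnt);
	}
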
Test tools:
	numactl -m 0-3 ./run-mmtests.sh -n -c configs/config-workload-aim9-pagealloc test_name
Modification:
get_vma_policy() and get_task_policy() now always return a
reference-counted policy, except for the static policies
(default_policy and preferred_node_policy[nid]).

All vma manipulation is protected by a down_read of mmap_lock, so
mpol_get() can be called directly to take a refcount on the vma's
mpol. There is no equivalent lock in the task->mempolicy context, so
task->mempolicy has to be protected by task_lock():
struct mempolicy *get_task_policy(struct task_struct *p)
{
	struct mempolicy *pol;
	int node;

	if (p->mempolicy) {
		/*
		 * task_lock() keeps p->mempolicy stable against a
		 * concurrent replacement, so the refcount is taken
		 * on a policy that cannot be freed under us.
		 */
		task_lock(p);
		pol = p->mempolicy;
		mpol_get(pol);		/* tolerates a NULL pol */
		task_unlock(p);
		if (pol)
			return pol;
	}
	.....
}
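
For completeness, here is how the reference pairs up under this model
(a sketch only; the writer side follows the existing do_set_mempolicy()
pattern, and "old"/"new" are illustrative names):

	/*
	 * Reader side: the reference taken under task_lock() in
	 * get_task_policy() pins the policy even if another thread
	 * replaces p->mempolicy concurrently.
	 */
	struct mempolicy *pol = get_task_policy(p);
	/* ... use pol for the allocation decision ... */
	mpol_put(pol);		/* drop the ref; freed when it hits zero */

	/*
	 * Writer side: swap the policy under task_lock(), then drop
	 * the old policy's reference outside the lock.
	 */
	task_lock(p);
	old = p->mempolicy;
	p->mempolicy = new;
	task_unlock(p);
	mpol_put(old);
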
Test Case 1:
Describe:
	Run the benchmark directly, with no other user processes.
Result:
	This degrades performance by about 1% to 3%.
	For more information, please see the attachment: mpol.txt
aim9
Hmean page_test 484561.68 ( 0.00%) 471039.34 * -2.79%*
Hmean brk_test 1400702.48 ( 0.00%) 1388949.10 * -0.84%*
Hmean exec_test 2339.45 ( 0.00%) 2278.41 * -2.61%*
Hmean fork_test 6500.02 ( 0.00%) 6500.17 * 0.00%*
Test Case 2:
Describe:
	A user process (top) was added.
Result:
	This degrades page_test by about 2.1%.
	For more information, please see the attachment: mpol_top.txt
Hmean page_test 477916.47 ( 0.00%) 467829.01 * -2.11%*
Hmean brk_test 1351439.76 ( 0.00%) 1373663.90 * 1.64%*
Hmean exec_test 2312.24 ( 0.00%) 2296.06 * -0.70%*
Hmean fork_test 6483.46 ( 0.00%) 6472.06 * -0.18%*
Test Case 3:
Describe:
	A daemon was added that repeatedly reads /proc/$test_pid/status,
	which takes task_lock:
		while :; do cat /proc/$(pidof singleuser)/status; done
Result:
	The baseline itself drops from 484561 (case 1) to 438591 (about
	10%) once the daemon runs; on top of that, the refcounting
	change degrades page_test by about 3.27%.
	For more information, please see the attachment: mpol_status.txt
Hmean page_test 438591.97 ( 0.00%) 424251.22 * -3.27%*
Hmean brk_test 1268906.57 ( 0.00%) 1278100.12 * 0.72%*
Hmean exec_test 2301.19 ( 0.00%) 2192.71 * -4.71%*
Hmean fork_test 6453.24 ( 0.00%) 6090.48 * -5.62%*
Thanks,
Zhongkun.
Attachments:
	mpol.txt (text/plain, 14076 bytes)
	mpol_status.txt (text/plain, 7803 bytes)
	mpol_top.txt (text/plain, 7731 bytes)