Message-ID: <24240fe0-00ca-a9cc-6087-1de720951896@huawei.com>
Date: Fri, 31 May 2024 16:50:35 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: <davem@...emloft.net>, <pabeni@...hat.com>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Alexander Duyck <alexander.duyck@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH net-next v5 01/13] mm: page_frag: add a test module for
page_frag
On 2024/5/30 23:16, Jakub Kicinski wrote:
> On Thu, 30 May 2024 17:17:17 +0800 Yunsheng Lin wrote:
>>> Is this test actually meaningfully testing page_frag or rather
>>> the objpool construct and the scheduler? :S
>>
>> For the objpool part, I guess it is fair to say that it is a
>> meaningful test for both page_frag and objpool if either of
>> them changes.
>
> Why guess when you can measure it.
> Slow one down and see if it impacts the benchmark.
Before the slowdown, on an arm64 system:
 Performance counter stats for 'insmod ./page_frag_test.ko test_push_cpu=16 test_pop_cpu=17' (500 runs):

         19.420606      task-clock (msec)         #    0.001 CPUs utilized            ( +-  0.82% )
                 7      context-switches          #    0.377 K/sec                    ( +-  0.30% )
                 1      cpu-migrations            #    0.038 K/sec                    ( +-  2.82% )
                84      page-faults               #    0.004 M/sec                    ( +-  0.06% )
          50423999      cycles                    #    2.596 GHz                      ( +-  0.82% )
          35558295      instructions              #    0.71  insn per cycle           ( +-  0.09% )
           8340405      branches                  #  429.462 M/sec                    ( +-  0.08% )
             20669      branch-misses             #    0.25% of all branches          ( +-  0.10% )

      24.047641626 seconds time elapsed                                          ( +-  0.08% )
Each iteration performs 5120000 push and pop operations, so each
push and pop operation costs roughly 4697ns (24.047641626s / 5120000).
After adding a 50ns delay in __page_frag_alloc_va_align():
@@ -300,6 +297,8 @@ void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
 {
 	unsigned int remaining = nc->remaining & align_mask;
+	ndelay(50);
+
 	if (unlikely(fragsz > remaining)) {
We have:
 Performance counter stats for 'insmod ./page_frag_test.ko test_push_cpu=16 test_pop_cpu=17' (500 runs):

         18.012657      task-clock (msec)         #    0.001 CPUs utilized            ( +-  0.01% )
                 7      context-switches          #    0.395 K/sec                    ( +-  0.20% )
                 1      cpu-migrations            #    0.052 K/sec                    ( +-  1.35% )
                84      page-faults               #    0.005 M/sec                    ( +-  0.06% )
          46765406      cycles                    #    2.596 GHz                      ( +-  0.01% )
          35253336      instructions              #    0.75  insn per cycle           ( +-  0.00% )
           8277063      branches                  #  459.514 M/sec                    ( +-  0.00% )
             20558      branch-misses             #    0.25% of all branches          ( +-  0.07% )

      24.313647557 seconds time elapsed                                          ( +-  0.07% )
(24.313647557 - 24.047641626) * 1000000000 / 5120000 is about 52ns,
which is close to the 50ns delay that was added, so the test does
seem to be measuring page_frag.
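
The per-operation figures can be rechecked with a few lines of Python. This is only a sketch of the arithmetic, not part of the test module: the constant and function names are made up for illustration, and it assumes that "each iteration" means one insmod run and that the elapsed time perf reports is the mean over the 500 runs.

```python
# Figures copied from the two perf stat outputs above.
OPS_PER_RUN = 5_120_000      # push/pop operations per insmod run (assumed: one run == one iteration)
BASELINE_S = 24.047641626    # mean elapsed seconds without the delay
DELAYED_S = 24.313647557     # mean elapsed seconds with ndelay(50) added


def ns_per_op(elapsed_s: float, ops: int = OPS_PER_RUN) -> float:
    """Average wall-clock cost of one push/pop operation, in nanoseconds."""
    return elapsed_s * 1e9 / ops


baseline_ns = ns_per_op(BASELINE_S)
delayed_ns = ns_per_op(DELAYED_S)
added_ns = delayed_ns - baseline_ns   # extra cost attributable to the delay

print(f"baseline: {baseline_ns:.1f} ns/op")
print(f"delayed:  {delayed_ns:.1f} ns/op")
print(f"added:    {added_ns:.1f} ns/op")
```

The added cost per operation comes out at roughly 52ns, against an injected delay of 50ns.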
>
>> For the scheduler part, this test provides the below module params
>> to avoid the noise from the scheduler.
>>
>> +static int test_push_cpu;
>> +module_param(test_push_cpu, int, 0600);
>> +MODULE_PARM_DESC(test_push_cpu, "test cpu for pushing fragment");
>> +
>> +static int test_pop_cpu;
>> +module_param(test_pop_cpu, int, 0600);
>> +MODULE_PARM_DESC(test_pop_cpu, "test cpu for popping fragment");
>>
>> Or is there any better idea for testing page_frag?
>
> .
>