Message-ID: <1c342d98-11d5-444c-825a-6af716d1dce8@huawei.com>
Date: Wed, 28 Aug 2024 20:11:39 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Alexander Duyck <alexander.duyck@...il.com>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Andrew Morton
<akpm@...ux-foundation.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH net-next v15 06/13] mm: page_frag: reuse existing space
for 'size' and 'pfmemalloc'
On 2024/8/28 2:16, Alexander Duyck wrote:
> On Tue, Aug 27, 2024 at 5:06 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>
>> On 2024/8/27 0:46, Alexander Duyck wrote:
>>> On Mon, Aug 26, 2024 at 5:46 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>>>
>>>> Currently there is one 'struct page_frag' for every 'struct
>>>> sock' and 'struct task_struct', and we are about to replace
>>>> 'struct page_frag' with 'struct page_frag_cache' for them.
>>>> Before beginning the replacement, we need to ensure that the
>>>> size of 'struct page_frag_cache' is not bigger than the size
>>>> of 'struct page_frag', as there may be tens of thousands of
>>>> 'struct sock' and 'struct task_struct' instances in the
>>>> system.
>>>>
>>>> By or'ing the page order & pfmemalloc with the lower bits of
>>>> 'va' instead of using 'u16' or 'u32' for page size and 'u8'
>>>> for pfmemalloc, we are able to avoid wasting 3 or 5 bytes.
>>>> And since the page address & pfmemalloc & order are unchanged
>>>> for the same page in the same 'page_frag_cache' instance, it
>>>> makes sense to fit them together.
>>>>
>>>> After this patch, the size of 'struct page_frag_cache' should be
>>>> the same as the size of 'struct page_frag'.
>>>>
>>>> CC: Alexander Duyck <alexander.duyck@...il.com>
>>>> Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
>>>> ---
>>>> include/linux/mm_types_task.h | 19 ++++++-----
>>>> include/linux/page_frag_cache.h | 60 +++++++++++++++++++++++++++++++--
>>>> mm/page_frag_cache.c | 51 +++++++++++++++-------------
>>>> 3 files changed, 97 insertions(+), 33 deletions(-)
>>>>
>
> ...
>
>>>> void page_frag_cache_drain(struct page_frag_cache *nc);
>>>
>>> So how many of these additions are actually needed outside of the
>>> page_frag_cache.c file? I'm just wondering if we could look at moving
>>
>> At least page_frag_cache_is_pfmemalloc(), page_frag_encoded_page_order(),
>> page_frag_encoded_page_ptr(), page_frag_encoded_page_address() are needed
>> outside of the page_frag_cache.c file for now, as they are used mostly in
>> __page_frag_cache_commit() and __page_frag_alloc_refill_probe_align() for
>> debugging and performance reasons, see patch 7 & 10.
>
> As far as the __page_frag_cache_commit I might say that could be moved
> to page_frag_cache.c, but admittedly I don't know how much that would
> impact the performance.
The performance impact seems large enough that moving it to
page_frag_cache.c does not seem justified.

Before the move:
 Performance counter stats for 'insmod page_frag_test.ko test_push_cpu=16 test_pop_cpu=17 test_alloc_len=256 nr_test=512000000 test_align=0 test_prepare=0' (20 runs):

         17.749582      task-clock (msec)         #    0.002 CPUs utilized            ( +-  0.15% )
                 5      context-switches          #    0.304 K/sec                    ( +-  2.48% )
                 0      cpu-migrations            #    0.017 K/sec                    ( +- 35.04% )
                76      page-faults               #    0.004 M/sec                    ( +-  0.45% )
          46103462      cycles                    #    2.597 GHz                      ( +-  0.14% )
          60692196      instructions              #    1.32  insn per cycle           ( +-  0.12% )
          14734050      branches                  #  830.107 M/sec                    ( +-  0.12% )
             19792      branch-misses             #    0.13% of all branches          ( +-  0.75% )

       9.837758611 seconds time elapsed                                          ( +-  0.38% )
After the move:
 Performance counter stats for 'insmod page_frag_test.ko test_push_cpu=16 test_pop_cpu=17 test_alloc_len=256 nr_test=512000000 test_align=0 test_prepare=0' (20 runs):

         19.682296      task-clock (msec)         #    0.002 CPUs utilized            ( +-  4.08% )
                 6      context-switches          #    0.305 K/sec                    ( +-  3.42% )
                 0      cpu-migrations            #    0.000 K/sec
                76      page-faults               #    0.004 M/sec                    ( +-  0.44% )
          51128091      cycles                    #    2.598 GHz                      ( +-  4.08% )
          58833583      instructions              #    1.15  insn per cycle           ( +-  4.50% )
          14260855      branches                  #  724.552 M/sec                    ( +-  4.63% )
             20120      branch-misses             #    0.14% of all branches          ( +-  0.92% )

      12.318770150 seconds time elapsed                                          ( +-  0.15% )
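
To put the two runs above side by side, the relative change of each counter can be worked out directly. The helper below is only an illustrative sketch; the name pct_change() is hypothetical and is not part of the patch set or the test module.

```c
#include <assert.h>

/*
 * Relative change from 'before' to 'after', in percent.
 * Hypothetical helper for comparing two perf stat readings;
 * 'before' must be non-zero.
 */
static double pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Fed with the figures reported above, pct_change(46103462, 51128091) gives roughly +10.9% for cycles and pct_change(9.837758611, 12.318770150) gives roughly +25.2% for elapsed time. The cycle count of the second run carries a +- 4.08% spread, so the cycle figure is less firm than the elapsed-time one.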
>
>> The only one left is page_frag_encode_page(); I am not sure if it makes
>> much sense to move it to page_frag_cache.c while the rest of them are in
>> the .h file.
>
> I would move it. There is no point in exposing internals more than
> necessary. Also since you are carrying a BUILD_BUG_ON it would make
> sense to keep that internal to your implementation.
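
For readers following the thread, here is a minimal userspace sketch of the kind of encoding the quoted commit message describes: a page-aligned address has its low bits free, so the page order and the pfmemalloc flag can be or'ed into them. Everything named sketch_* or SKETCH_* is hypothetical, and the bit layout (order in bits 0..7, pfmemalloc in bit 8, 4 KiB pages) is an assumption made for illustration, not necessarily what the patch uses. The _Static_assert stands in for the BUILD_BUG_ON mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch only, not taken from the patch. */
#define SKETCH_PAGE_SHIFT     12u
#define SKETCH_PAGE_SIZE      ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_ORDER_MASK     ((uintptr_t)0xff)    /* bits 0..7: page order */
#define SKETCH_PFMEMALLOC_BIT ((uintptr_t)1 << 8)  /* bit 8: pfmemalloc */
#define SKETCH_FLAGS_MASK     (SKETCH_PAGE_SIZE - 1)

/* The flag bits must fit below the page alignment of the address. */
_Static_assert((SKETCH_ORDER_MASK | SKETCH_PFMEMALLOC_BIT) < SKETCH_PAGE_SIZE,
	       "flag bits overlap the address bits");

/* Pack a page-aligned address, the order and pfmemalloc into one word. */
static inline uintptr_t sketch_encode_page(uintptr_t va, unsigned int order,
					   bool pfmemalloc)
{
	assert((va & SKETCH_FLAGS_MASK) == 0);	/* va must be page aligned */
	assert(order <= SKETCH_ORDER_MASK);

	return va | order | (pfmemalloc ? SKETCH_PFMEMALLOC_BIT : 0);
}

/* Recover the page order from the low bits. */
static inline unsigned int sketch_encoded_order(uintptr_t enc)
{
	return (unsigned int)(enc & SKETCH_ORDER_MASK);
}

/* Recover the pfmemalloc flag. */
static inline bool sketch_encoded_pfmemalloc(uintptr_t enc)
{
	return (enc & SKETCH_PFMEMALLOC_BIT) != 0;
}

/* Recover the page-aligned address by clearing the low bits. */
static inline uintptr_t sketch_encoded_address(uintptr_t enc)
{
	return enc & ~SKETCH_FLAGS_MASK;
}
```

With this layout, encoding address 0x12345000 with order 3 and pfmemalloc set yields 0x12345103, and the three accessors return the original values. Keeping the compile-time check next to the encoder is one way to read the suggestion that the BUILD_BUG_ON should stay internal to the implementation.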