Message-ID: <1c342d98-11d5-444c-825a-6af716d1dce8@huawei.com>
Date: Wed, 28 Aug 2024 20:11:39 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Alexander Duyck <alexander.duyck@...il.com>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Andrew Morton
	<akpm@...ux-foundation.org>, <linux-mm@...ck.org>
Subject: Re: [PATCH net-next v15 06/13] mm: page_frag: reuse existing space
 for 'size' and 'pfmemalloc'

On 2024/8/28 2:16, Alexander Duyck wrote:
> On Tue, Aug 27, 2024 at 5:06 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>
>> On 2024/8/27 0:46, Alexander Duyck wrote:
>>> On Mon, Aug 26, 2024 at 5:46 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>>>
>>>> Currently there is one 'struct page_frag' for every 'struct
>>>> sock' and 'struct task_struct', and we are about to replace the
>>>> 'struct page_frag' with 'struct page_frag_cache' for them.
>>>> Before starting the replacement, we need to ensure the size of
>>>> 'struct page_frag_cache' is not bigger than the size of
>>>> 'struct page_frag', as there may be tens of thousands of
>>>> 'struct sock' and 'struct task_struct' instances in the
>>>> system.
>>>>
>>>> By or'ing the page order & pfmemalloc with lower bits of
>>>> 'va' instead of using 'u16' or 'u32' for page size and 'u8'
>>>> for pfmemalloc, we are able to avoid wasting 3 or 5 bytes of space.
>>>> As the page address, pfmemalloc and order are unchanged for the
>>>> same page in the same 'page_frag_cache' instance, it makes
>>>> sense to fit them together.
>>>>
>>>> After this patch, the size of 'struct page_frag_cache' should be
>>>> the same as the size of 'struct page_frag'.
>>>>
>>>> CC: Alexander Duyck <alexander.duyck@...il.com>
>>>> Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
>>>> ---
>>>>  include/linux/mm_types_task.h   | 19 ++++++-----
>>>>  include/linux/page_frag_cache.h | 60 +++++++++++++++++++++++++++++++--
>>>>  mm/page_frag_cache.c            | 51 +++++++++++++++-------------
>>>>  3 files changed, 97 insertions(+), 33 deletions(-)
>>>>
> 
> ...
> 
>>>>  void page_frag_cache_drain(struct page_frag_cache *nc);
>>>
>>> So how many of these additions are actually needed outside of the
>>> page_frag_cache.c file? I'm just wondering if we could look at moving
>>
>> At least page_frag_cache_is_pfmemalloc(), page_frag_encoded_page_order(),
>> page_frag_encoded_page_ptr(), page_frag_encoded_page_address() are needed
>> outside of the page_frag_cache.c file for now; they are used mostly in
>> __page_frag_cache_commit() and __page_frag_alloc_refill_probe_align() for
>> debugging and performance reasons, see patch 7 & 10.
> 
> As far as the __page_frag_cache_commit I might say that could be moved
> to page_frag_cache.c, but admittedly I don't know how much that would
> impact the performance.

The performance impact seems too large to justify moving it to
page_frag_cache.c: in the test below, the elapsed time goes from about
9.84 seconds to about 12.32 seconds.

Before the move:
 Performance counter stats for 'insmod page_frag_test.ko test_push_cpu=16 test_pop_cpu=17 test_alloc_len=256 nr_test=512000000 test_align=0 test_prepare=0' (20 runs):

         17.749582      task-clock (msec)         #    0.002 CPUs utilized            ( +-  0.15% )
                 5      context-switches          #    0.304 K/sec                    ( +-  2.48% )
                 0      cpu-migrations            #    0.017 K/sec                    ( +- 35.04% )
                76      page-faults               #    0.004 M/sec                    ( +-  0.45% )
          46103462      cycles                    #    2.597 GHz                      ( +-  0.14% )
          60692196      instructions              #    1.32  insn per cycle           ( +-  0.12% )
          14734050      branches                  #  830.107 M/sec                    ( +-  0.12% )
             19792      branch-misses             #    0.13% of all branches          ( +-  0.75% )

       9.837758611 seconds time elapsed                                          ( +-  0.38% )


After the move:

 Performance counter stats for 'insmod page_frag_test.ko test_push_cpu=16 test_pop_cpu=17 test_alloc_len=256 nr_test=512000000 test_align=0 test_prepare=0' (20 runs):

         19.682296      task-clock (msec)         #    0.002 CPUs utilized            ( +-  4.08% )
                 6      context-switches          #    0.305 K/sec                    ( +-  3.42% )
                 0      cpu-migrations            #    0.000 K/sec
                76      page-faults               #    0.004 M/sec                    ( +-  0.44% )
          51128091      cycles                    #    2.598 GHz                      ( +-  4.08% )
          58833583      instructions              #    1.15  insn per cycle           ( +-  4.50% )
          14260855      branches                  #  724.552 M/sec                    ( +-  4.63% )
             20120      branch-misses             #    0.14% of all branches          ( +-  0.92% )

      12.318770150 seconds time elapsed                                          ( +-  0.15% )

> 
>> The only one left is page_frag_encode_page(), I am not sure if it makes
>> much sense to move it to page_frag_cache.c while the rest of them are in
>> the .h file.
> 
> I would move it. There is no point in exposing internals more than
> necessary. Also since you are carrying a BUILD_BUG_ON it would make
> sense to keep that internal to your implementation.
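
For readers following the thread, the encoding described in the quoted
commit message (or'ing the page order and the pfmemalloc flag into the
low bits of a page-aligned address) can be sketched in user-space C as
below. This is only an illustration under assumptions: the sketch_*
names, the 4K page size and the exact bit positions are hypothetical and
are not taken from the patch, whose helpers and masks may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit layout, chosen for illustration only. */
#define SKETCH_PAGE_SHIFT     12
#define SKETCH_PAGE_SIZE      (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_ORDER_MASK     0xffUL          /* bits 0-7: page order */
#define SKETCH_PFMEMALLOC_BIT (1UL << 8)      /* bit 8: pfmemalloc flag */
#define SKETCH_ADDR_MASK      (~(SKETCH_PAGE_SIZE - 1))

/*
 * Compile-time check that the metadata fits in the bits a page-aligned
 * address leaves clear (similar in spirit to a BUILD_BUG_ON).
 */
_Static_assert((SKETCH_ORDER_MASK | SKETCH_PFMEMALLOC_BIT) < SKETCH_PAGE_SIZE,
	       "metadata must fit below SKETCH_PAGE_SHIFT");

/* Pack the page address, order and pfmemalloc flag into one word. */
static inline unsigned long sketch_encode_page(unsigned long va,
					       unsigned int order,
					       bool pfmemalloc)
{
	/* The low bits must be free, i.e. 'va' must be page aligned. */
	assert((va & ~SKETCH_ADDR_MASK) == 0);
	assert(order <= SKETCH_ORDER_MASK);

	return va | order | (pfmemalloc ? SKETCH_PFMEMALLOC_BIT : 0UL);
}

/* Extract the page order from the encoded word. */
static inline unsigned int sketch_encoded_order(unsigned long enc)
{
	return enc & SKETCH_ORDER_MASK;
}

/* Extract the pfmemalloc flag from the encoded word. */
static inline bool sketch_encoded_pfmemalloc(unsigned long enc)
{
	return (enc & SKETCH_PFMEMALLOC_BIT) != 0;
}

/* Recover the page-aligned address by masking off the metadata bits. */
static inline unsigned long sketch_encoded_address(unsigned long enc)
{
	return enc & SKETCH_ADDR_MASK;
}
```

In this sketch, encoding the address 0x12340000 with order 3 and
pfmemalloc set gives a single word from which all three values can be
read back, so no separate 'size' or 'pfmemalloc' field is needed.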
