linux-kernel - Re: [PATCH bpf-next] bpf/test_run: increase Page Pool's ptr

Message-ID: <87y1bmd4zg.fsf@toke.dk>
Date: Thu, 15 Feb 2024 00:02:27 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Alexander Lobakin <aleksander.lobakin@...el.com>, Alexei Starovoitov
 <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko
 <andrii@...nel.org>
Cc: Alexander Lobakin <aleksander.lobakin@...el.com>, Martin KaFai Lau
 <martin.lau@...ux.dev>, Jakub Kicinski <kuba@...nel.org>, Maciej
 Fijalkowski <maciej.fijalkowski@...el.com>, bpf@...r.kernel.org,
 netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH bpf-next] bpf/test_run: increase Page Pool's ptr_ring
 size in live frames mode

Toke Høiland-Jørgensen <toke@...hat.com> writes:

> Alexander Lobakin <aleksander.lobakin@...el.com> writes:
>
>> Currently, when running xdp-trafficgen, test_run creates page_pools with
>> the ptr_ring size of %NAPI_POLL_WEIGHT (64).
>> This might work fine if XDP Tx queues are polled with the budget
>> limitation. However, we often clear them with no limitation to ensure
>> maximum free space when sending.
>> For example, in ice and idpf (upcoming), we use "lazy" cleaning, i.e. we
>> clean XDP Tx queue only when the free space there is less than 1/4 of
>> the queue size. Let's take a ring size of 512 just as an example. 3/4
>> of the ring is 384 and oftentimes, when we enter the cleaning
>> function, this whole amount is ready to be cleaned (or 256 or 192, it
>> doesn't matter).
>> Then we call xdp_return_frame_bulk() and, after the 64th frame,
>> page_pool_put_page_bulk() starts returning pages to the page allocator
>> because the ptr_ring is already full. put_page(), alloc_page() et al.
>> start consuming a ton of CPU time and rise to the top of the perf top
>> output.
>>
>> Let's not limit ptr_ring to 64 for no real reason and allow more pages
>> to be recycled. Just don't put anything in page_pool_params::pool_size
>> and let the Page Pool core pick the default of 1024 entries (I don't
>> believe there are real use cases for cleaning more than that many
>> descriptors at once).
>> After the change, the MM layer disappears from the perf top output and
>> all pages get recycled to the PP. On my test setup on idpf with the
>> default ring size (512), this gives +80% of Tx performance with no
>> visible memory consumption increase.
>>
>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
>
> Hmm, so my original idea with keeping this low was to avoid having a lot
> of large rings lying around if it is used by multiple processes at once.
> But we need to move away from the per-syscall allocation anyway, and
> with Lorenzo's patches introducing a global system page pool we have an
> avenue for that. So in the meantime, I have no objection to this...
>
> Reviewed-by: Toke Høiland-Jørgensen <toke@...hat.com>

Actually, since Lorenzo's patches already landed in net-next, let's just
move to using those straight away. I'll send a patch for this tomorrow :)

-Toke
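
The arithmetic in the quoted patch description can be sketched as a small userspace model. This is only an illustration under stated assumptions, not kernel code: the names model_ring_capacity(), model_bulk_return() and struct bulk_result are hypothetical, and the only behaviour modelled is what the message describes (a requested size of 0 falls back to the 1024-entry default, and pages that do not fit into the ring go back to the page allocator).

```c
/* Hypothetical userspace model, not the kernel implementation. */

/* Default ring size named in the patch description. */
#define MODEL_DEFAULT_RING_SIZE 1024u

/* Outcome of returning a batch of frames in one bulk call. */
struct bulk_result {
	unsigned int recycled;	/* pages that fit into the ring */
	unsigned int freed;	/* pages handed back to the page allocator */
};

/* A requested size of 0 means "let the core pick the default". */
static unsigned int model_ring_capacity(unsigned int requested)
{
	return requested ? requested : MODEL_DEFAULT_RING_SIZE;
}

/*
 * Return nframes pages to a ring that already holds in_ring entries.
 * Whatever exceeds the free space is counted as freed.
 */
static struct bulk_result model_bulk_return(unsigned int requested,
					    unsigned int in_ring,
					    unsigned int nframes)
{
	unsigned int cap = model_ring_capacity(requested);
	unsigned int space = cap > in_ring ? cap - in_ring : 0;
	struct bulk_result res;

	res.recycled = nframes < space ? nframes : space;
	res.freed = nframes - res.recycled;
	return res;
}
```

With an empty ring and the 384-frame batch from the example (3/4 of a 512-entry Tx ring), model_bulk_return(64, 0, 384) recycles 64 pages and frees 320, while model_bulk_return(0, 0, 384) recycles all 384.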