Message-ID: <3a5a28c4-01a3-793c-6969-475aba3ff3b5@redhat.com>
Date: Wed, 3 May 2023 17:49:34 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>, Jakub Kicinski <kuba@...nel.org>
Cc: brouer@...hat.com, Ilias Apalodimas <ilias.apalodimas@...aro.org>,
 netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
 linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>,
 lorenzo@...nel.org, linyunsheng@...wei.com, bpf@...r.kernel.org,
 "David S. Miller" <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>,
 Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org
Subject: Re: [PATCH RFC net-next/mm V3 1/2] page_pool: Remove workqueue in
 new shutdown scheme

On 03/05/2023 13.18, Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <kuba@...nel.org> writes:
>
>> On Fri, 28 Apr 2023 18:16:19 +0200 Jesper Dangaard Brouer wrote:
>>> This removes the workqueue scheme that periodically tests when
>>> inflight reach zero such that page_pool memory can be freed.
>>>
>>> This change adds code to fast-path free checking for a shutdown flags
>>> bit after returning PP pages.
>>
>> We can remove the warning without removing the entire delayed freeing
>> scheme. I definitely like the SHUTDOWN flag and patch 2 but I'm a bit
>> less clear on why the complexity of datapath freeing is justified.
>> Can you explain?
>
> You mean just let the workqueue keep rescheduling itself every minute
> for the (potentially) hours that skbs will stick around? Seems a bit
> wasteful, doesn't it? :)

I agree that this workqueue that keeps rescheduling is wasteful. It
actually reschedules every second, which is even more wasteful. NIC
drivers will have many HW RX-queues, with separate PP instances, each
of which can start a workqueue that reschedules every second.

Eric has convinced me that SKBs can "stick around" for longer than the
assumptions in PP. The old PP assumptions came from the XDP-return
path. It is time to clean this up.

>
> We did see an issue where creating and tearing down lots of page pools
> in a short period of time caused significant slowdowns due to the
> workqueue mechanism. Lots being "thousands per second". This is possible
> using the live packet mode of bpf_prog_run() for XDP, which will setup
> and destroy a page pool for each syscall...

Yes, the XDP live packet mode of bpf_prog_run() is IMHO abusing the
page_pool API. We should fix that somehow, at least the case where live
packet mode only injects a single packet but still creates a PP
instance. The PP in live packet mode IMHO only makes sense when
repeatedly sending packets that get recycled and are pre-inited by PP.

This use of PP does exemplify why it is problematic to keep the
workqueue.

I have considered (and could be convinced of) delaying the free via
call_rcu, but that also creates an unfortunate backlog of work in the
case of live packet mode of bpf_prog_run().

--Jesper
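
For readers following the thread, a rough sketch of the fast-path idea
under discussion: after a page is returned to the pool, check a shutdown
flag and free the pool once inflight drops to zero, instead of having a
workqueue poll for that condition. This is an illustrative sketch only,
not the RFC patch itself; the names below (PP_FLAG_SHUTDOWN, pp_flags,
pp_return_to_pool(), pp_inflight(), pp_free()) are placeholders.

	/*
	 * Sketch of a shutdown-flag check in the page return fast path.
	 * All identifiers are hypothetical placeholders; the real patch
	 * differs in names and details.
	 */
	static void page_pool_return_page_sketch(struct page_pool *pool,
						 struct page *page)
	{
		/* Normal fast-path return of the page (placeholder). */
		pp_return_to_pool(pool, page);

		/*
		 * Only after the driver has destroyed the pool and set the
		 * (hypothetical) shutdown flag does this branch matter: the
		 * last returned page frees the pool, so no periodic
		 * workqueue needs to keep rescheduling itself.
		 */
		if (unlikely(pool->pp_flags & PP_FLAG_SHUTDOWN) &&
		    pp_inflight(pool) == 0)
			pp_free(pool);	/* last page out frees the pool */
	}

The design point being argued is the trade-off this sketch makes visible:
the check adds a branch to every page return, but removes the need for a
per-pool workqueue that reschedules every second until inflight reaches
zero.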