netdev - Re: [PATCH net-next 1/3] net: netpoll: Defer skb

Message-ID: <20241105170029.719344e7@kernel.org>
Date: Tue, 5 Nov 2024 17:00:29 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Breno Leitao <leitao@...ian.org>
Cc: horms@...nel.org, davem@...emloft.net, edumazet@...gle.com,
 pabeni@...hat.com, thepacketgeek@...il.com, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, davej@...emonkey.org.uk, vlad.wing@...il.com,
 max@...sevol.com, kernel-team@...a.com, jiri@...nulli.us, jv@...sburgh.net,
 andy@...yhouse.net, aehkn@...hub.one, Rik van Riel <riel@...riel.com>, Al
 Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH net-next 1/3] net: netpoll: Defer skb_pool population
 until setup success

On Mon, 4 Nov 2024 12:40:00 -0800 Breno Leitao wrote:
> Let's assume the pool is full and we start getting OOMs. It doesn't
> matter whether alloc_skb() fails in the critical path or in the work
> thread: netpoll will have MAX_SKBS skbs buffered to use, and none will
> be allocated, so just 32 SKBs will be used until -ENOMEM is returned.

Do you assume the worker thread will basically keep up with the output?
Vadim was showing me a system earlier today where workqueue workers
didn't get scheduled in for minutes :( That's a bit extreme but doesn't
inspire confidence in worker replenishing the pool quickly.

> On the other hand, let's suppose there is a bunch of OOM pressure for a
> while (10 SKBs are consumed, for instance), and then some free memory
> shows up, causing the pool to be replenished. It is better
> to do it in the work thread rather than in the hot path.

We could cap how much we replenish in one go?
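
To make the capping idea concrete, here is a rough userspace sketch, not the netpoll code. Only MAX_SKBS and its value of 32 come from the thread; REFILL_BATCH, struct toy_pool, toy_refill_capped() and the allocator hook are hypothetical names made up for illustration. Each refill pass adds at most a fixed batch and stops early when the pool is full or an allocation fails.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SKBS     32  /* pool capacity, as mentioned in the thread */
#define REFILL_BATCH 8   /* hypothetical cap on buffers added per pass */

/* Toy stand-in for the skb pool: only the buffer count matters here. */
struct toy_pool {
    size_t count;
};

/* Hypothetical allocator hook; returns false to simulate -ENOMEM. */
typedef bool (*toy_alloc_fn)(void *ctx);

/*
 * Add at most REFILL_BATCH buffers, stopping early when the pool is
 * full or the allocator fails. Returns the number of buffers added.
 */
static size_t toy_refill_capped(struct toy_pool *pool, toy_alloc_fn alloc,
                                void *ctx)
{
    size_t added = 0;

    while (pool->count < MAX_SKBS && added < REFILL_BATCH) {
        if (!alloc(ctx))
            break;
        pool->count++;
        added++;
    }
    return added;
}

/* Example allocator: succeeds until the budget in *ctx runs out. */
static bool toy_alloc_budget(void *ctx)
{
    size_t *budget = ctx;

    if (*budget == 0)
        return false;
    (*budget)--;
    return true;
}
```

With a cap like this, a pool that is 10 buffers short needs two passes to fill, so the time spent in any single pass stays bounded wherever the refill runs.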

> In both cases, the chance of not having SKBs to send the packet seems to
> be the same, unless I am not modeling the problem correctly.

Maybe I misunderstood the proposal, I think you said earlier that you
want to consume from the pool instead of calling alloc(). If you mean
that we'd still alloc in the fast path but not replenish the pool
that's different.
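
The two readings can be contrasted with a small userspace sketch. This is one interpretation of the distinction, not the actual netpoll code path, and every name in it (struct toy_reserve, toy_get_alloc_first(), toy_get_pool_only()) is hypothetical. The alloc_ok flag stands in for whether a fresh allocation would succeed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy reserve pool: only the number of buffered entries is modelled. */
struct toy_reserve {
    size_t count;
};

/* Where a buffer came from, or TOY_SRC_NONE if none was available. */
enum toy_src { TOY_SRC_NONE, TOY_SRC_ALLOC, TOY_SRC_POOL };

/* Reading 1: try a fresh allocation first, fall back to the pool. */
static enum toy_src toy_get_alloc_first(struct toy_reserve *r, bool alloc_ok)
{
    if (alloc_ok)
        return TOY_SRC_ALLOC;   /* pool is left untouched */
    if (r->count > 0) {
        r->count--;
        return TOY_SRC_POOL;
    }
    return TOY_SRC_NONE;
}

/* Reading 2: always consume from the pool, never allocate inline. */
static enum toy_src toy_get_pool_only(struct toy_reserve *r)
{
    if (r->count > 0) {
        r->count--;
        return TOY_SRC_POOL;
    }
    return TOY_SRC_NONE;
}
```

In the second variant every packet drains the pool, so output depends on how fast something else refills it; in the first, the pool is only touched when allocation fails.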

> On top of that, there are a few other points where this new model
> could help more in an OOM case.
> 
> 1) Now, with Maksym's patches, we can monitor the OOMing rate
> 
> 2) With the pool per target, we can easily increase the pool size if we
> want. (patchset not pushed yet)