Message-ID: <20241104-nimble-scallop-of-justice-4ab82f@leitao>
Date: Mon, 4 Nov 2024 12:40:00 -0800
From: Breno Leitao <leitao@...ian.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: horms@...nel.org, davem@...emloft.net, edumazet@...gle.com,
pabeni@...hat.com, thepacketgeek@...il.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, davej@...emonkey.org.uk,
vlad.wing@...il.com, max@...sevol.com, kernel-team@...a.com,
jiri@...nulli.us, jv@...sburgh.net, andy@...yhouse.net,
aehkn@...hub.one, Rik van Riel <riel@...riel.com>,
Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH net-next 1/3] net: netpoll: Defer skb_pool population
until setup success

On Fri, Nov 01, 2024 at 07:01:01PM -0700, Jakub Kicinski wrote:
> On Fri, 1 Nov 2024 11:18:29 -0700 Breno Leitao wrote:
> > > I think that a better mechanism might be something like:
> > >
> > > * If find_skb() needs to consume from the pool (which is rare, only
> > > when alloc_skb() fails), raise a workthread that tries to repopulate
> > > the pool in the background.
> > >
> > > * Eventually skip alloc_skb() in the hot path and take directly from
> > > the pool first; if the pool is depleted, fall back to
> > > alloc_skb(GFP_ATOMIC). This might make the code faster, but I don't
> > > have data yet.
> >
> > I've hacked this case (getting the skb from the pool first and
> > refilling it on a workqueue) today, and the performance improvement is
> > significant.
> >
> > I've tested sending 2k messages, and measured the time it takes to
> > run `netpoll_send_udp`, which is the critical function in netpoll.
>
> The purpose of the pool is to have a reserve in case of OOM, AFAIU.
> We may speed things up by taking the allocations out of line but
> we risk the pool being empty when we really need it.

Correct, but in the case of OOM, I am not sure this changes the odds at
all.

Let's assume the pool is full and we start getting OOMs. It doesn't
matter whether alloc_skb() fails in the critical path or in the work
thread: netpoll will have MAX_SKBS skbs buffered to use, and no new
ones can be allocated, so either way only 32 SKBs are available before
-ENOMEM is returned.

On the other hand, suppose there is a burst of OOM pressure for a while
(say 10 SKBs are consumed) and then some free memory shows up, allowing
the pool to be replenished. It is better to do that replenishment in the
workthread rather than in the hot path.

In both cases, the chance of not having an SKB available to send the
packet seems to be the same, unless I am not modeling the problem
correctly.
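
Roughly, what I hacked looks like the sketch below. This is only to
illustrate the model: the per-target pool (np->skb_pool) matches this
series, but np->refill_work and the exact refill loop are hypothetical,
not the final patch:

	/* Refill runs in process context via the workqueue, so a failed
	 * allocation here only delays the refill; it never stalls the
	 * send path.
	 */
	static void refill_skb_pool(struct work_struct *work)
	{
		struct netpoll *np = container_of(work, struct netpoll,
						  refill_work);

		while (skb_queue_len(&np->skb_pool) < MAX_SKBS) {
			struct sk_buff *skb = alloc_skb(MAX_SKB_SIZE,
							GFP_KERNEL);
			if (!skb)
				break;
			skb_queue_tail(&np->skb_pool, skb);
		}
	}

	static struct sk_buff *find_skb(struct netpoll *np, int len,
					int reserve)
	{
		/* Hot path: consume from the pool first, and only hit
		 * the allocator (GFP_ATOMIC) when the pool is empty.
		 */
		struct sk_buff *skb = skb_dequeue(&np->skb_pool);

		if (skb)
			schedule_work(&np->refill_work);
		else
			skb = alloc_skb(len, GFP_ATOMIC);

		if (!skb)
			return NULL;

		refcount_set(&skb->users, 1);
		skb_reserve(skb, reserve);
		return skb;
	}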

On top of that, there are a few other points where this new model could
help more in an OOM case:
1) Now with Maksym's patches, we can monitor the OOM rate.
2) With the pool per target, we can easily increase the pool size if we
   want (patchset not pushed yet).
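
For (2), the per-target pool makes the depth a natural per-target knob.
Purely illustrative, since that patchset is not out yet, and pool_size
is a hypothetical field:

	/* Hypothetical: bound the refill by a per-target depth instead
	 * of the global MAX_SKBS constant, so a busy target can keep a
	 * deeper reserve.
	 */
	while (skb_queue_len(&np->skb_pool) < np->pool_size) {
		struct sk_buff *skb = alloc_skb(MAX_SKB_SIZE, GFP_KERNEL);

		if (!skb)
			break;
		skb_queue_tail(&np->skb_pool, skb);
	}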

This will also fix another corner case we have in netconsole. While
printk() holds the console/target_list locks, the code running under
them cannot printk() again, otherwise it will deadlock the system: the
nested printk() re-enters netconsole and tries to take
console_lock/target_list_lock, which are already held. Moving
alloc_skb() out of that hot path reduces the probability of this
happening. I am planning to fix the remaining window later by simply
dropping the nested message; with this patch in place, fewer packets
should end up being dropped.
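
For reference, the "drop the nested message" idea could look something
like the sketch below: write_msg() is netconsole's console callback, but
the per-CPU guard is hypothetical:

	static DEFINE_PER_CPU(int, netconsole_busy);

	static void write_msg(struct console *con, const char *msg,
			      unsigned int len)
	{
		/* A printk() issued while target_list_lock is held
		 * re-enters here on the same CPU; drop the nested
		 * message instead of deadlocking on the lock.
		 */
		if (this_cpu_read(netconsole_busy))
			return;

		this_cpu_write(netconsole_busy, 1);
		/* ...existing write_msg() body: take target_list_lock,
		 * walk the targets, call netpoll_send_udp(), etc...
		 */
		this_cpu_write(netconsole_busy, 0);
	}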