Message-ID: <aKg1Qgtw-QyE8bLx@bzorp3>
Date: Fri, 22 Aug 2025 11:15:46 +0200
From: Balazs Scheidler <bazsi77@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: netdev@...r.kernel.org, pabeni@...hat.com
Subject: Re: [RFC, RESEND] UDP receive path batching improvement

On Fri, Aug 22, 2025 at 01:18:36AM -0700, Eric Dumazet wrote:
> On Fri, Aug 22, 2025 at 1:15 AM Balazs Scheidler <bazsi77@...il.com> wrote:
> > The condition above uses "sk->sk_rcvbuf >> 2" as the threshold that
> > triggers the update of the counter.
> >
> > In our case (syslog receive path via UDP), socket buffers are generally
> > tuned up (on the order of 32MB or even more, I have seen 256MB as well),
> > as the senders can generate spikes in their traffic and a lot of senders
> > send to the same port.  Due to latencies, these buffers sometimes
> > accumulate MBs of data before the user-space process even has a chance
> > to consume them.
> >
> 
> 
> This seems like very high usage for a single UDP socket.
> 
> Have you tried SO_REUSEPORT to spread incoming packets to more sockets
> (and possibly more threads) ?

Yes.  I use SO_REUSEPORT (16 sockets), and I even use an eBPF program to
distribute the load evenly across the sockets, instead of the default load
balancing built into SO_REUSEPORT.
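
For reference, the receive-side setup is roughly the following (simplified
sketch, error handling and the BPF loader omitted; it assumes kernel headers
recent enough to define SO_ATTACH_REUSEPORT_EBPF):

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

#define NUM_SOCKETS 16

/* Simplified sketch: open a SO_REUSEPORT group of NUM_SOCKETS UDP sockets
 * bound to the same port and attach an already-loaded eBPF program
 * (prog_fd) that picks the target socket for each packet. */
static int open_reuseport_group(int port, int prog_fd, int fds[NUM_SOCKETS])
{
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port = htons(port),
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};
	int one = 1;

	for (int i = 0; i < NUM_SOCKETS; i++) {
		fds[i] = socket(AF_INET, SOCK_DGRAM, 0);
		setsockopt(fds[i], SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
		bind(fds[i], (struct sockaddr *)&addr, sizeof(addr));
	}

	/* the program is attached once and applies to the whole group */
	setsockopt(fds[0], SOL_SOCKET, SO_ATTACH_REUSEPORT_EBPF,
		   &prog_fd, sizeof(prog_fd));
	return 0;
}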

Sometimes the processing on the user-space side is heavy enough (think of
parsing, heuristics, data normalization), and the load on the box is high
enough, that I still see drops from time to time.

If a client sends 100k messages in a tight loop for a while, that's going to
use a lot of buffer space.  What bothers me more is that losing a single
packet could be acceptable, but once we drop one packet we keep dropping all
of them, at least until we have consumed 25% of SO_RCVBUF worth of data (or
the receive buffer is completely emptied).  This behaviour, combined with
small packets (think of 100-150 byte payloads), can easily cause excessive
drops.  25% of a large socket buffer is a huge threshold.
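
For context, my reading of the release path is roughly this shape
(paraphrased sketch, not the literal code in net/ipv4/udp.c):

#include <linux/udp.h>
#include <net/sock.h>

/* Paraphrased sketch of the batched rmem release: bytes consumed by the
 * reader are accumulated in a deficit and only subtracted from
 * sk_rmem_alloc once the deficit crosses a quarter of the receive buffer. */
static void rmem_release_sketch(struct sock *sk, int size, int partial)
{
	struct udp_sock *up = udp_sk(sk);

	if (partial) {
		up->forward_deficit += size;
		size = up->forward_deficit;
		/* until sk_rcvbuf >> 2 bytes have been consumed, nothing is
		 * given back, so incoming packets keep being checked against
		 * a stale, inflated sk_rmem_alloc and keep being dropped */
		if (size < (sk->sk_rcvbuf >> 2))
			return;
	} else {
		size += up->forward_deficit;
	}
	up->forward_deficit = 0;

	atomic_sub(size, &sk->sk_rmem_alloc);
	sk_mem_uncharge(sk, size);
}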

I am not sure how many packets should warrant a sk_rmem_alloc update, but I'd
assume that one update every 100 packets would still be OK.
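
As a strawman, continuing the sketch above, the trigger could also count
packets; note that "deficit_packets" is an invented field, it does not exist
in struct udp_sock today:

/* Strawman only: also release when a fixed number of packets has been
 * consumed, not just when a quarter of sk_rcvbuf has.  "deficit_packets"
 * would be a hypothetical per-socket counter, bumped once per consumed
 * packet and reset whenever the deficit is folded back. */
#define RELEASE_EVERY_N_PACKETS	100

static bool should_release(const struct sock *sk, const struct udp_sock *up)
{
	return up->forward_deficit >= (sk->sk_rcvbuf >> 2) ||
	       up->deficit_packets >= RELEASE_EVERY_N_PACKETS;
}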

-- 
Bazsi
Happy Logging!
