Message-ID: <CANn89iLNnYXH0z4BOc0UZjvbuZ5gWWHVTP1MrOHkVUq26szCKA@mail.gmail.com>
Date: Mon, 25 Aug 2025 23:46:18 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>,
Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, eric.dumazet@...il.com,
Willem de Bruijn <willemb@...gle.com>, Kuniyuki Iwashima <kuniyu@...gle.com>
Subject: Re: [PATCH net-next 3/3] net: add new sk->sk_drops1 field
On Mon, Aug 25, 2025 at 11:34 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 8/25/25 9:59 PM, Eric Dumazet wrote:
> > sk->sk_drops can be heavily contended when
> > changed from many cpus.
> >
> > Instead using too expensive per-cpu data structure,
> > add a second sk->sk_drops1 field and change
> > sk_drops_inc() to be NUMA aware.
> >
> > This patch adds 64 bytes per socket.
>
> I'm wondering: since the main target for dealing with drops are UDP
> sockets, have you considered adding sk_drops1 to udp_sock, instead?
I actually saw the issue on RAW sockets; some applications were using them
inappropriately. This was not an attack on single UDP sockets, but
a self-inflicted issue on RAW sockets.
Author: Eric Dumazet <edumazet@...gle.com>
Date: Thu Mar 7 16:29:43 2024 +0000
ipv6: raw: check sk->sk_rcvbuf earlier
There is no point cloning an skb and having to free the clone
if the receive queue of the raw socket is full.
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Reviewed-by: Willem de Bruijn <willemb@...gle.com>
Link: https://lore.kernel.org/r/20240307162943.2523817-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@...nel.org>
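As a rough userspace illustration of the idea in that commit (this is not the kernel code; struct pkt, struct rxq, pkt_clone() and rxq_deliver() are hypothetical names, and the ">=" comparison against the limit is an assumption), checking the receive buffer before cloning could look like this:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet and receive-queue types, for illustration only. */
struct pkt {
	size_t len;
	unsigned char *data;
};

struct rxq {
	size_t rmem_alloc;    /* bytes currently queued */
	size_t rcvbuf;        /* receive buffer limit */
	unsigned long drops;  /* packets dropped because the queue was full */
	unsigned long clones; /* clones actually allocated */
};

/* Deep-copy a packet; returns NULL on allocation failure. */
static struct pkt *pkt_clone(const struct pkt *p)
{
	struct pkt *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->data = malloc(p->len ? p->len : 1);
	if (!c->data) {
		free(c);
		return NULL;
	}
	memcpy(c->data, p->data, p->len);
	c->len = p->len;
	return c;
}

/* Returns the queued clone, or NULL if the packet was dropped. */
static struct pkt *rxq_deliver(struct rxq *q, const struct pkt *p)
{
	struct pkt *c;

	/* Check the limit first, so a full queue costs no clone and no free. */
	if (q->rmem_alloc >= q->rcvbuf) {
		q->drops++;
		return NULL;
	}
	c = pkt_clone(p);
	if (!c)
		return NULL;
	q->clones++;
	q->rmem_alloc += c->len;
	return c;
}
```

With a full queue, rxq_deliver() only bumps the drop counter; no allocation happens on that path.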
>
> Plus an additional conditional/casting in sk_drops_{read,inc,reset}.
>
> That would save some memory also offer the opportunity to use more
> memory to deal with NUMA hosts.
>
> (I had the crazy idea to keep sk_drop on a contended cacheline and use 2
> (or more) cacheline aligned fields for udp_sock only).
I am working on rmem_alloc batches on both the producer and consumer sides,
as a follow-up to a recent thread on netdev:
https://lore.kernel.org/netdev/aKh_yi0gASYajhev@bzorp3/T/#m392d5c87ab08d6ae005c23ffc8a3186cbac07cf2
Right now, when multiple CPUs (running on different NUMA nodes) are
feeding packets to __udp_enqueue_schedule_skb(),
we are touching two cache lines; my plan is to reduce this to a single one.
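
For readers following the thread, the NUMA-aware drop counter described in the quoted patch text can be sketched in userspace C11. This is only an illustration under assumptions: struct drop_counters and the drops_*() helpers are hypothetical names, the 64-byte cache line size is assumed, and picking the counter by NUMA node parity is an assumption rather than something stated above.

```c
#include <assert.h>
#include <stdatomic.h>

#define CACHELINE_BYTES 64 /* assumed cache line size */

/* Hypothetical stand-in for the two counters; not the kernel's struct sock. */
struct drop_counters {
	_Alignas(CACHELINE_BYTES) atomic_uint drops0; /* role of sk_drops */
	_Alignas(CACHELINE_BYTES) atomic_uint drops1; /* role of sk_drops1 */
};

/* Each counter sits on its own cache line. */
static_assert(sizeof(struct drop_counters) == 2 * CACHELINE_BYTES,
	      "one cache line per counter");

/* Bump the counter chosen by the caller's NUMA node (assumed: node parity). */
static void drops_inc(struct drop_counters *c, int numa_node)
{
	atomic_uint *ctr = (numa_node & 1) ? &c->drops1 : &c->drops0;

	atomic_fetch_add_explicit(ctr, 1, memory_order_relaxed);
}

/* Readers pay the extra cost: they sum both halves. */
static unsigned int drops_read(struct drop_counters *c)
{
	return atomic_load_explicit(&c->drops0, memory_order_relaxed) +
	       atomic_load_explicit(&c->drops1, memory_order_relaxed);
}

/* Reset both halves to zero. */
static void drops_reset(struct drop_counters *c)
{
	atomic_store_explicit(&c->drops0, 0, memory_order_relaxed);
	atomic_store_explicit(&c->drops1, 0, memory_order_relaxed);
}
```

Writers on different nodes then mostly hit different cache lines, at the price of a second cache line per object and a two-load read.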