
Message-ID: <CANn89iLNnYXH0z4BOc0UZjvbuZ5gWWHVTP1MrOHkVUq26szCKA@mail.gmail.com>
Date: Mon, 25 Aug 2025 23:46:18 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Simon Horman <horms@...nel.org>, netdev@...r.kernel.org, eric.dumazet@...il.com, 
	Willem de Bruijn <willemb@...gle.com>, Kuniyuki Iwashima <kuniyu@...gle.com>
Subject: Re: [PATCH net-next 3/3] net: add new sk->sk_drops1 field

On Mon, Aug 25, 2025 at 11:34 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> On 8/25/25 9:59 PM, Eric Dumazet wrote:
> > sk->sk_drops can be heavily contended when
> > changed from many cpus.
> >
> > Instead of using a too expensive per-cpu data structure,
> > add a second sk->sk_drops1 field and change
> > sk_drops_inc() to be NUMA aware.
> >
> > This patch adds 64 bytes per socket.
>
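For readers following along, here is a minimal userspace sketch of the NUMA-aware counter idea described in the quoted commit message. The struct, the function names, and the selection by node parity are assumptions made for illustration, not the patch itself. In the kernel the node would come from something like numa_node_id(); here it is simply passed in as a parameter.

```c
#include <assert.h>
#include <stdatomic.h>

#define CACHELINE_BYTES 64

/* Hypothetical stand-in for the two drop counters; NOT the kernel's
 * struct sock. Each counter sits on its own cache line so that CPUs
 * on different NUMA nodes do not bounce the same line. */
struct drop_counters {
	_Alignas(CACHELINE_BYTES) atomic_uint drops0;
	_Alignas(CACHELINE_BYTES) atomic_uint drops1;
};

/* Zero both counters. */
static void drops_reset(struct drop_counters *dc)
{
	atomic_store_explicit(&dc->drops0, 0, memory_order_relaxed);
	atomic_store_explicit(&dc->drops1, 0, memory_order_relaxed);
}

/* Increment the counter chosen by node parity (an assumed policy). */
static void drops_inc(struct drop_counters *dc, int numa_node)
{
	if (numa_node & 1)
		atomic_fetch_add_explicit(&dc->drops1, 1, memory_order_relaxed);
	else
		atomic_fetch_add_explicit(&dc->drops0, 1, memory_order_relaxed);
}

/* Readers pay for two loads and one addition. */
static unsigned int drops_read(struct drop_counters *dc)
{
	return atomic_load_explicit(&dc->drops0, memory_order_relaxed) +
	       atomic_load_explicit(&dc->drops1, memory_order_relaxed);
}
```

The trade-off in this sketch is that the write side stays a single atomic increment, while the extra cost (a second load) moves to the read side, which is presumably the rarer operation.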
> I'm wondering: since the main target for dealing with drops are UDP
> sockets, have you considered adding sk_drops1 to udp_sock, instead?

I actually saw the issue on RAW sockets: some applications were using them
inappropriately. This was not an attack on single UDP sockets, but
a self-inflicted issue on RAW sockets.

Author: Eric Dumazet <edumazet@...gle.com>
Date:   Thu Mar 7 16:29:43 2024 +0000

    ipv6: raw: check sk->sk_rcvbuf earlier

    There is no point cloning an skb and having to free the clone
    if the receive queue of the raw socket is full.

    Signed-off-by: Eric Dumazet <edumazet@...gle.com>
    Reviewed-by: Willem de Bruijn <willemb@...gle.com>
    Link: https://lore.kernel.org/r/20240307162943.2523817-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@...nel.org>
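The idea in that commit log (test the receive buffer limit before cloning, so a full socket never pays for a clone that is immediately freed) can be sketched in plain userspace C. Everything below (toy_sock, toy_buf, toy_deliver, the clone counter) is hypothetical and only mirrors the order of operations; it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types, not struct sock / struct sk_buff. */
struct toy_sock {
	int rmem_alloc;      /* bytes currently charged to the socket */
	int rcvbuf;          /* receive buffer limit in bytes */
	unsigned int drops;  /* packets dropped */
};

struct toy_buf {
	size_t len;
	unsigned char *data;
};

static int toy_clone_count; /* how many clones were actually made */

/* Deep-copy a buffer; returns NULL on allocation failure. */
static struct toy_buf *toy_clone(const struct toy_buf *b)
{
	struct toy_buf *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->data = malloc(b->len ? b->len : 1);
	if (!c->data) {
		free(c);
		return NULL;
	}
	memcpy(c->data, b->data, b->len);
	c->len = b->len;
	toy_clone_count++;
	return c;
}

/* Returns true if the packet was accepted, false if it was dropped. */
static bool toy_deliver(struct toy_sock *sk, const struct toy_buf *b)
{
	struct toy_buf *c;

	/* Early check: if the queue is already full, count the drop
	 * and skip the clone entirely. */
	if (sk->rmem_alloc >= sk->rcvbuf) {
		sk->drops++;
		return false;
	}
	c = toy_clone(b);
	if (!c) {
		sk->drops++;
		return false;
	}
	sk->rmem_alloc += (int)c->len;
	/* A real receive path would queue the clone; the toy frees it. */
	free(c->data);
	free(c);
	return true;
}
```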


>
> Plus an additional conditional/casting in sk_drops_{read,inc,reset}.
>
> That would save some memory and also offer the opportunity to use more
> memory to deal with NUMA hosts.
>
> (I had the crazy idea to keep sk_drops on a contended cacheline and use 2
> (or more) cacheline aligned fields for udp_sock only).

I am working on rmem_alloc batches on both the producer and consumer sides,
as a follow-up to a recent thread on netdev:

https://lore.kernel.org/netdev/aKh_yi0gASYajhev@bzorp3/T/#m392d5c87ab08d6ae005c23ffc8a3186cbac07cf2

Right now, when multiple cpus (running on different NUMA nodes) are
feeding packets to __udp_enqueue_schedule_skb(), we are touching
two cache lines. My plan is to reduce this to a single one.
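As a generic illustration of what batching updates to a shared counter can look like on the consumer side, here is a small userspace sketch. It is not the follow-up patch: the names (shared_acct, local_batch, consumer_release, consumer_flush) and the 4096-byte threshold are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>

#define BATCH_BYTES 4096 /* assumed threshold, picked for the example */

/* Shared accounting that every producer and the consumer touch. */
struct shared_acct {
	atomic_int rmem_alloc;
};

/* Consumer-private state; never touched by producers. */
struct local_batch {
	int pending; /* bytes released but not yet subtracted from shared */
};

/* Record a release locally; touch the shared counter only when the
 * accumulated amount reaches the batch threshold. */
static void consumer_release(struct shared_acct *sa, struct local_batch *lb,
			     int bytes)
{
	lb->pending += bytes;
	if (lb->pending >= BATCH_BYTES) {
		atomic_fetch_sub_explicit(&sa->rmem_alloc, lb->pending,
					  memory_order_relaxed);
		lb->pending = 0;
	}
}

/* Push any remainder to the shared counter, e.g. when the queue drains. */
static void consumer_flush(struct shared_acct *sa, struct local_batch *lb)
{
	if (lb->pending) {
		atomic_fetch_sub_explicit(&sa->rmem_alloc, lb->pending,
					  memory_order_relaxed);
		lb->pending = 0;
	}
}
```

The cost of this kind of batching is that the shared counter temporarily overstates memory in use by up to the batch size, so the threshold has to stay small relative to the receive buffer limit.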