Message-ID: <3fbe9533-72e9-4667-9cf4-57dd2acf375c@redhat.com>
Date: Mon, 22 Sep 2025 10:37:49 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Eric Dumazet <edumazet@...gle.com>, "David S . Miller"
<davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>
Cc: Simon Horman <horms@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
Kuniyuki Iwashima <kuniyu@...gle.com>, netdev@...r.kernel.org,
eric.dumazet@...il.com
Subject: Re: [PATCH v3 net-next] udp: remove busylock and add per NUMA queues
Hi,
On 9/21/25 11:58 AM, Eric Dumazet wrote:
> @@ -1718,14 +1699,23 @@ static int udp_rmem_schedule(struct sock *sk, int size)
> int __udp_enqueue_schedule_skb(struct sock *sk, struct sk_buff *skb)
> {
> struct sk_buff_head *list = &sk->sk_receive_queue;
> + struct udp_prod_queue *udp_prod_queue;
> + struct sk_buff *next, *to_drop = NULL;
> + struct llist_node *ll_list;
> unsigned int rmem, rcvbuf;
> - spinlock_t *busy = NULL;
> int size, err = -ENOMEM;
> + int total_size = 0;
> + int q_size = 0;
> + int nb = 0;
>
> rmem = atomic_read(&sk->sk_rmem_alloc);
> rcvbuf = READ_ONCE(sk->sk_rcvbuf);
> size = skb->truesize;
>
> + udp_prod_queue = &udp_sk(sk)->udp_prod_queue[numa_node_id()];
> +
> + rmem += atomic_read(&udp_prod_queue->rmem_alloc);
> +
> /* Immediately drop when the receive queue is full.
> * Cast to unsigned int performs the boundary check for INT_MAX.
> */
Double-checking that I'm reading the code correctly... AFAICS the rcvbuf
size check is now only per NUMA node, which means that each node can now
add up to sk_rcvbuf bytes to the socket receive queue simultaneously. Am
I correct?
What if the user-space process never reads the packets (or is very
slow)? I'm under the impression that the maximum receive queue occupancy
would then be limited only by the memory accounting, and not by
sk_rcvbuf?
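Just to make the concern concrete, I would have expected a global bound
to also account for the other nodes' producer queues, along the lines of
the rough, untested sketch below (reusing the names from the patch;
iterating over all nodes in the fast path is obviously too expensive,
this is only to illustrate what the current check does not cover):

	unsigned int rmem;
	int node;

	rmem = atomic_read(&sk->sk_rmem_alloc);

	/* Count skbs still sitting in every per-NUMA producer queue,
	 * not only the local one, before comparing against sk_rcvbuf.
	 */
	for_each_node(node)
		rmem += atomic_read(&udp_sk(sk)->udp_prod_queue[node].rmem_alloc);

	if (rmem > rcvbuf)
		goto drop;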
Side note: I'm wondering if we could avoid the NUMA queue for connected
sockets? With early demux, and no nft/bridge in between, the path from
NIC to socket should be pretty fast, and the overhead of the additional
queuing could become visible?
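Purely as a sketch of what I have in mind (untested, and assuming the
connected case can tolerate taking the receive queue lock directly, as
the existing code does; memory accounting and error handling elided for
brevity):

	/* Hypothetical: connected UDP sockets sit in TCP_ESTABLISHED
	 * state, so they could skip the per-NUMA producer queue and
	 * enqueue straight into sk_receive_queue.
	 */
	if (sk->sk_state == TCP_ESTABLISHED) {
		atomic_add(size, &sk->sk_rmem_alloc);
		spin_lock(&list->lock);
		__skb_queue_tail(list, skb);
		spin_unlock(&list->lock);
		if (!sock_flag(sk, SOCK_DEAD))
			sk->sk_data_ready(sk);
		return 0;
	}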
Thanks,
Paolo