Message-ID: <686a847db2f31_3aa6542949f@willemb.c.googlers.com.notmuch>
Date: Sun, 06 Jul 2025 10:13:17 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Kuniyuki Iwashima <kuniyu@...gle.com>, 
 "David S. Miller" <davem@...emloft.net>, 
 Eric Dumazet <edumazet@...gle.com>, 
 Jakub Kicinski <kuba@...nel.org>, 
 Paolo Abeni <pabeni@...hat.com>
Cc: Simon Horman <horms@...nel.org>, 
 Kuniyuki Iwashima <kuniyu@...gle.com>, 
 Kuniyuki Iwashima <kuni1840@...il.com>, 
 netdev@...r.kernel.org
Subject: Re: [PATCH v1 net-next 4/7] af_unix: Use cached value for SOCK_STREAM
 in unix_inq_len().

Kuniyuki Iwashima wrote:
> Compared to TCP, ioctl(SIOCINQ) for AF_UNIX SOCK_STREAM socket is more
> expensive, as unix_inq_len() requires iterating through the receive queue
> and accumulating skb->len.
> 
> Let's cache the value for SOCK_STREAM to a new field during sendmsg()
> and recvmsg().
> 
> The field is protected by the receive queue lock.
> 
> Note that ioctl(SIOCINQ) for SOCK_DGRAM returns the length of the first
> skb in the queue.
> 
> SOCK_SEQPACKET still requires iterating through the queue because we do
> not touch functions shared with unix_dgram_ops.  But, if really needed,
> we can support it by switching __skb_try_recv_datagram() to a custom
> version.
> 
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...gle.com>
> ---
>  include/net/af_unix.h |  1 +
>  net/unix/af_unix.c    | 38 ++++++++++++++++++++++++++++----------
>  2 files changed, 29 insertions(+), 10 deletions(-)
> 
> diff --git a/include/net/af_unix.h b/include/net/af_unix.h
> index 1af1841b7601..603f8cd026e5 100644
> --- a/include/net/af_unix.h
> +++ b/include/net/af_unix.h
> @@ -47,6 +47,7 @@ struct unix_sock {
>  #define peer_wait		peer_wq.wait
>  	wait_queue_entry_t	peer_wake;
>  	struct scm_stat		scm_stat;
> +	int			inq_len;
>  #if IS_ENABLED(CONFIG_AF_UNIX_OOB)
>  	struct sk_buff		*oob_skb;
>  #endif
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index fa2081713dad..aade29d65570 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -2297,6 +2297,7 @@ static int queue_oob(struct sock *sk, struct msghdr *msg, struct sock *other,
>  
>  	spin_lock(&other->sk_receive_queue.lock);
>  	WRITE_ONCE(ousk->oob_skb, skb);
> +	WRITE_ONCE(ousk->inq_len, ousk->inq_len + 1);
>  	__skb_queue_tail(&other->sk_receive_queue, skb);
>  	spin_unlock(&other->sk_receive_queue.lock);
>  
> @@ -2319,6 +2320,7 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
>  	struct sock *sk = sock->sk;
>  	struct sk_buff *skb = NULL;
>  	struct sock *other = NULL;
> +	struct unix_sock *otheru;
>  	struct scm_cookie scm;
>  	bool fds_sent = false;
>  	int err, sent = 0;
> @@ -2342,14 +2344,16 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
>  	if (msg->msg_namelen) {
>  		err = READ_ONCE(sk->sk_state) == TCP_ESTABLISHED ? -EISCONN : -EOPNOTSUPP;
>  		goto out_err;
> -	} else {
> -		other = unix_peer(sk);
> -		if (!other) {
> -			err = -ENOTCONN;
> -			goto out_err;
> -		}
>  	}
>  
> +	other = unix_peer(sk);
> +	if (!other) {
> +		err = -ENOTCONN;
> +		goto out_err;
> +	}
> +
> +	otheru = unix_sk(other);
> +
>  	if (READ_ONCE(sk->sk_shutdown) & SEND_SHUTDOWN)
>  		goto out_pipe;
>  
> @@ -2418,7 +2422,12 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
>  
>  		unix_maybe_add_creds(skb, sk, other);
>  		scm_stat_add(other, skb);
> -		skb_queue_tail(&other->sk_receive_queue, skb);
> +
> +		spin_lock(&other->sk_receive_queue.lock);
> +		WRITE_ONCE(otheru->inq_len, otheru->inq_len + skb->len);
> +		__skb_queue_tail(&other->sk_receive_queue, skb);
> +		spin_unlock(&other->sk_receive_queue.lock);
> +

The change from spin_lock_irqsave (taken inside skb_queue_tail()) to a
plain spin_lock here and below is intentional, I assume. If respinning,
it is worth stating that explicitly.
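
For readers without the series at hand, here is a minimal userspace sketch of the interface whose cost the patch reduces: ioctl(SIOCINQ) on an AF_UNIX SOCK_STREAM socket. It is not part of the patch. The helper names stream_inq() and inq_demo() are hypothetical, and the fallback define assumes Linux, where SIOCINQ and FIONREAD are the same request.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumption: on Linux, SIOCINQ (linux/sockios.h) aliases FIONREAD. */
#ifndef SIOCINQ
#define SIOCINQ FIONREAD
#endif

/* Hypothetical helper: unread bytes queued on fd, or -1 on error. */
static int stream_inq(int fd)
{
	int n = 0;

	if (ioctl(fd, SIOCINQ, &n) < 0)
		return -1;
	return n;
}

/*
 * Hypothetical demo: queue two writes (5 and 7 bytes) on one end of a
 * socketpair, then report what SIOCINQ returns on the other end before
 * and after a 4-byte read. Returns 0 on success, -1 on failure.
 */
static int inq_demo(int *before, int *after)
{
	int sv[2];
	char buf[4];
	int ret = -1;

	assert(before != NULL && after != NULL);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	if (write(sv[0], "hello", 5) != 5 ||
	    write(sv[0], "world!!", 7) != 7)
		goto out;

	/* Both writes are still queued on the receiver. */
	*before = stream_inq(sv[1]);

	/* Partially consume the first write. */
	if (read(sv[1], buf, sizeof(buf)) != (ssize_t)sizeof(buf))
		goto out;

	*after = stream_inq(sv[1]);
	ret = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

For SOCK_STREAM the result is the total of unread bytes across all queued skbs, so the two values here should be 12 and 8. That sum is what unix_inq_len() currently computes by walking the queue and what the new inq_len field would cache.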
