netdev - Re: [PATCH v2 net 01/15] af_unix: Set sk->sk_state under unix_state

Message-ID: <4b894ffd-9fe2-4c15-a901-6765ab538a01@rbox.co>
Date: Mon, 17 Jun 2024 01:28:52 +0200
From: Michal Luczaj <mhal@...x.co>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: cong.wang@...edance.com, davem@...emloft.net, edumazet@...gle.com,
 kuba@...nel.org, kuni1840@...il.com, netdev@...r.kernel.org,
 pabeni@...hat.com
Subject: Re: [PATCH v2 net 01/15] af_unix: Set sk->sk_state under
 unix_state_lock() for truly disconencted peer.

On 6/10/24 19:49, Kuniyuki Iwashima wrote:
> From: Michal Luczaj <mhal@...x.co>
> Date: Mon, 10 Jun 2024 14:55:08 +0200
>> On 6/9/24 23:03, Kuniyuki Iwashima wrote:
>>> (...)
>>> Sorry, I think I was wrong and we can't use smp_store_release()
>>> and smp_load_acquire(), and smp_[rw]mb() is needed.
>>>
>>> Given we avoid adding code in the hotpath in the original fix
>>> 8866730aed510 [0], I prefer adding unix_state_lock() in the SOCKMAP
>>> path again.
>>>
>>> [0]: https://lore.kernel.org/bpf/6545bc9f7e443_3358c208ae@john.notmuch/
>>
>> You're saying smp_wmb() in connect() is too much for the hot path, do I
>> understand correctly?
> 
> Yes, and now I think WARN_ON_ONCE() would be enough because it's unlikely
> that the delay happens between the two store ops and concurrent bpf()
> is in progress.
> 
> If syzkaller was able to hit this on vanilla kernel, we can revisit.
> 
> Then, probably we could just do s/WARN_ON_ONCE/unlikely/ because users
> who call bpf() in such a way know that the state was TCP_CLOSE while
> calling bpf().
> 
> ---8<---
> diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> index bd84785bf8d6..46dc747349f2 100644
> --- a/net/unix/unix_bpf.c
> +++ b/net/unix/unix_bpf.c
> @@ -181,6 +181,9 @@ int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool r
>  	 */
>  	if (!psock->sk_pair) {
>  		sk_pair = unix_peer(sk);
> +		if (WARN_ON_ONCE(!sk_pair))
> +			return -EINVAL;
> +
>  		sock_hold(sk_pair);
>  		psock->sk_pair = sk_pair;
>  	}
> ---8<---

Oh. That's a peculiar approach :) But, hey, it's your call.

Another AF_UNIX sockmap issue is with OOB. When an OOB packet is sent, the
skb is added to the recv queue, but u->oob_skb is also set. Here's the
problem: when this skb goes through bpf_sk_redirect_map() and is moved
between socks, oob_skb remains set on the original sock.

[   23.688994] WARNING: CPU: 2 PID: 993 at net/unix/garbage.c:351 unix_collect_queue+0x6c/0xb0
[   23.689019] CPU: 2 PID: 993 Comm: kworker/u32:13 Not tainted 6.10.0-rc2+ #137
[   23.689021] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[   23.689024] Workqueue: events_unbound __unix_gc
[   23.689027] RIP: 0010:unix_collect_queue+0x6c/0xb0
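
To make the stale pointer concrete, here is a toy userspace model in C. It is
only a sketch: toy_sock, toy_skb and the toy_*() helpers are hypothetical
stand-ins invented for illustration, not the kernel's struct unix_sock or
struct sk_buff, and toy_redirect() only mimics the effect described above,
not the real sockmap redirect path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel structures. */
struct toy_skb {
	struct toy_skb *next;
};

struct toy_sock {
	struct toy_skb *head;		/* models the recv queue */
	struct toy_skb *oob_skb;	/* models u->oob_skb */
};

/* Append skb to the tail of sk's queue. */
static void toy_enqueue(struct toy_sock *sk, struct toy_skb *skb)
{
	struct toy_skb **pp = &sk->head;

	while (*pp)
		pp = &(*pp)->next;
	skb->next = NULL;
	*pp = skb;
}

/* Return 1 if skb is linked on sk's queue, 0 otherwise. */
static int toy_queued(const struct toy_sock *sk, const struct toy_skb *skb)
{
	const struct toy_skb *p;

	for (p = sk->head; p; p = p->next)
		if (p == skb)
			return 1;
	return 0;
}

/* OOB send: the skb is queued and also remembered in oob_skb. */
static void toy_send_oob(struct toy_sock *sk, struct toy_skb *skb)
{
	toy_enqueue(sk, skb);
	sk->oob_skb = skb;
}

/* Redirect as described above: move the head skb, leave oob_skb alone. */
static void toy_redirect(struct toy_sock *from, struct toy_sock *to)
{
	struct toy_skb *skb = from->head;

	if (!skb)
		return;
	from->head = skb->next;
	toy_enqueue(to, skb);
}

/* Return 1 if sk's oob_skb points at an skb that left sk's queue. */
static int toy_oob_is_stale(const struct toy_sock *sk)
{
	return sk->oob_skb && !toy_queued(sk, sk->oob_skb);
}

/* Walk through the scenario: send OOB on a, then redirect a -> b. */
static void toy_demo(void)
{
	struct toy_sock a = { NULL, NULL }, b = { NULL, NULL };
	struct toy_skb skb = { NULL };

	toy_send_oob(&a, &skb);
	assert(!toy_oob_is_stale(&a));	/* skb still on a's queue */

	toy_redirect(&a, &b);
	assert(toy_queued(&b, &skb));	/* skb now lives on b */
	assert(toy_oob_is_stale(&a));	/* a still points at it */
}
```

Calling toy_demo() from a main() runs the scenario. In this model the sock
that sent nothing keeps a pointer to an skb it no longer owns, which is the
kind of inconsistency the warning above appears to be tripping over.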

I wanted to write a patch, but then I realized I'm not sure what the
expected behaviour is. Should the oob_skb setting follow the skb to its new
sock, or should it be dropped (similarly to what happens today with
scm_fp_list, i.e. redirect strips inflights)?
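
The two candidate behaviours can be sketched in a toy userspace model. This
is an assumption-laden illustration, not a proposed patch: sketch_sock,
sketch_skb and the sketch_*() helpers are hypothetical names, the queue is
reduced to a single slot, and "drop" is read here as "clear the OOB marking
and deliver the skb as ordinary data".

```c
#include <stddef.h>

/* Hypothetical stand-ins, NOT the kernel structures. */
struct sketch_skb {
	int payload;
};

struct sketch_sock {
	struct sketch_skb *queued;	/* single-slot model of the recv queue */
	struct sketch_skb *oob_skb;	/* models u->oob_skb */
};

/* OOB send: queue the skb and mark it as the sock's OOB skb. */
static void sketch_send_oob(struct sketch_sock *sk, struct sketch_skb *skb)
{
	sk->queued = skb;
	sk->oob_skb = skb;
}

/* Option A: the OOB marking follows the skb to its new sock. */
static void sketch_redirect_follow(struct sketch_sock *from,
				   struct sketch_sock *to)
{
	struct sketch_skb *skb = from->queued;

	if (!skb)
		return;
	from->queued = NULL;
	to->queued = skb;
	if (from->oob_skb == skb) {
		from->oob_skb = NULL;
		to->oob_skb = skb;
	}
}

/* Option B: the OOB marking is dropped on redirect. */
static void sketch_redirect_drop(struct sketch_sock *from,
				 struct sketch_sock *to)
{
	struct sketch_skb *skb = from->queued;

	if (!skb)
		return;
	from->queued = NULL;
	to->queued = skb;
	if (from->oob_skb == skb)
		from->oob_skb = NULL;	/* receiver sees ordinary data */
}
```

Either way the original sock ends up with oob_skb cleared; the options differ
only in whether the receiving sock sees the data as OOB.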