Message-ID: <20240617182146.62299-1-kuniyu@amazon.com>
Date: Mon, 17 Jun 2024 11:21:46 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <mhal@...x.co>
CC: <cong.wang@...edance.com>, <davem@...emloft.net>, <edumazet@...gle.com>,
	<kuba@...nel.org>, <kuni1840@...il.com>, <kuniyu@...zon.com>,
	<netdev@...r.kernel.org>, <pabeni@...hat.com>
Subject: Re: [PATCH v2 net 01/15] af_unix: Set sk->sk_state under unix_state_lock() for truly disconencted peer.

From: Michal Luczaj <mhal@...x.co>
Date: Mon, 17 Jun 2024 01:28:52 +0200
> On 6/10/24 19:49, Kuniyuki Iwashima wrote:
> > From: Michal Luczaj <mhal@...x.co>
> > Date: Mon, 10 Jun 2024 14:55:08 +0200
> >> On 6/9/24 23:03, Kuniyuki Iwashima wrote:
> >>> (...)
> >>> Sorry, I think I was wrong and we can't use smp_store_release()
> >>> and smp_load_acquire(), and smp_[rw]mb() is needed.
> >>>
> >>> Given we avoid adding code in the hotpath in the original fix
> >>> 8866730aed510 [0], I prefer adding unix_state_lock() in the SOCKMAP
> >>> path again.
> >>>
> >>> [0]: https://lore.kernel.org/bpf/6545bc9f7e443_3358c208ae@john.notmuch/
> >>
> >> You're saying smp_wmb() in connect() is too much for the hot path, do I
> >> understand correctly?
> > 
> > Yes, and now I think WARN_ON_ONCE() would be enough because it's unlikely
> > that the delay happens between the two store ops and concurrent bpf()
> > is in progress.
> > 
> > If syzkaller was able to hit this on vanilla kernel, we can revisit.
> > 
> > Then, probably we could just do s/WARN_ON_ONCE/unlikely/ because users
> > who call bpf() in such a way know that the state was TCP_CLOSE while
> > calling bpf().
> > 
> > ---8<---
> > diff --git a/net/unix/unix_bpf.c b/net/unix/unix_bpf.c
> > index bd84785bf8d6..46dc747349f2 100644
> > --- a/net/unix/unix_bpf.c
> > +++ b/net/unix/unix_bpf.c
> > @@ -181,6 +181,9 @@ int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool r
> >  	 */
> >  	if (!psock->sk_pair) {
> >  		sk_pair = unix_peer(sk);
> > +		if (WARN_ON_ONCE(!sk_pair))
> > +			return -EINVAL;
> > +
> >  		sock_hold(sk_pair);
> >  		psock->sk_pair = sk_pair;
> >  	}
> > ---8<---
> 
> Oh. That's a peculiar approach :) But, hey, it's your call.
> 
> Another AF_UNIX sockmap issue is with OOB. When OOB packet is sent, skb is
> added to recv queue, but also u->oob_skb is set. Here's the problem: when
> this skb goes through bpf_sk_redirect_map() and is moved between socks,
> oob_skb remains set on the original sock.

Good catch!

> 
> [   23.688994] WARNING: CPU: 2 PID: 993 at net/unix/garbage.c:351 unix_collect_queue+0x6c/0xb0
> [   23.689019] CPU: 2 PID: 993 Comm: kworker/u32:13 Not tainted 6.10.0-rc2+ #137
> [   23.689021] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> [   23.689024] Workqueue: events_unbound __unix_gc
> [   23.689027] RIP: 0010:unix_collect_queue+0x6c/0xb0
> 
> I wanted to write a patch, but then I realized I'm not sure what's the
> expected behaviour. Should the oob_skb setting follow to the skb's new sock
> or should it be dropped (similarly to what is happening today with
> scm_fp_list, i.e. redirect strips inflights)?

The former would require a large refactoring, as we would need to check
whether the redirect is done with BPF_F_INGRESS, whether the redirected
sk is also SOCK_STREAM, etc.

So, I'd go with the latter.  Probably we can check if skb is u->oob_skb
and drop OOB data and retry next in unix_stream_read_skb(), and forbid
MSG_OOB in unix_bpf_recvmsg().
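
To illustrate the "drop OOB data and retry next" idea, here is a rough
userspace model, not kernel code.  struct model_skb, struct model_sock,
model_enqueue() and model_read_skb() are hypothetical stand-ins for
sk_buff, unix_sock and the receive-queue handling, and locking and
refcounting are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct model_skb {
	struct model_skb *next;
	int id;
};

/* Hypothetical stand-in for struct unix_sock. */
struct model_sock {
	struct model_skb *head;		/* receive queue, oldest first */
	struct model_skb *oob_skb;	/* plays the role of u->oob_skb */
	int dropped;			/* number of OOB skbs discarded */
};

/* Append skb to the receive queue; remember it if it carries OOB data. */
static void model_enqueue(struct model_sock *u, struct model_skb *skb,
			  int is_oob)
{
	struct model_skb **pp = &u->head;

	assert(u && skb);
	skb->next = NULL;
	while (*pp)
		pp = &(*pp)->next;
	*pp = skb;
	if (is_oob)
		u->oob_skb = skb;
}

/*
 * Dequeue the next skb for the redirect path.  If it is the one that
 * oob_skb points to, clear the pointer, drop the skb and retry with
 * the following one, so no stale oob_skb is left on the original sock.
 */
static struct model_skb *model_read_skb(struct model_sock *u)
{
	for (;;) {
		struct model_skb *skb = u->head;

		if (!skb)
			return NULL;
		u->head = skb->next;
		skb->next = NULL;
		if (skb == u->oob_skb) {
			u->oob_skb = NULL;
			u->dropped++;
			continue;	/* retry next */
		}
		return skb;
	}
}
```

With three skbs queued and the second one marked OOB, the reader in this
model gets the first and the third, and oob_skb ends up NULL.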

Both features were merged in 5.15, with OOB landing a month after SOCKMAP,
so the Fixes tag would be 314001f0bf927 again.  Since that commit,
ioctl(SIOCATMARK) (and epoll(EPOLLPRI), after d9a232d435dcc was backported
to all stable trees) has been lying once an OOB msg is redirected.

Thanks!
