lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co>
Date: Sun, 9 Jun 2024 13:28:34 +0200
From: Michal Luczaj <mhal@...x.co>
To: Kuniyuki Iwashima <kuniyu@...zon.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>
Cc: Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org,
 Cong Wang <cong.wang@...edance.com>
Subject: Re: [PATCH v2 net 01/15] af_unix: Set sk->sk_state under
 unix_state_lock() for truly disconencted peer.

On 6/4/24 18:52, Kuniyuki Iwashima wrote:
> When a SOCK_DGRAM socket connect()s to another socket, the both sockets'
> sk->sk_state are changed to TCP_ESTABLISHED so that we can register them
> to BPF SOCKMAP. (...)

Speaking of af_unix and sockmap, SOCK_STREAM has a tiny window for
bpf(BPF_MAP_UPDATE_ELEM) and unix_stream_connect() to race: when
sock_map_sk_state_allowed() passes (sk_state == TCP_ESTABLISHED), but
unix_peer(sk) in unix_stream_bpf_update_proto() _still_ returns NULL:

	T0 bpf				T1 connect
	======				==========

				WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED)
sock_map_sk_state_allowed(sk)
...
sk_pair = unix_peer(sk)
sock_hold(sk_pair)
				sock_hold(newsk)
				smp_mb__after_atomic()
				unix_peer(sk) = newsk
				unix_state_unlock(sk)

With mdelay(1) stuffed in unix_stream_connect():

[  902.277593] BUG: kernel NULL pointer dereference, address: 0000000000000080
[  902.277633] #PF: supervisor write access in kernel mode
[  902.277661] #PF: error_code(0x0002) - not-present page
[  902.277688] PGD 107191067 P4D 107191067 PUD 10f63c067 PMD 0
[  902.277716] Oops: Oops: 0002 [#23] PREEMPT SMP NOPTI
[  902.277742] CPU: 2 PID: 1505 Comm: a.out Tainted: G      D            6.10.0-rc1+ #130
[  902.277769] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[  902.277793] RIP: 0010:unix_stream_bpf_update_proto+0xa1/0x150

Setting TCP_ESTABLISHED _after_ unix_peer() fixes the issue, so how about
something like

@@ -1631,12 +1631,13 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
        /* Set credentials */
        copy_peercred(sk, other);

-       sock->state     = SS_CONNECTED;
-       WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED);
        sock_hold(newsk);
+       smp_mb__after_atomic(); /* sock_hold() does an atomic_inc() */
+       WRITE_ONCE(unix_peer(sk), newsk);
+       smp_wmb(); /* ensure peer is set before sk_state */

-       smp_mb__after_atomic(); /* sock_hold() does an atomic_inc() */
-       unix_peer(sk)   = newsk;
+       sock->state = SS_CONNECTED;
+       WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED);

        unix_state_unlock(sk);

@@ -180,7 +180,8 @@ int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool r
         * be a single matching destroy operation.
         */
        if (!psock->sk_pair) {
-               sk_pair = unix_peer(sk);
+               smp_rmb();
+               sk_pair = READ_ONCE(unix_peer(sk));
                sock_hold(sk_pair);
                psock->sk_pair = sk_pair;
        }

This should keep things ordered and lockless... I hope.

Alternatively, maybe it would be better just to make BPF respect the unix
state lock?

@@ -180,6 +180,8 @@ int unix_stream_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool r
 	 * be a single matching destroy operation.
 	 */
 	if (!psock->sk_pair) {
+               unix_state_lock(sk);
                sk_pair = unix_peer(sk);
+               unix_state_unlock(sk);
 		sock_hold(sk_pair);
 		psock->sk_pair = sk_pair;

What do you think?

Thanks,
Michal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ