lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1940b2ab-2678-45cf-bac8-9e8858a7b2ee@redhat.com>
Date: Thu, 19 Sep 2024 10:51:50 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Dmitry Antipov <dmantipov@...dex.ru>,
 John Fastabend <john.fastabend@...il.com>,
 Jakub Sitnicki <jakub@...udflare.com>
Cc: Cong Wang <xiyou.wangcong@...il.com>, Jakub Kicinski <kuba@...nel.org>,
 netdev@...r.kernel.org, lvc-project@...uxtesting.org,
 syzbot+f363afac6b0ace576f45@...kaller.appspotmail.com
Subject: Re: [PATCH net v2] net: sockmap: avoid race between
 sock_map_destroy() and sk_psock_put()

On 9/10/24 13:43, Dmitry Antipov wrote:
> Syzbot has triggered the following race condition:
> 
> On CPU0, 'sk_psock_drop()' (most likely scheduled from 'sock_map_unref()'
> called by 'sock_map_update_common()') is running at [1]:
> 
> void sk_psock_drop(struct sock *sk, struct sk_psock *psock)
> {
>          write_lock_bh(&sk->sk_callback_lock);
>          sk_psock_restore_proto(sk, psock);                              [1]
>          rcu_assign_sk_user_data(sk, NULL);                              [2]
>          ...
> }
> 
> If 'sock_map_destroy()' is scheduled on CPU1 at the same time, psock is
> always NULL at [3]. But, since [1] may be is in progress during [4], the
> value of 'saved_destroy' at this point is undefined:
> 
> void sock_map_destroy(struct sock *sk)
> {
>          void (*saved_destroy)(struct sock *sk);
>          struct sk_psock *psock;
> 
>          rcu_read_lock();
>          psock = sk_psock_get(sk);                                       [3]
>          if (unlikely(!psock)) {
>                  rcu_read_unlock();
>                  saved_destroy = READ_ONCE(sk->sk_prot)->destroy;        [4]
>          } else {
>                  saved_destroy = psock->saved_destroy;                   [5]
>                  sock_map_remove_links(sk, psock);
>                  rcu_read_unlock();
>                  sk_psock_stop(psock);
>                  sk_psock_put(sk, psock);
>          }
>          if (WARN_ON_ONCE(saved_destroy == sock_map_destroy))
>                  return;
>          if (saved_destroy)
>                  saved_destroy(sk);
> }
> 
> Fix this issue in 3 steps:
> 
> 1. Prefer 'sk_psock()' over 'sk_psock_get()' at [3]. Since zero
>     refcount is ignored, 'psock' is non-NULL until [2] is completed.
> 
> 2. Add read lock around [5], to make sure that [1] is not in progress
>     when the former is executed.
> 
> 3. Since 'sk_psock()' does not adjust reference counting, drop
>     'sk_psock_put()' and redundant 'sk_psock_stop()' (which is
>     executed by 'sk_psock_drop()' anyway).
> 
> Fixes: 5b4a79ba65a1 ("bpf, sockmap: Don't let sock_map_{close,destroy,unhash} call itself")
> Reported-by: syzbot+f363afac6b0ace576f45@...kaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=f363afac6b0ace576f45
> Signed-off-by: Dmitry Antipov <dmantipov@...dex.ru>

I think there is still Cong question pending: why this could not/ should 
not be addressed instead in the RDS code.

Thanks,

Paolo


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ