netdev - Re: [PATCH net-next] xfrm: add missing rcu verbs to fix data-race

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANn89iKjMD9r0+8CGdAOFpm3jPYtUsymXaDeG2US2WRzFicueg@mail.gmail.com>
Date:   Fri, 8 Nov 2019 18:51:46 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     Herbert Xu <herbert@...dor.apana.org.au>
Cc:     "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        syzbot <syzkaller@...glegroups.com>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Will Deacon <will@...nel.org>
Subject: Re: [PATCH net-next] xfrm: add missing rcu verbs to fix data-race

On Fri, Nov 8, 2019 at 5:33 PM Herbert Xu <herbert@...dor.apana.org.au> wrote:
>
> On Thu, Nov 07, 2019 at 07:47:01PM -0800, Eric Dumazet wrote:
> > KCSAN reported a data-race in xfrm_lookup_with_ifid() and
> > xfrm_sk_free_policy() [1]
>
> I'm very uncomfortable with these warnings being enabled in KASAN
> unless there is a way to opt out of them without adding unnecessary
> READ_ONCE/WRITE_ONCE tags.
>
> All they do is create patches such as this one that simply adds
> these tags without resolving the underlying issues (if there are
> any).
>
> > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > index aa08a7a5f6ac5836524dd34115cd57e2675e574d..8884575ae2135b739a2c316bf8a92b56d6cc807c 100644
> > --- a/include/net/xfrm.h
> > +++ b/include/net/xfrm.h
> > @@ -1093,7 +1093,7 @@ static inline int __xfrm_policy_check2(struct sock *sk, int dir,
> >       struct net *net = dev_net(skb->dev);
> >       int ndir = dir | (reverse ? XFRM_POLICY_MASK + 1 : 0);
> >
> > -     if (sk && sk->sk_policy[XFRM_POLICY_IN])
> > +     if (sk && rcu_access_pointer(sk->sk_policy[XFRM_POLICY_IN]))
> >               return __xfrm_policy_check(sk, ndir, skb, family);
>
> This is simply an optimisation and we don't care if we get it
> wrong due to the lack of READ_ONCE/WRITE_ONCE.  Even with the
> READ_ONCE tag, there is nothing stopping a policy from being
> added after the test returns false, or all policies from being
> deleted after the test returns true.
>
> IOW this is simply unnecessary.

How do you suggest to silence the KCSAN warnings then ?

>
> > @@ -1171,7 +1171,8 @@ static inline int xfrm_sk_clone_policy(struct sock *sk, const struct sock *osk)
> >  {
> >       sk->sk_policy[0] = NULL;
> >       sk->sk_policy[1] = NULL;
> > -     if (unlikely(osk->sk_policy[0] || osk->sk_policy[1]))
> > +     if (unlikely(rcu_access_pointer(osk->sk_policy[0]) ||
> > +                  rcu_access_pointer(osk->sk_policy[1])))
> >               return __xfrm_sk_clone_policy(sk, osk);
> >       return 0;
>
> These on the other hand are done under socket lock.  IOW they
> are completely synchronised with respect to the write side which
> is also under socket lock so no tagging is necessary.
>
> Incidentally rcu_access_pointer is now practically the same as
> rcu_dereference because the smp_read_barrier_depends has been
> moved over to READ_ONCE.  In fact there is no performance difference
> between rcu_dereference and rcu_dereference_raw either.
>
> I think we should either remove the smp_barrier_depends from
> rcu_access_pointer, or just get rid of rcu_access_pointer completely.
>
> > @@ -1185,12 +1186,12 @@ static inline void xfrm_sk_free_policy(struct sock *sk)
> >       pol = rcu_dereference_protected(sk->sk_policy[0], 1);
> >       if (unlikely(pol != NULL)) {
> >               xfrm_policy_delete(pol, XFRM_POLICY_MAX);
> > -             sk->sk_policy[0] = NULL;
> > +             rcu_assign_pointer(sk->sk_policy[0], NULL);
> >       }
> >       pol = rcu_dereference_protected(sk->sk_policy[1], 1);
> >       if (unlikely(pol != NULL)) {
> >               xfrm_policy_delete(pol, XFRM_POLICY_MAX+1);
> > -             sk->sk_policy[1] = NULL;
> > +             rcu_assign_pointer(sk->sk_policy[1], NULL);
>
> These should use RCU_INIT_POINTER.

This does not matter anymore.
Please double check rcu_assign_pointer(), it handles the NULL case.

commit 3a37f7275cda5ad25c1fe9be8f20c76c60d175fa
Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Date:   Sun May 1 18:46:54 2016 -0700

    rcu: No ordering for rcu_assign_pointer() of NULL

    This commit does a compile-time check for rcu_assign_pointer() of NULL,
    and uses WRITE_ONCE() rather than smp_store_release() in that case.

    Reported-by: Christoph Hellwig <hch@...radead.org>
    Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>


>
> > diff --git a/net/smc/smc.h b/net/smc/smc.h
> > index be11ba41190fb58be3ce9e8ab1a9ea4f8aa6a05b..4324dd39de99ba5967e1325746a2f5eff4baf2e7 100644
> > --- a/net/smc/smc.h
> > +++ b/net/smc/smc.h
> > @@ -253,8 +253,8 @@ static inline u32 ntoh24(u8 *net)
> >  #ifdef CONFIG_XFRM
> >  static inline bool using_ipsec(struct smc_sock *smc)
> >  {
> > -     return (smc->clcsock->sk->sk_policy[0] ||
> > -             smc->clcsock->sk->sk_policy[1]) ? true : false;
> > +     return (rcu_access_pointer(smc->clcsock->sk->sk_policy[0]) ||
> > +             rcu_access_pointer(smc->clcsock->sk->sk_policy[1])) ? true : false;
> >  }
> >  #else
> >  static inline bool using_ipsec(struct smc_sock *smc)
>
> Now this could actually be a real bug because it doesn't appear
> to be an optimisation and it doesn't seem to be holding the socket
> lock either.
>
> Again this shows that we shouldn't just add READ_ONCE/WRITE_ONCE
> tags because they might be papering over real bugs and help them
> hide better because people will automatically assume that using
> READ_ONCE/WRITE_ONCE means that the code is *safe*.
>

Hmm, I will leave to you the resolution of this bug.

Thanks.