lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Fri, 9 Jul 2021 18:21:49 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Cc:     Networking <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        KP Singh <kpsingh@...nel.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Antoine Tenart <atenart@...nel.org>,
        Alexander Lobakin <alobakin@...me>,
        Wei Wang <weiwan@...gle.com>, Taehee Yoo <ap420073@...il.com>,
        bpf <bpf@...r.kernel.org>, Abaci <abaci@...ux.alibaba.com>,
        Dust Li <dust.li@...ux.alibaba.com>,
        Jakub Kicinski <kuba@...nel.org>
Subject: Re: [PATCH net v2] xdp, net: fix use-after-free in bpf_xdp_link_release

On Fri, Jul 9, 2021 at 5:35 PM Xuan Zhuo <xuanzhuo@...ux.alibaba.com> wrote:
>
> On Fri, 9 Jul 2021 14:56:26 -0700, Andrii Nakryiko <andrii.nakryiko@...il.com> wrote:
> > On Fri, Jul 9, 2021 at 12:43 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > >
> > > On Fri,  9 Jul 2021 10:55:25 +0800 Xuan Zhuo wrote:
> > > > The problem occurs between dev_get_by_index() and dev_xdp_attach_link().
> > > > At this point, dev_xdp_uninstall() is called. Then xdp link will not be
> > > > detached automatically when dev is released. But link->dev already
> > > > points to dev, when xdp link is released, dev will still be accessed,
> > > > but dev has been released.
> > > >
> > > > dev_get_by_index()        |
> > > > link->dev = dev           |
> > > >                           |      rtnl_lock()
> > > >                           |      unregister_netdevice_many()
> > > >                           |          dev_xdp_uninstall()
> > > >                           |      rtnl_unlock()
> > > > rtnl_lock();              |
> > > > dev_xdp_attach_link()     |
> > > > rtnl_unlock();            |
> > > >                           |      netdev_run_todo() // dev released
> > > > bpf_xdp_link_release()    |
> > > >     /* access dev.        |
> > > >        use-after-free */  |
> > > >
> > > > This patch adds a check of dev->reg_state in dev_xdp_attach_link(). If
> > > > dev has been called release, it will return -EINVAL.
> > >
> > > Please make sure to include a Fixes tag.
> > >
> > > I must say I prefer something closet to v1. Maybe put the if
> > > in the callee? Making ndo calls to unregistered netdevs is
> > > not legit, it will be confusing for a person reading this
> > > code to have to search callees to find where unregistered
> > > netdevs are rejected.
> >
> > So I'm a bit confused about the intended use of dev_get_by_index(). It
> > doesn't seem to be checking that device is unregistered and happily
> > returns dev with refcnt bumped even though device is going away. Is it
> > the intention that every caller of dev_get_by_index() needs to check
> > the state of the device *and* do any subsequent actions under the same
> > rtnl_lock/rtnl_unlock region? Seems a bit fragile. I suspect doing
> > this state check inside dev_get_by_index() would have unintended
> > consequences, though, right?
>
> In the function unregister_netdevice_many(), dev will be deleted from the linked
> list, so after this, dev_get_by_index() will not return dev. If it is not in
> rtnl_lock, subsequent use of dev is to check reg_state.
>

Ah, I see, makes sense, if we do dev lookup and attachment under the
same lock then we either won't get the device or at the time of
attachment it will be valid.

> So I think, maybe the version of v1 does not have the problem you mentioned.
> After calling rtnl_lock, we get dev from dev_get_by_index(). If it succeeds, we
> execute the following process, and if it fails, we return an error directly.
>
>
> >
> > BTW, seems like netlink code doesn't check the state of the device and
> > will report successful attachment to the dev that's unregistered? Is
> > this something we should fix as well?
>
> There is no such problem here, because all netlink operations are protected by
> rtnl_lock. In the protection of rtnl_lock, it is completely safe to get dev and
> attach link or prog.
>

Ok, I see, one big rtnl_lock saves netlink :)

>
> >
> > Xuan, if we do go with this approach, that dev->reg_state check should
> > probably be done in dev_xdp_attach() instead, which is called for both
> > bpf_link-based and bpf_prog-based XDP attachment.
>
> As mentioned above, since the entire bpf prog operation is protected by
> rtnl_lock, dev_xdp_attach() does not need to check the status of dev.
>
> >
> > If not, then the cleanest solution would be to make this check right
> > before dev_xdp_attach_link (though it's not clear what are we gaining
> > with that, if we ever have another user of dev_xdp_attach_link beside
> > bpf_xdp_link_attach, we'll probably miss similar situation), instead
> > of spreading out rtnl_unlock.
> >
> > BTW, regardless of the approach, we still need to do link->dev = NULL
> > if dev_xdp_attach_link() errors out.
>
> I think I understand what you mean now.

Yeah, this is a problem regardless.

Btw, I was also thinking to move dev_get_by_index right before
dev_xdp_attach_link inside a tight rntl_lock/rtnl_unlock region after
bpf_link is allocated, but that seems pretty bad if user,
intentionally or not, passes wrong ifindex. We'll be allocated a bunch
of unnecessary memory and deferring freeing it for no good reason. So
let's go with your v1 and link->dev = NULL to cover the clean up bug.
Thanks!

>
> Thanks.
>
> >
> >
> > >
> > > > Reported-by: Abaci <abaci@...ux.alibaba.com>
> > > > Signed-off-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
> > > > Reviewed-by: Dust Li <dust.li@...ux.alibaba.com>
> > >
> > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > index c253c2aafe97..63c9a46ca853 100644
> > > > --- a/net/core/dev.c
> > > > +++ b/net/core/dev.c
> > > > @@ -9544,6 +9544,10 @@ static int dev_xdp_attach_link(struct net_device *dev,
> > > >                              struct netlink_ext_ack *extack,
> > > >                              struct bpf_xdp_link *link)
> > > >  {
> > > > +     /* ensure the dev state is ok */
> > > > +     if (dev->reg_state != NETREG_REGISTERED)
> > > > +             return -EINVAL;
> > > > +
> > > >       return dev_xdp_attach(dev, extack, link, NULL, NULL, link->flags);
> > > >  }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ