Message-ID: <CAG48ez2wyQwc5XMKKw8835-4t6+x=X3kPY_CPUqZeh=xQ2krqQ@mail.gmail.com>
Date: Fri, 28 Jan 2022 03:48:15 +0100
From: Jann Horn <jannh@...gle.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Oliver Neukum <oneukum@...e.com>
Subject: Re: [PATCH net] net: dev: Detect dev_hold() after netdev_wait_allrefs()

On Fri, Jan 28, 2022 at 3:25 AM Eric Dumazet <edumazet@...gle.com> wrote:
> On Thu, Jan 27, 2022 at 6:14 PM Jann Horn <jannh@...gle.com> wrote:
> > On Fri, Jan 28, 2022 at 3:09 AM Eric Dumazet <edumazet@...gle.com> wrote:
> > > On Thu, Jan 27, 2022 at 5:43 PM Jann Horn <jannh@...gle.com> wrote:
> > > >
> > > > I've run into a bug where dev_hold() was being called after
> > > > netdev_wait_allrefs(). But at that point, the device is already going
> > > > away, and dev_hold() can't stop that anymore.
> > > >
> > > > To make such problems easier to diagnose in the future:
> > > >
> > > >  - For CONFIG_PCPU_DEV_REFCNT builds: Recheck in free_netdev() whether
> > > >    the net refcount has been elevated. If this is detected, WARN() and
> > > >    leak the object (to prevent worse consequences from a
> > > >    use-after-free).
> > > >  - For builds without CONFIG_PCPU_DEV_REFCNT: Set the refcount to zero.
> > > >    This signals to the generic refcount infrastructure that any attempt
> > > >    to increment the refcount later is a bug.
> > > >
> > > > Signed-off-by: Jann Horn <jannh@...gle.com>
> > > > ---
> > > > net/core/dev.c | 18 +++++++++++++++++-
> > > > 1 file changed, 17 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > index 1baab07820f6..f7916c0d226d 100644
> > > > --- a/net/core/dev.c
> > > > +++ b/net/core/dev.c
> > > > @@ -9949,8 +9949,18 @@ void netdev_run_todo(void)
> > > >
> > > >  		netdev_wait_allrefs(dev);
> > > >
> > > > +		/* Drop the netdev refcount (which should be 1 at this point)
> > > > +		 * to zero. If we're using the generic refcount code, this will
> > > > +		 * tell it that any dev_hold() after this point is a bug.
> > > > +		 */
> > > > +#ifdef CONFIG_PCPU_DEV_REFCNT
> > > > +		this_cpu_dec(*dev->pcpu_refcnt);
> > > > +		BUG_ON(netdev_refcnt_read(dev) != 0);
> > > > +#else
> > > > +		BUG_ON(!refcount_dec_and_test(&dev->dev_refcnt));
> > > > +#endif
> > > > +
> > > >  		/* paranoia */
> > > > -		BUG_ON(netdev_refcnt_read(dev) != 1);
> > > >  		BUG_ON(!list_empty(&dev->ptype_all));
> > > >  		BUG_ON(!list_empty(&dev->ptype_specific));
> > > >  		WARN_ON(rcu_access_pointer(dev->ip_ptr));
> > > > @@ -10293,6 +10303,12 @@ void free_netdev(struct net_device *dev)
> > > >  	free_percpu(dev->xdp_bulkq);
> > > >  	dev->xdp_bulkq = NULL;
> > > >
> > > > +	/* Recheck in case someone called dev_hold() between
> > > > +	 * netdev_wait_allrefs() and here.
> > > > +	 */
> > >
> > > At this point, dev->pcpu_refcnt per-cpu data has been freed already
> > > (CONFIG_PCPU_DEV_REFCNT=y)
> > >
> > > So this should probably crash, or at least UAF ?
> >
> > Oh. Whoops. That's what I get for only testing without CONFIG_PCPU_DEV_REFCNT...
> >
> > I guess a better place to put the new check would be directly after
> > checking for "dev->reg_state == NETREG_UNREGISTERING"? Like this?
> >
> > 	if (dev->reg_state == NETREG_UNREGISTERING) {
> > 		ASSERT_RTNL();
> > 		dev->needs_free_netdev = true;
> > 		return;
> > 	}
> >
> > 	/* Recheck in case someone called dev_hold() between
> > 	 * netdev_wait_allrefs() and here.
> > 	 */
> > 	if (WARN_ON(netdev_refcnt_read(dev) != 0))
> > 		return; /* leak memory, otherwise we might get UAF */
> >
> > 	netif_free_tx_queues(dev);
> > 	netif_free_rx_queues(dev);
>
> Maybe another solution would be to leverage the recent dev_hold_track().
>
> We could add a dead boolean to 'struct ref_tracker_dir' (dev->refcnt_tracker)
>
[...]
> @@ -72,6 +73,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
>  	gfp_t gfp_mask = gfp;
>  	unsigned long flags;
>
> +	WARN_ON_ONCE(dir->dead);

When someone is using NET_DEV_REFCNT_TRACKER for slow debugging, they
should also be able to take the performance hit of disabling
CONFIG_PCPU_DEV_REFCNT and rely on the normal increment-from-zero
detection of the generic refcount code, right? (Maybe
NET_DEV_REFCNT_TRACKER should depend on !CONFIG_PCPU_DEV_REFCNT?)

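To illustrate what "increment-from-zero detection" means here, below is a
minimal userspace sketch using C11 atomics. It is only a model: the
model_refcount type and the model_* helpers are hypothetical names made up
for this example, and the kernel's real refcount_t does more (saturation,
for instance). The point is just that once teardown has dropped the count
to zero, a later increment can be recognized as a bug at the moment it
happens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model; this is not the kernel's refcount_t. */
struct model_refcount {
	atomic_uint refs;
	unsigned int warnings;	/* increments that started from zero */
};

static void model_refcount_init(struct model_refcount *r, unsigned int n)
{
	atomic_init(&r->refs, n);
	r->warnings = 0;
}

/* Take a reference; complain if the object was already at zero. */
static bool model_refcount_inc(struct model_refcount *r)
{
	unsigned int old = atomic_fetch_add(&r->refs, 1);

	if (old == 0) {
		r->warnings++;
		fprintf(stderr, "model: increment from zero, likely use-after-free\n");
		return false;
	}
	return true;
}

/* Drop a reference; returns true if this was the last one. */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
	return atomic_fetch_sub(&r->refs, 1) == 1;
}

/* Walk through the teardown order; returns the number of warnings seen. */
static unsigned int model_demo(void)
{
	struct model_refcount r;
	bool last;

	model_refcount_init(&r, 1);		/* only one reference is left */
	last = model_refcount_dec_and_test(&r);	/* teardown drops it to zero */
	assert(last);
	(void)last;
	model_refcount_inc(&r);			/* late hold: gets flagged */
	return r.warnings;
}
```

In this model the late model_refcount_inc() call is the stand-in for a
dev_hold() after netdev_wait_allrefs(); it is reported immediately instead
of silently succeeding.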
My intent with the extra check in free_netdev() was to get some
limited detection for production systems that don't use
NET_DEV_REFCNT_TRACKER.
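
For the per-cpu case, a rough single-threaded userspace sketch of the
recheck idea follows. Again this is a model under stated assumptions, not
kernel code: model_netdev, MODEL_NR_CPUS and the model_* functions are
invented for illustration. A per-cpu increment cannot tell on its own that
the total was zero, so the only cheap option is to sum the counters once
more before freeing (and before the per-cpu storage itself is released, as
pointed out above) and to leak the object if the sum is nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Hypothetical stand-in for a device with per-cpu reference counters. */
struct model_netdev {
	int pcpu_refcnt[MODEL_NR_CPUS];
	bool freed;
};

/* Sum all per-cpu slots; only the total is meaningful. */
static int model_refcnt_read(const struct model_netdev *dev)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += dev->pcpu_refcnt[cpu];
	return sum;
}

/* A hold only touches the local slot, so it cannot see a zero total. */
static void model_dev_hold(struct model_netdev *dev, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	dev->pcpu_refcnt[cpu]++;
}

/* Returns true if the device was freed, false if it was leaked on purpose. */
static bool model_free_netdev(struct model_netdev *dev)
{
	if (model_refcnt_read(dev) != 0) {
		fprintf(stderr, "model: hold after wait, leaking device\n");
		return false;	/* leak rather than risk a use-after-free */
	}
	dev->freed = true;
	return true;
}
```

The detection here is limited in the same way as described above: it only
catches a late hold that happens before the recheck, not one that arrives
afterwards.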