Message-Id: <1192779319.27435.163.camel@twins>
Date: Fri, 19 Oct 2007 09:35:19 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Herbert Xu <herbert@...dor.apana.org.au>
Cc: "David S. Miller" <davem@...emloft.net>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
akpm@...ux-foundation.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linuxppc-dev@...abs.org, Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Linus Torvalds <torvalds@...ux-foundation.org>,
netdev@...r.kernel.org
Subject: Re: [NET]: Fix possible dev_deactivate race condition
On Fri, 2007-10-19 at 13:36 +0800, Herbert Xu wrote:
> On Fri, Oct 19, 2007 at 12:20:25PM +0800, Herbert Xu wrote:
> >
> > In fact this bug exists elsewhere too. For example, the network
> > stack does this in net/sched/sch_generic.c:
> >
> > /* Wait for outstanding qdisc_run calls. */
> > while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> > yield();
> >
> > This has the same problem as the current synchronize_irq code.
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index e01d576..b3b7420 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -556,6 +556,7 @@ void dev_deactivate(struct net_device *dev)
> {
> struct Qdisc *qdisc;
> struct sk_buff *skb;
> + int running;
>
> spin_lock_bh(&dev->queue_lock);
> qdisc = dev->qdisc;
> @@ -571,12 +572,31 @@ void dev_deactivate(struct net_device *dev)
>
> dev_watchdog_down(dev);
>
> - /* Wait for outstanding dev_queue_xmit calls. */
> + /* Wait for outstanding qdisc-less dev_queue_xmit calls. */
> synchronize_rcu();
>
> /* Wait for outstanding qdisc_run calls. */
> - while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> - yield();
> + do {
> + while (test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state))
> + yield();
> +
Ouch! Is there really no sane locking alternative? Hashed waitqueues,
like the ones used for the page lock, come to mind.
> + /*
> + * Double-check inside queue lock to ensure that all effects
> + * of the queue run are visible when we return.
> + */
> + spin_lock_bh(&dev->queue_lock);
> + running = test_bit(__LINK_STATE_QDISC_RUNNING, &dev->state);
> + spin_unlock_bh(&dev->queue_lock);
> +
> + /*
> + * The running flag should never be set at this point because
> + * we've already set dev->qdisc to noop_qdisc *inside* the same
> + * pair of spin locks. That is, if any qdisc_run starts after
> + * our initial test it should see the noop_qdisc and then
> + * clear the RUNNING bit before dropping the queue lock. So
> + * if it is set here then we've found a bug.
> + */
> + } while (WARN_ON_ONCE(running));
> }
>
> void dev_init_scheduler(struct net_device *dev)