netdev - Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl


Message-ID: <20080813102701.GD5367@ff.dom.local>
Date:	Wed, 13 Aug 2008 10:27:01 +0000
From:	Jarek Poplawski <jarkao2@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	herbert@...dor.apana.org.au, netdev@...r.kernel.org
Subject: Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock().

On Wed, Aug 13, 2008 at 02:25:49AM -0700, David Miller wrote:
> From: Jarek Poplawski <jarkao2@...il.com>
> Date: Wed, 13 Aug 2008 06:53:03 +0000
> 
> > Otherwise, I think you had better send a code example using these
> > flags, so we can be sure there is no misunderstanding about this.
> 
> Here is my concrete proposal for a fix.
> 
> pkt_sched: Fix queue quiescence testing in dev_deactivate().
> 
> Based upon discussions with Jarek P. and Herbert Xu.
> 
> First, we're testing the wrong qdisc.  We just reset the device
> queue qdiscs to &noop_qdisc, so checking its state is completely
> pointless here.
> 
> We want to wait until the previous qdisc that was sitting at
> the ->qdisc pointer is not busy any more.  And that would be
> ->qdisc_sleeping.
> 
> Because of how we propagate the sampled qdisc pointer down into
> qdisc_run and friends via per-cpu ->output_queue and netif_schedule,
> we also have to wait for the __QDISC_STATE_SCHED bit to clear.

Of course, checking this properly needs more time, but it looks like it
could work. I have only two small doubts:

- in net_tx_action() we can hit a point just after clear_bit() where
neither of these bits is set. Of course, hitting this two times in a row
seems very improbable, yet it is possible, and a lock doesn't help
here, so some change around this would probably make it nicer.

- couldn't there be a longer ping-pong between qdisc_run() and
net_tx_action(), when dev_requeue_skb() sends it back to
__netif_schedule() and so on (with NETDEV_TX_BUSY)?
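
To make the first point concrete, here is a minimal userspace sketch of
the window, not kernel code. It assumes net_tx_action() clears the SCHED
bit before qdisc_run() sets the RUNNING bit. The names model_qdisc,
model_*_bit(), model_qdisc_is_busy() and model_tx_action_trace() are
hypothetical stand-ins, and the model is single-threaded, so it shows
only the ordering of the bit changes, not real concurrency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the __QDISC_STATE_* bit numbers. */
enum { STATE_RUNNING = 0, STATE_SCHED = 1 };

/* Hypothetical model of the qdisc state word only. */
struct model_qdisc {
	unsigned long state;
};

/* Plain, non-atomic models of test_bit()/set_bit()/clear_bit(). */
static bool model_test_bit(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

static void model_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

/* Same condition as the patched test: busy if RUNNING or SCHED is set. */
static bool model_qdisc_is_busy(const struct model_qdisc *q)
{
	return model_test_bit(STATE_RUNNING, &q->state) ||
	       model_test_bit(STATE_SCHED, &q->state);
}

/*
 * Step through one modelled pass of net_tx_action() for a scheduled
 * qdisc and record what an observer calling the busy test would see
 * after each step.
 */
static void model_tx_action_trace(struct model_qdisc *q, bool seen[3])
{
	/* Precondition: the qdisc was scheduled, so SCHED is set. */
	assert(model_test_bit(STATE_SCHED, &q->state));

	seen[0] = model_qdisc_is_busy(q);	/* before clear_bit() */

	model_clear_bit(STATE_SCHED, &q->state);
	seen[1] = model_qdisc_is_busy(q);	/* the window: no bit set */

	model_set_bit(STATE_RUNNING, &q->state);
	seen[2] = model_qdisc_is_busy(q);	/* modelled qdisc_run() started */
}
```

In this model an observer sampling at the middle step sees "not busy"
although the qdisc is about to run, which is why both the unlocked check
and the locked double-check would each have to land in that window for
dev_deactivate() to be fooled.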

Otherwise, this patch looks OK to me.

Jarek P.


> 
> Signed-off-by: David S. Miller <davem@...emloft.net>
> 
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 7cf83b3..4685746 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -647,7 +647,7 @@ static void dev_deactivate_queue(struct net_device *dev,
>  	}
>  }
>  
> -static bool some_qdisc_is_running(struct net_device *dev, int lock)
> +static bool some_qdisc_is_busy(struct net_device *dev, int lock)
>  {
>  	unsigned int i;
>  
> @@ -658,13 +658,14 @@ static bool some_qdisc_is_running(struct net_device *dev, int lock)
>  		int val;
>  
>  		dev_queue = netdev_get_tx_queue(dev, i);
> -		q = dev_queue->qdisc;
> +		q = dev_queue->qdisc_sleeping;
>  		root_lock = qdisc_lock(q);
>  
>  		if (lock)
>  			spin_lock_bh(root_lock);
>  
> -		val = test_bit(__QDISC_STATE_RUNNING, &q->state);
> +		val = (test_bit(__QDISC_STATE_RUNNING, &q->state) ||
> +		       test_bit(__QDISC_STATE_SCHED, &q->state));
>  
>  		if (lock)
>  			spin_unlock_bh(root_lock);
> @@ -689,14 +690,14 @@ void dev_deactivate(struct net_device *dev)
>  
>  	/* Wait for outstanding qdisc_run calls. */
>  	do {
> -		while (some_qdisc_is_running(dev, 0))
> +		while (some_qdisc_is_busy(dev, 0))
>  			yield();
>  
>  		/*
>  		 * Double-check inside queue lock to ensure that all effects
>  		 * of the queue run are visible when we return.
>  		 */
> -		running = some_qdisc_is_running(dev, 1);
> +		running = some_qdisc_is_busy(dev, 1);
>  
>  		/*
>  		 * The running flag should never be set at this point because