lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 01 Mar 2016 15:56:52 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Tom Herbert <tom@...bertland.com>
Cc:	netdev@...r.kernel.org, brakmo@...com, kernel-team@...com
Subject: Re: [PATCH RFC] net: Fix race condition when removing qdisc

On mar., 2016-03-01 at 15:16 -0800, Tom Herbert wrote:
> We are seeing a number of softlockups occurring with HTB upon removing
> the qdisc. We are still attempting to repro the exact circumstances,
> however looking at the code I'm very suspicious of this block in
> net_tx_action and its interaction with dev_deactivate (called through
> tc_modify_qdisc):
> 
>        if (!test_bit(__QDISC_STATE_DEACTIVATED,
> 		     &q->state)) {
> 	       __netif_reschedule(q);
>        } else {
> 	       smp_mb__before_atomic();
> 	       clear_bit(__QDISC_STATE_SCHED,
> 			 &q->state);
>        }
> 
> I think the following scenario could lead to badness:
> 
> 0) net_tx_action spin_trylock fails, taking non-locked block
> 1) net_tx_action checks for __QDISC_STATE_DEACTIVATED, it's not set
>    at this point
> 2) dev_deactive has lock and sets __QDISC_STATE_DEACTIVATED
> 3) dev_deactivate_many performs some_qdisc_is_busy(dev), neither
>    __QDISC_STATE_SCHED nor __QDISC_STATE_BUSY are set at this
>   point, so some_qdisc_busy fails (not seen as busy)

But __QDISC_STATE_SCHED should be set.

It is cleared only if __QDISC_STATE_DEACTIVATED has been caught in
net_tx_action()

> 4) net_tx_action sets __QDISC_STATE_SCHED

When qdisc is put again in output_queue, __QDISC_STATE_SCHED is left as
-is (set)


> 
> At this point dev_deactivate_many finishes so the qdisc may
> be freed in the tc_modify_qdisc path, however the qdisc is
> also "successfully" rescheduled to run by net_tx_action.
> 
> The propsed fix for this is to eliminate the spin_trylock in
> net_tx_action and always take the lock.


I know why you want to get rid of this trylock(), it is only the
changelog I find confusing. I do not see how it is going to help your
softlockups.




Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ