Message-ID: <1300949637.2810.75.camel@edumazet-laptop>
Date: Thu, 24 Mar 2011 07:53:57 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Stephen Hemminger <shemminger@...tta.com>,
David Miller <davem@...emloft.net>
Cc: Fabio Checconi <fchecconi@...il.com>,
netdev <netdev@...r.kernel.org>
Subject: [PATCH] net_sched: fix THROTTLED/RUNNING race
On Wednesday, 23 March 2011 at 07:45 +0100, Eric Dumazet wrote:
> While polishing QFQ scheduler, and tracking a bug in it, I finally
> replaced in my tc scripts "QFQ experimental" by "SFQ rock solid" and
> found I could have a hang in some situations :(
>
> I made a bisection and found :
>
> # git bisect good
> 7a6362800cb7d1d618a697a650c7aaed3eb39320 is the first bad commit
>
> It seems I am stuck...
>
> # git bisect log
> git bisect start
> # bad: [c360d5b53a7fec44ae4402e1f13fc888f57ddc3b] Merge branch 'master'
> of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
> git bisect bad c360d5b53a7fec44ae4402e1f13fc888f57ddc3b
> # good: [07a2039b8eb0af4ff464efd3dfd95de5c02648c6] Linux 2.6.30
> git bisect good 07a2039b8eb0af4ff464efd3dfd95de5c02648c6
> # good: [2ec8c6bb5d8f3a62a79f463525054bae1e3d4487] Merge branch 'master'
> of /home/davem/src/GIT/linux-2.6/
> git bisect good 2ec8c6bb5d8f3a62a79f463525054bae1e3d4487
> # good: [e0e170bd7ded2ec16e2813d63c0faff43193fde8] Merge branch 'next'
> of git://git.monstr.eu/linux-2.6-microblaze
> git bisect good e0e170bd7ded2ec16e2813d63c0faff43193fde8
> # good: [40c73abbb37e399eba274fe49e520ffa3dd65bdb] Merge branch
> 'for_linus' of
> git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
> git bisect good 40c73abbb37e399eba274fe49e520ffa3dd65bdb
> # good: [4b66fef9b591b95f447aea12242a1133deb0bd22] mcast: net_device dev
> not used
> git bisect good 4b66fef9b591b95f447aea12242a1133deb0bd22
> # bad: [7a6362800cb7d1d618a697a650c7aaed3eb39320] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
> git bisect bad 7a6362800cb7d1d618a697a650c7aaed3eb39320
> # good: [971f115a50afbe409825c9f3399d5a3b9aca4381] Merge branch
> 'usb-next' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
> git bisect good 971f115a50afbe409825c9f3399d5a3b9aca4381
> # good: [5917def58ab9f5848f2d1da835a33a490d0c8c69] staging/easycap:
> reduce code nesting in easycap_sound.c
> git bisect good 5917def58ab9f5848f2d1da835a33a490d0c8c69
> # good: [6445ced8670f37cfc2c5e24a9de9b413dbfc788d] Merge branch
> 'staging-next' of
> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
> git bisect good 6445ced8670f37cfc2c5e24a9de9b413dbfc788d
> # good: [da91981bee8de20bcd06ee0dbddd53d62d23b1bd] ipv4: Use flowi4 in
> ipmr code.
> git bisect good da91981bee8de20bcd06ee0dbddd53d62d23b1bd
> # good: [106af2c99a5249b809aaed45b8353ac087821f4a] Merge branch 'master'
> of
> git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
> into for-davem
> git bisect good 106af2c99a5249b809aaed45b8353ac087821f4a
> # good: [638be344593b66ccca6802c6076a5b3d9200829d] Phonet: fix
> aligned-mode pipe socket buffer header reserve
> git bisect good 638be344593b66ccca6802c6076a5b3d9200829d
> # good: [c337ffb68e1e71bad069b14d2246fa1e0c31699c] Merge branch 'master'
> of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> git bisect good c337ffb68e1e71bad069b14d2246fa1e0c31699c
> # good: [ee0caa79569a9c44febc18480beef4847aa8cecd] Merge branch 'master'
> of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
> git bisect good ee0caa79569a9c44febc18480beef4847aa8cecd
> # good: [0bd80dad57d82676ee484fb1f9aa4c5e8b5bc469] net: get rid of
> multiple bond-related netdevice->priv_flags
> git bisect good 0bd80dad57d82676ee484fb1f9aa4c5e8b5bc469
> # good: [8a4eb5734e8d1dc60a8c28576bbbdfdcc643626d] net: introduce
> rx_handler results and logic around that
> git bisect good 8a4eb5734e8d1dc60a8c28576bbbdfdcc643626d
> # good: [ceda86a108671294052cbf51660097b6534672f5] bonding: enable
> netpoll without checking link status
> git bisect good ceda86a108671294052cbf51660097b6534672f5
>
> Any idea how we can find the problem ?
>
> Script to reproduce the problem :
>
>
> modprobe dummy
>
> ifconfig dummy0 10.2.2.254 netmask 255.255.255.0 up
>
> for i in `seq 1 240`
> do
> arp -H ether -i dummy0 -s 10.2.2.$i 00:00:0c:07:ac:$(printf %02x $i)
> done
>
>
> DEV=dummy0
> RATE="rate 40Mbit"
> TNETS="10.2.2.0/25"
> ALLOT="allot 20000"
>
>
> tc qdisc del dev dummy0 root 2>/dev/null
>
>
> tc qdisc add dev $DEV root handle 1: est 1sec 8sec cbq avpkt 1000 rate 100Mbit \
> bandwidth 100Mbit
> tc class add dev $DEV parent 1: classid 1:1 \
> est 1sec 8sec cbq allot 10000 mpu 64 \
> rate 100Mbit prio 1 avpkt 1500 bounded
>
> # output to test nets : 40 Mbit limit
> tc class add dev $DEV parent 1:1 classid 1:11 \
> est 1sec 8sec cbq $ALLOT mpu 64 \
> $RATE prio 2 avpkt 1400 bounded
>
> tc qdisc add dev $DEV parent 1:11 handle 11: \
> est 1sec 8sec sfq
>
>
> for privnet in $TNETS
> do
> tc filter add dev $DEV parent 1: protocol ip prio 100 u32 \
> match ip dst $privnet flowid 1:11
> done
>
> tc filter add dev $DEV parent 1: protocol ip prio 100 u32 \
> match ip protocol 0 0x00 flowid 1:1
>
>
> iperf -u -c 10.2.2.1 -P 32 -l 50
> iperf -u -c 10.2.2.1 -P 32 -l 50
> iperf -u -c 10.2.2.1 -P 32 -l 50
> tc -s -d qdisc show dev dummy0
>
Okay... this bug was hard to find.
David, would you be OK if we send the QFQ patch for 2.6.39?
I know it's a bit late, but it's a new qdisc, very well tested now; just
tell us :)
Thanks
[PATCH] net_sched: fix THROTTLED/RUNNING race
commit fd245a4adb52 (net_sched: move TCQ_F_THROTTLED flag)
added a race.
qdisc_watchdog() runs from softirq, so special care must be taken, or
we can lose one state transition (THROTTLED/RUNNING).
Prior to fd245a4adb52, we were manipulating q->flags (qdisc->flags &=
~TCQ_F_THROTTLED;) and this manipulation could only race with
qdisc_warn_nonwc().
Since we want to avoid atomic ops in the qdisc fast path - that was the
point of commit 371121057607e (QDISC_STATE_RUNNING dont need atomic
bit ops) - the fix is to move the THROTTLED bit into the 'state' field,
which is manipulated with SMP- and IRQ-safe operations.
Signed-off-by: Eric Dumazet <eric.dumazet@...il.com>
Cc: Stephen Hemminger <shemminger@...tta.com>
Cc: Fabio Checconi <fchecconi@...il.com>
---
include/net/sch_generic.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index a9505b6..b931f02 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -25,6 +25,7 @@ struct qdisc_rate_table {
enum qdisc_state_t {
__QDISC_STATE_SCHED,
__QDISC_STATE_DEACTIVATED,
+ __QDISC_STATE_THROTTLED,
};
/*
@@ -32,7 +33,6 @@ enum qdisc_state_t {
*/
enum qdisc___state_t {
__QDISC___STATE_RUNNING = 1,
- __QDISC___STATE_THROTTLED = 2,
};
struct qdisc_size_table {
@@ -106,17 +106,17 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
static inline bool qdisc_is_throttled(const struct Qdisc *qdisc)
{
- return (qdisc->__state & __QDISC___STATE_THROTTLED) ? true : false;
+ return test_bit(__QDISC_STATE_THROTTLED, &qdisc->state) ? true : false;
}
static inline void qdisc_throttled(struct Qdisc *qdisc)
{
- qdisc->__state |= __QDISC___STATE_THROTTLED;
+ set_bit(__QDISC_STATE_THROTTLED, &qdisc->state);
}
static inline void qdisc_unthrottled(struct Qdisc *qdisc)
{
- qdisc->__state &= ~__QDISC___STATE_THROTTLED;
+ clear_bit(__QDISC_STATE_THROTTLED, &qdisc->state);
}
struct Qdisc_class_ops {