[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070509071614.GA2138@gondor.apana.org.au>
Date: Wed, 9 May 2007 17:16:15 +1000
From: Herbert Xu <herbert@...dor.apana.org.au>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: linkwatch bustage in git-net
On Tue, May 08, 2007 at 11:11:43PM -0700, Andrew Morton wrote:
>
> > [NET] link_watch: Eliminate potential delay on wrap-around
>
> hm, that fixed it. Do we know why? ;)
Yep :)
> btw, looking at the code:
>
> clear_bit(LW_RUNNING, &linkwatch_flags);
>
> spin_lock_irq(&lweventlist_lock);
> next = lweventlist;
> lweventlist = NULL;
> spin_unlock_irq(&lweventlist_lock);
>
> while (next) {
> struct net_device *dev = next;
>
> next = dev->link_watch_next;
>
> lweventlist_lock protects lweventlist and every netdev's ->link_watch_next.
> But this code is walking that singly-linked list outside the lock and
> after clearing LW_RUNNING. What stops this singly-linked list from getting
> altered by another thread of control while this code is traversing it?
It's protected by the __LINK_STATE_LINKWATCH_PENDING bit.
The real problem here is a combination of factors. First of all e100
does a netif_carrier_off in its open routine *before* the first call
to dev_activate. This when combined with the wrap-around bug causes
an infinitely delayed event for your NIC to be installed.
When the carrier comes up it notices that we already have an event
queued for that NIC so it just drops it even though it's urgent.
The previous bug fix limits this delay to one second but it is still
sub-optimal.
So here is the additional fix.
[NET] link_watch: Always schedule urgent events
Urgent events may be delayed if we already have a non-urgent event
queued for that device. This patch changes this by making sure that
an urgent event is always looked at immediately.
I've replaced the LW_RUNNING flag by LW_URGENT since whether work
is scheduled is already kept track by the work queue system.
The only complication is that we have to provide some exclusion for
the setting linkwatch_nextevent which is available in the actual
work function.
Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
diff --git a/net/core/link_watch.c b/net/core/link_watch.c
index 4674ae5..a5e372b 100644
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -26,7 +26,7 @@
enum lw_bits {
- LW_RUNNING = 0,
+ LW_URGENT = 0,
};
static unsigned long linkwatch_flags;
@@ -95,18 +95,41 @@ static void linkwatch_add_event(struct net_device *dev)
}
-static void linkwatch_schedule_work(unsigned long delay)
+static void linkwatch_schedule_work(int urgent)
{
- if (test_and_set_bit(LW_RUNNING, &linkwatch_flags))
+ unsigned long delay = linkwatch_nextevent - jiffies;
+
+ if (test_bit(LW_URGENT, &linkwatch_flags))
return;
- /* If we wrap around we'll delay it by at most HZ. */
- if (delay > HZ) {
- linkwatch_nextevent = jiffies;
+ /* Minimise down-time: drop delay for up event. */
+ if (urgent) {
+ if (test_and_set_bit(LW_URGENT, &linkwatch_flags))
+ return;
delay = 0;
}
- schedule_delayed_work(&linkwatch_work, delay);
+ /* If we wrap around we'll delay it by at most HZ. */
+ if (delay > HZ)
+ delay = 0;
+
+ /*
+ * This is true if we've scheduled it immeditately or if we don't
+ * need an immediate execution and it's already pending.
+ */
+ if (schedule_delayed_work(&linkwatch_work, delay) == !delay)
+ return;
+
+ /* Don't bother if there is nothing urgent. */
+ if (!test_bit(LW_URGENT, &linkwatch_flags))
+ return;
+
+ /* It's already running which is good enough. */
+ if (!cancel_delayed_work(&linkwatch_work))
+ return;
+
+ /* Otherwise we reschedule it again for immediate exection. */
+ schedule_delayed_work(&linkwatch_work, 0);
}
@@ -123,7 +146,11 @@ static void __linkwatch_run_queue(int urgent_only)
*/
if (!urgent_only)
linkwatch_nextevent = jiffies + HZ;
- clear_bit(LW_RUNNING, &linkwatch_flags);
+ /* Limit wrap-around effect on delay. */
+ else if (time_after(linkwatch_nextevent, jiffies + HZ))
+ linkwatch_nextevent = jiffies;
+
+ clear_bit(LW_URGENT, &linkwatch_flags);
spin_lock_irq(&lweventlist_lock);
next = lweventlist;
@@ -166,7 +193,7 @@ static void __linkwatch_run_queue(int urgent_only)
}
if (lweventlist)
- linkwatch_schedule_work(linkwatch_nextevent - jiffies);
+ linkwatch_schedule_work(0);
}
@@ -187,21 +214,16 @@ static void linkwatch_event(struct work_struct *dummy)
void linkwatch_fire_event(struct net_device *dev)
{
- if (!test_and_set_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state)) {
- unsigned long delay;
+ int urgent = linkwatch_urgent_event(dev);
+ if (!test_and_set_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state)) {
dev_hold(dev);
linkwatch_add_event(dev);
+ } else if (!urgent)
+ return;
- delay = linkwatch_nextevent - jiffies;
-
- /* Minimise down-time: drop delay for up event. */
- if (linkwatch_urgent_event(dev))
- delay = 0;
-
- linkwatch_schedule_work(delay);
- }
+ linkwatch_schedule_work(urgent);
}
EXPORT_SYMBOL(linkwatch_fire_event);
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists