Message-ID: <DS7PR84MB303940368E1CC7CE98A49E96D70F2@DS7PR84MB3039.NAMPRD84.PROD.OUTLOOK.COM>
Date: Wed, 17 Apr 2024 17:37:40 +0000
From: "Tom, Deepak Abraham" <deepak-abraham.tom@....com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: 2nd RTM_NEWLINK notification with operstate down is always 1 second delayed
Hi!
I have a system configured with 2 physical eth interfaces connected to a switch.
When I reboot the switch, I see that the userspace RTM_NEWLINK notifications for the interfaces are always 1 second apart although both links actually go down almost simultaneously!
The subsequent RTM_NEWLINK notifications when the switch comes back up, however, arrive only a few microseconds apart, which is as expected.
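For anyone who wants to reproduce the observation, here is a minimal userspace sketch, assuming a Linux system with the usual uapi headers. The function names open_link_socket() and newlink_operstate() are made up for illustration. The operstate values follow <linux/if.h>, where IF_OPER_DOWN is 2 and IF_OPER_UP is 6.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a netlink socket subscribed to link notifications; returns fd or -1. */
int open_link_socket(void)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_groups = RTMGRP_LINK;	/* multicast group carrying RTM_NEWLINK */
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Return the IFLA_OPERSTATE value carried by an RTM_NEWLINK message,
 * or -1 if the message is of another type or has no such attribute.
 */
int newlink_operstate(const struct nlmsghdr *nh)
{
	struct ifinfomsg *ifi;
	struct rtattr *rta;
	int len;

	assert(nh != NULL);
	if (nh->nlmsg_type != RTM_NEWLINK)
		return -1;
	ifi = NLMSG_DATA(nh);
	rta = IFLA_RTA(ifi);
	len = IFLA_PAYLOAD(nh);
	for (; RTA_OK(rta, len); rta = RTA_NEXT(rta, len))
		if (rta->rta_type == IFLA_OPERSTATE)
			return *(unsigned char *)RTA_DATA(rta);
	return -1;
}
```

A caller would recv() on the returned descriptor, walk the buffer with NLMSG_OK()/NLMSG_NEXT(), and log a clock_gettime() timestamp next to each operstate value to see the spacing between the two down notifications.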
It turns out this delay is intentionally introduced by the Linux kernel networking code in net/core/link_watch.c, last modified 17 years ago in commit 294cc44:
	/*
	 * Limit the number of linkwatch events to one
	 * per second so that a runaway driver does not
	 * cause a storm of messages on the netlink
	 * socket. This limit does not apply to up events
	 * while the device qdisc is down.
	 */
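To illustrate why the second down event ends up about one second behind the first, here is a simplified userspace model of the rate limit. It is not kernel code. The struct and function names are hypothetical, times are in jiffies, and `interval` stands in for HZ in link_watch.c.

```c
#include <assert.h>

/* Simplified model of the linkwatch rate limit (illustrative only). */
struct lw_model {
	unsigned long nextevent;	/* earliest time of the next non-urgent run */
	unsigned long interval;		/* minimum spacing between non-urgent runs */
};

/* Delay, in jiffies, before a non-urgent event queued at `now` is processed. */
unsigned long lw_delay(const struct lw_model *m, unsigned long now)
{
	unsigned long delay = m->nextevent - now;	/* wraps if nextevent has passed */

	assert(m->interval > 0);
	if (delay > m->interval)	/* wrapped around: run immediately */
		delay = 0;
	return delay;
}

/* Queue an event at `now`, run it after its delay, and return the run time. */
unsigned long lw_queue_and_run(struct lw_model *m, unsigned long now)
{
	unsigned long run_at = now + lw_delay(m, now);

	m->nextevent = run_at + m->interval;	/* mirrors "jiffies + HZ" after a run */
	return run_at;
}
```

With interval set to 1000 (HZ=1000), an event queued at jiffy 5000 runs at once, and a second event queued at jiffy 5001 runs at 6000, one second later. With interval set to 5, the second event runs at 5005.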
On modern high-performance systems, limiting the number of down events to just one per second has far-reaching consequences.
I was wondering if it would be advisable to reduce this delay to something smaller, say 5ms (so 5ms+scheduling delay practically):
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -130,8 +130,8 @@ static void linkwatch_schedule_work(int urgent)
 		delay = 0;
 	}

-	/* If we wrap around we'll delay it by at most HZ. */
-	if (delay > HZ)
+	/* If we wrap around we'll delay it by at most HZ/200. */
+	if (delay > (HZ/200))
 		delay = 0;

 	/*
@@ -187,15 +187,15 @@ static void __linkwatch_run_queue(int urgent_only)

 	/*
 	 * Limit the number of linkwatch events to one
-	 * per second so that a runaway driver does not
+	 * per 5 milliseconds so that a runaway driver does not
 	 * cause a storm of messages on the netlink
 	 * socket. This limit does not apply to up events
 	 * while the device qdisc is down.
 	 */
 	if (!urgent_only)
-		linkwatch_nextevent = jiffies + HZ;
+		linkwatch_nextevent = jiffies + (HZ/200);
 	/* Limit wrap-around effect on delay. */
-	else if (time_after(linkwatch_nextevent, jiffies + HZ))
+	else if (time_after(linkwatch_nextevent, jiffies + (HZ/200)))
 		linkwatch_nextevent = jiffies;

 	clear_bit(LW_URGENT, &linkwatch_flags);
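One point worth checking is what HZ/200 evaluates to on different builds, since HZ is a build-time constant (commonly 100, 250, 300 or 1000) and the division is an integer one. The sketch below is plain userspace arithmetic with hypothetical helper names. In-kernel code would normally use msecs_to_jiffies(), for which ms_to_jiffies_ceil() is only a rough round-up stand-in.

```c
#include <assert.h>

/* Value of the integer expression HZ/200 for a given HZ. */
unsigned int hz_div_200(unsigned int hz)
{
	assert(hz > 0);
	return hz / 200;	/* truncates toward zero, e.g. 100/200 == 0 */
}

/* Round-up conversion of milliseconds to jiffies (illustrative stand-in). */
unsigned int ms_to_jiffies_ceil(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);
	return (ms * hz + 999) / 1000;
}
```

Under these assumptions, HZ/200 is 5 jiffies at HZ=1000, 1 jiffy (4 ms) at HZ=250, and 0 at HZ=100, which would effectively remove the limit on such builds. A round-up conversion of 5 ms gives 5, 2 and 1 jiffies respectively, so the interval never collapses to zero.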
I have tested this change in my environment, and it works as expected. I don't see any new issues popping up because of this.
Are there any concerns with making this change today? Hoping to get some feedback.
Thank You,
Deepak Abraham Tom