Message-ID: <DS7PR84MB3039BEC88FB54C62BD107CF6D70E2@DS7PR84MB3039.NAMPRD84.PROD.OUTLOOK.COM>
Date: Thu, 18 Apr 2024 19:26:51 +0000
From: "Tom, Deepak Abraham" <deepak-abraham.tom@....com>
To: Stephen Hemminger <stephen@...workplumber.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: 2nd RTM_NEWLINK notification with operstate down is always 1 second delayed
Maybe I'm missing something, but could you please explain how this really helps to keep FRR from getting busy?
If I understand this right, the link watch code does not ignore events but merely delays them. So any link transition will still be propagated, whether it is scheduled urgently or not.
FRR will therefore still have to deal with every transition, keeping it busy with or without this change, unless FRR dampens flaps on its own?
Also, from a design perspective, wouldn't it be better if FRR's issues with route flaps were dealt with directly in FRR itself? That way, in use cases where FRR does not come into play, the delay would not cause other consequences. Are there other situations where such a delay is absolutely required?
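(For illustration, dampening in the daemon could be as simple as a per-interface hold-down timer. Below is a rough, hypothetical sketch in plain C, not actual FRR code; HOLD_DOWN_MS, link_event and link_settle are invented names. The idea: record every kernel event, but only act once the state has been stable for the hold-down window, so a down/up flap that reverts quickly never triggers a route recomputation.)

/* Hypothetical per-interface link-flap dampening sketch (not FRR code). */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

#define HOLD_DOWN_MS 200  /* illustrative hold-down window, not a tuned value */

struct link_state {
	bool reported_up;           /* state last acted upon */
	bool pending_up;            /* most recent state seen from the kernel */
	struct timespec pending_at; /* when the pending state was observed */
};

static long elapsed_ms(const struct timespec *since)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - since->tv_sec) * 1000 +
	       (now.tv_nsec - since->tv_nsec) / 1000000;
}

/* Called for every RTM_NEWLINK-style event; records it but does not act. */
static void link_event(struct link_state *ls, bool up)
{
	ls->pending_up = up;
	clock_gettime(CLOCK_MONOTONIC, &ls->pending_at);
}

/* Called periodically; acts only once the state has settled. */
static void link_settle(struct link_state *ls)
{
	if (ls->pending_up != ls->reported_up &&
	    elapsed_ms(&ls->pending_at) >= HOLD_DOWN_MS) {
		ls->reported_up = ls->pending_up;
		printf("acting on settled state: %s\n",
		       ls->reported_up ? "up" : "down");
		/* ...recompute routes here... */
	}
}

int main(void)
{
	struct link_state ls = { .reported_up = true };
	struct timespec ts = { 0, HOLD_DOWN_MS * 1000000L };

	link_event(&ls, false); /* link goes down...            */
	link_event(&ls, true);  /* ...and right back up: a flap */
	link_settle(&ls);       /* flap suppressed, nothing reported */

	link_event(&ls, false); /* a real, lasting down event */
	nanosleep(&ts, NULL);
	link_settle(&ls);       /* now reports "down" once */
	return 0;
}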
Thank You,
Deepak Abraham Tom
-----Original Message-----
From: Stephen Hemminger <stephen@...workplumber.org>
Sent: Wednesday, April 17, 2024 4:34 PM
To: Tom, Deepak Abraham <deepak-abraham.tom@....com>
Cc: netdev@...r.kernel.org
Subject: Re: 2nd RTM_NEWLINK notification with operstate down is always 1 second delayed
On Wed, 17 Apr 2024 17:37:40 +0000
"Tom, Deepak Abraham" <deepak-abraham.tom@....com> wrote:
> Hi!
>
> I have a system configured with two physical Ethernet interfaces connected to a switch.
> When I reboot the switch, I see that the userspace RTM_NEWLINK notifications for the two interfaces are always 1 second apart, although both links actually go down almost simultaneously!
> The subsequent RTM_NEWLINK notifications when the switch comes back up are, however, only a few microseconds apart, which is as expected.
>
> Turns out this delay is intentionally introduced by the Linux kernel networking code in net/core/link_watch.c, last modified 17 years ago in commit 294cc44:
> /*
> * Limit the number of linkwatch events to one
> * per second so that a runaway driver does not
> * cause a storm of messages on the netlink
> * socket. This limit does not apply to up events
> * while the device qdisc is down.
> */
>
>
> On modern high-performance systems, limiting the number of down events to just one per second has far-reaching consequences.
> I was wondering if it would be advisable to reduce this delay to something smaller, say 5ms (so 5ms+scheduling delay practically):
The reason is that for systems that are connected to the Internet and run routing daemons, the impact of a link state change is huge. A single link transition may keep FRR (née Quagga) busy for several seconds as it linearly evaluates 3 million route entries. Maybe more recent versions of FRR have gotten smarter. The delay also keeps the routing daemon from propagating lots of changes, a.k.a. route flap.
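For context, the mechanism behind that 1-second spacing is a single timestamp, linkwatch_nextevent. Roughly, the scheduling logic looks like this (condensed from net/core/link_watch.c; a simplified paraphrase, not a verbatim or complete copy of the current source):

/* Simplified paraphrase of the rate limiting in net/core/link_watch.c. */
static unsigned long linkwatch_nextevent; /* earliest time for next batch */

static void linkwatch_schedule_work(int urgent)
{
	unsigned long delay = linkwatch_nextevent - jiffies;

	/* Urgent transitions (e.g. up events) are not rate-limited. */
	if (urgent)
		delay = 0;

	/* If linkwatch_nextevent is already in the past, the unsigned
	 * subtraction wraps around; run immediately in that case. */
	if (delay > HZ)
		delay = 0;

	schedule_delayed_work(&linkwatch_work, delay);
}

static void __linkwatch_run_queue(int urgent_only)
{
	/* After flushing a batch of non-urgent events, push the next
	 * allowed batch out by one second (HZ jiffies). */
	if (!urgent_only)
		linkwatch_nextevent = jiffies + HZ;

	/* ... deliver the queued RTM_NEWLINK notifications ... */
}

So down events are batched against that one timestamp, while urgent (up) events bypass the delay entirely, which matches the spacing you observed.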