Message-ID: <DS7PR84MB303940368E1CC7CE98A49E96D70F2@DS7PR84MB3039.NAMPRD84.PROD.OUTLOOK.COM>
Date: Wed, 17 Apr 2024 17:37:40 +0000
From: "Tom, Deepak Abraham" <deepak-abraham.tom@....com>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: 2nd RTM_NEWLINK notification with operstate down is always 1 second delayed

Hi!

I have a system configured with 2 physical eth interfaces connected to a switch.
When I reboot the switch, both links go down almost simultaneously, yet the userspace RTM_NEWLINK notifications for the two interfaces always arrive 1 second apart.
The subsequent RTM_NEWLINK notifications when the switch comes back up, however, arrive only a few microseconds apart, which is as expected.

It turns out this delay is intentionally introduced by the Linux kernel networking code in net/core/link_watch.c, last modified 17 years ago in commit 294cc44:
         /*
          * Limit the number of linkwatch events to one
          * per second so that a runaway driver does not
          * cause a storm of messages on the netlink
          * socket.  This limit does not apply to up events
          * while the device qdisc is down.
          */


On modern high-performance systems, limiting the number of down events to just one per second has far-reaching consequences.
I was wondering if it would be advisable to reduce this delay to something smaller, say 5ms (so 5ms+scheduling delay practically):
--- a/net/core/link_watch.c
+++ b/net/core/link_watch.c
@@ -130,8 +130,8 @@ static void linkwatch_schedule_work(int urgent)
                delay = 0;
        }

-       /* If we wrap around we'll delay it by at most HZ. */
-       if (delay > HZ)
+       /* If we wrap around we'll delay it by at most HZ/200. */
+       if (delay > (HZ/200))
                delay = 0;

        /*
@@ -187,15 +187,15 @@ static void __linkwatch_run_queue(int urgent_only)

        /*
         * Limit the number of linkwatch events to one
-        * per second so that a runaway driver does not
+        * per 5 milliseconds so that a runaway driver does not
         * cause a storm of messages on the netlink
         * socket.  This limit does not apply to up events
         * while the device qdisc is down.
         */
        if (!urgent_only)
-               linkwatch_nextevent = jiffies + HZ;
+               linkwatch_nextevent = jiffies + (HZ/200);
        /* Limit wrap-around effect on delay. */
-       else if (time_after(linkwatch_nextevent, jiffies + HZ))
+       else if (time_after(linkwatch_nextevent, jiffies + (HZ/200)))
                linkwatch_nextevent = jiffies;

        clear_bit(LW_URGENT, &linkwatch_flags);


I have tested this change in my environment, and it works as expected. I don't see any new issues popping up because of this.

Are there any concerns with making this change today? Hoping to get some feedback.


Thank You,
Deepak Abraham Tom
