lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 6 Aug 2020 17:09:00 +0800 From: Wen Yang <wenyang@...ux.alibaba.com> To: David Miller <davem@...emloft.net> Cc: kuba@...nel.org, xlpang@...ux.alibaba.com, caspar@...ux.alibaba.com, andrew@...n.ch, edumazet@...gle.com, jiri@...lanox.com, leon@...nel.org, jwi@...ux.ibm.com, netdev@...r.kernel.org, linux-kernel@...r.kernel.org Subject: Re: [PATCH] net: core: explicitly call linkwatch_fire_event to speed up the startup of network services 在 2020/8/5 上午6:58, David Miller 写道: > From: Wen Yang <wenyang@...ux.alibaba.com> > Date: Sat, 1 Aug 2020 16:58:45 +0800 > >> diff --git a/net/core/link_watch.c b/net/core/link_watch.c >> index 75431ca..6b9d44b 100644 >> --- a/net/core/link_watch.c >> +++ b/net/core/link_watch.c >> @@ -98,6 +98,9 @@ static bool linkwatch_urgent_event(struct net_device *dev) >> if (netif_is_lag_port(dev) || netif_is_lag_master(dev)) >> return true; >> >> + if ((dev->flags & IFF_UP) && dev->operstate == IF_OPER_DOWN) >> + return true; >> + >> return netif_carrier_ok(dev) && qdisc_tx_changing(dev); >> } >> > > You're bypassing explicitly the logic here: > > /* > * Limit the number of linkwatch events to one > * per second so that a runaway driver does not > * cause a storm of messages on the netlink > * socket. This limit does not apply to up events > * while the device qdisc is down. > */ > if (!urgent_only) > linkwatch_nextevent = jiffies + HZ; > /* Limit wrap-around effect on delay. */ > else if (time_after(linkwatch_nextevent, jiffies + HZ)) > linkwatch_nextevent = jiffies; > > Something about this isn't right. We need to analyze what you are seeing, > what device you are using, and what systemd is doing to figure out what > the right place for the fix. > > Thank you. > Thank you very much for your comments. We are using virtio_net and the environment is a microvm similar to firecracker. Let's briefly explain. net_device->operstate is assigned through linkwatch_event, and the call stack is as follows: process_one_work -> linkwatch_event -> __linkwatch_run_queue -> linkwatch_do_dev -> rfc2863_policy -> default_operstate During the machine startup process, net_device->operstate has the following two-step state changes: STEP A: virtnet_probe detects the network card and triggers the execution of linkwatch_fire_event. Since linkwatch_nextevent is initialized to 0, linkwatch_work will run. And since net_device->state is 6 (__LINK_STATE_PRESENT | __LINK_STATE_NOCARRIER), net_device->operstate will be changed from IF_OPER_UNKNOWN to IF_OPER_DOWN: eth0 operstate:0 (IF_OPER_UNKNOWN) -> operstate:2 (IF_OPER_DOWN) virtnet_probe then executes netif_carrier_on to update net_device->state, it will be changed from ‘__LINK_STATE_PRESENT | __LINK_STATE_NOCARRIER’ to __LINK_STATE_PRESENT: eth0 state: 6 (__LINK_STATE_PRESENT | __LINK_STATE_NOCARRIER) -> 2 (__LINK_STATE_PRESENT) STEP B: One second later (because linkwatch_nextevent = jiffies + HZ), linkwatch_work is executed again. At this time, since net_device->state is __LINK_STATE_PRESENT, so the net_device->operstate will be changed from IF_OPER_DOWN to IF_OPER_UP: eth0 operstate:2 (IF_OPER_DOWN) -> operstate:6 (IF_OPER_UP) The above state change can be completed within 2 seconds. Generally, the machine will load the initramfs first, and do some initialization in the initramfs, which takes some time; then switch_root to the system disk and continue the initialization, which will also take some time, and finally start the systemd-networkd service, bringing link, etc., In this way, the linkwatch_work work queue has enough time to run twice, and the state of net_device->operstate is already IF_OPER_UP, So bringing link up quickly returns the following information: Aug 06 16:35:55.966121 iZuf6h1kfgutxc3el68z2lZ systemd-networkd[580]: eth0: bringing link up ... Aug 06 16:35:55.990461 iZuf6h1kfgutxc3el68z2lZ systemd-networkd[580]: eth0: flags change: +UP +LOWER_UP +RUNNING But we are now using MicroVM, which requires extreme speed to start, bypassing the initramfs and directly booting the trimmed system on the disk. systemd-networkd starts in less than 1 second after booting. the STEP B has not been run yet, so it will wait for several hundred milliseconds here, as follows: Jul 20 22:00:47.432552 systemd-networkd[210]: eth0: bringing link up ... Jul 20 22:00:47.446108 systemd-networkd[210]: eth0: flags change: +UP +LOWER_UP ... Jul 20 22:00:47.781463 systemd-networkd[210]: eth0: flags change: +RUNNING Note: dhcp pays attention to IFF_RUNNING status, we may refer to: https://www.kernel.org/doc/Documentation/networking/operstates.txt A routing daemon or dhcp client just needs to care for IFF_RUNNING or waiting for operstate to go IF_OPER_UP/IF_OPER_UNKNOWN before considering the interface / querying a DHCP address. Finally, the STEP B above only updates the value of operstate based on the known state (operstate/state) on the net_device, without any hardware interaction involved, so it is not very reasonable to wait for 1 second there. By adding: + if ((dev->flags & IFF_UP) && dev->operstate == IF_OPER_DOWN) + return true; + We hope to improve the linkwatch_urgent_event function a bit. Hope to get more of your advice and guidance. Best wishes, Wen
Powered by blists - more mailing lists