Message-ID: <CANn89iKLSBwCnzS8TPSbkH+v_gMobFotOdCbdSMxAkhtx54xQA@mail.gmail.com>
Date: Wed, 10 May 2023 11:40:39 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Martin Zaharinov <micron10@...il.com>
Cc: Ido Schimmel <idosch@...sch.org>, netdev <netdev@...r.kernel.org>
Subject: Re: Very slow remove interface from kernel

On Wed, May 10, 2023 at 8:06 AM Martin Zaharinov <micron10@...il.com> wrote:
>
> I think problem is in this part of code in net/core/dev.c

What makes you think this ?

msleep() is not called a single time on my test bed.

# perf probe -a msleep

# cat bench.sh
modprobe dummy 2>/dev/null
ip link set dev dummy0 up 2>/dev/null
for i in $(seq 2 4094); do ip link add link dummy0 name vlan$i type vlan id $i; done
for i in $(seq 2 4094); do ip link set dev vlan$i up; done
time for i in $(seq 2 4094); do ip link del link dummy0 name vlan$i type vlan id $i; done

# perf record -e probe:msleep -a -g ./bench.sh

real 0m59.877s
user 0m0.588s
sys 0m7.023s
[ perf record: Woken up 6 times to write data ]
[ perf record: Captured and wrote 8.561 MB perf.data ]

# perf script
# << empty, nothing >>

> #define WAIT_REFS_MIN_MSECS 1
> #define WAIT_REFS_MAX_MSECS 250
> /**
>  * netdev_wait_allrefs_any - wait until all references are gone.
>  * @list: list of net_devices to wait on
>  *
>  * This is called when unregistering network devices.
>  *
>  * Any protocol or device that holds a reference should register
>  * for netdevice notification, and cleanup and put back the
>  * reference if they receive an UNREGISTER event.
>  * We can get stuck here if buggy protocols don't correctly
>  * call dev_put.
>  */
> static struct net_device *netdev_wait_allrefs_any(struct list_head *list)
> {
> 	unsigned long rebroadcast_time, warning_time;
> 	struct net_device *dev;
> 	int wait = 0;
>
> 	rebroadcast_time = warning_time = jiffies;
>
> 	list_for_each_entry(dev, list, todo_list)
> 		if (netdev_refcnt_read(dev) == 1)
> 			return dev;
>
> 	while (true) {
> 		if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
> 			rtnl_lock();
>
> 			/* Rebroadcast unregister notification */
> 			list_for_each_entry(dev, list, todo_list)
> 				call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
>
> 			__rtnl_unlock();
> 			rcu_barrier();
> 			rtnl_lock();
>
> 			list_for_each_entry(dev, list, todo_list)
> 				if (test_bit(__LINK_STATE_LINKWATCH_PENDING,
> 					     &dev->state)) {
> 					/* We must not have linkwatch events
> 					 * pending on unregister. If this
> 					 * happens, we simply run the queue
> 					 * unscheduled, resulting in a noop
> 					 * for this device.
> 					 */
> 					linkwatch_run_queue();
> 					break;
> 				}
>
> 			__rtnl_unlock();
>
> 			rebroadcast_time = jiffies;
> 		}
>
> 		if (!wait) {
> 			rcu_barrier();
> 			wait = WAIT_REFS_MIN_MSECS;
> 		} else {
> 			msleep(wait);
> 			wait = min(wait << 1, WAIT_REFS_MAX_MSECS);
> 		}
>
> 		list_for_each_entry(dev, list, todo_list)
> 			if (netdev_refcnt_read(dev) == 1)
> 				return dev;
>
> 		if (time_after(jiffies, warning_time +
> 			       READ_ONCE(netdev_unregister_timeout_secs) * HZ)) {
> 			list_for_each_entry(dev, list, todo_list) {
> 				pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
> 					 dev->name, netdev_refcnt_read(dev));
> 				ref_tracker_dir_print(&dev->refcnt_tracker, 10);
> 			}
>
> 			warning_time = jiffies;
> 		}
> 	}
> }
>
>
>
> m.
>
>
>
> On 9 May 2023, at 23:08, Ido Schimmel <idosch@...sch.org> wrote:
> >
> > On Tue, May 09, 2023 at 09:50:18PM +0300, Martin Zaharinov wrote:
> >> i try on kernel 6.3.1
> >>
> >>
> >> time for i in $(seq 2 4094); do ip link del link eth1 name vlan$i type vlan id $i; done
> >>
> >> real 4m51.633s —— here i stop with Ctrl + C - and rerun and second part finish after 3 min
> >> user 0m7.479s
> >> sys 0m0.367s
> >
> > You are off-CPU most of the time, the question is what is blocking. I'm
> > getting the following results with net-next:
> >
> > # time -p for i in $(seq 2 4094); do ip link del dev eth0.$i; done
> > real 177.09
> > user 3.85
> > sys 31.26
> >
> > When using a batch file to perform the deletion:
> >
> > # time -p ip -b vlan_del.batch
> > real 35.25
> > user 0.02
> > sys 3.61
> >
> > And to check where we are blocked most of the time while using the batch
> > file:
> >
> > # ../bcc/libbpf-tools/offcputime -p `pgrep -nx ip`
> > [...]
> >     __schedule
> >     schedule
> >     schedule_timeout
> >     wait_for_completion
> >     rcu_barrier
> >     netdev_run_todo
> >     rtnetlink_rcv_msg
> >     netlink_rcv_skb
> >     netlink_unicast
> >     netlink_sendmsg
> >     ____sys_sendmsg
> >     ___sys_sendmsg
> >     __sys_sendmsg
> >     do_syscall_64
> >     entry_SYSCALL_64_after_hwframe
> >     - ip (3660)
> >         25089479
> > [...]
> >
> > We are blocked for around 70% of the time on the rcu_barrier() in
> > netdev_run_todo().
> >
> > Note that one big difference between my setup and yours is that in my
> > case eth0 is a dummy device and in your case it's probably a physical
> > device that actually implements netdev_ops::ndo_vlan_rx_kill_vid(). If
> > so, it's possible that a non-negligible amount of time is spent talking
> > to hardware/firmware to delete the 4K VIDs from the device's VLAN
> > filter.
> >
> >>
> >>
> >> Config is very clean i remove big part of CONFIG options .
> >>
> >> is there options to debug what is happen.
> >>
> >> m
>
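
For reference, a minimal sketch of how a batch file such as the vlan_del.batch mentioned in the quoted text could be generated. The thread does not show the file's contents, so the helper name gen_vlan_del_batch and the eth0.<vid> interface naming are assumptions taken from the loop in the quoted message. Each batch line is an ordinary ip command without the leading "ip".

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print one ip batch line per
# VLAN sub-interface, assuming interfaces are named <parent>.<vid>.
gen_vlan_del_batch() {
    parent=$1 vid=$2 last=$3
    while [ "$vid" -le "$last" ]; do
        printf 'link del dev %s.%s\n' "$parent" "$vid"
        vid=$((vid + 1))
    done
}

# Write the batch file for VIDs 2..4094 on eth0 (same range as the loops above).
gen_vlan_del_batch eth0 2 4094 > vlan_del.batch

# Then, as root, all deletions can be issued from a single ip process:
#   time -p ip -b vlan_del.batch
```

This only saves the cost of starting a new ip process per interface. It does not remove the rcu_barrier() wait in netdev_run_todo() that the off-CPU trace points at.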