Date: Thu, 9 Nov 2017 07:51:14 -0800
From: Girish Moodalbail <girish.moodalbail@...cle.com>
To: Cong Wang <xiyou.wangcong@...il.com>, Fengguang Wu <fengguang.wu@...el.com>
Cc: Alexander Duyck <alexander.duyck@...il.com>, Linus Torvalds <torvalds@...ux-foundation.org>, Jeff Kirsher <jeffrey.t.kirsher@...el.com>, Network Development <netdev@...r.kernel.org>, "David S. Miller" <davem@...emloft.net>, Linux Kernel Mailing List <linux-kernel@...r.kernel.org>, intel-wired-lan <intel-wired-lan@...ts.osuosl.org>
Subject: Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

On 11/8/17 10:34 PM, Cong Wang wrote:
> On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu <fengguang.wu@...el.com> wrote:
>> Hi Alex,
>>
>>> So looking over the trace the panic seems to be happening after a
>>> decnet interface is getting deleted. Is there any chance we could try
>>> compiling the kernel without decnet support to see if that is the
>>> source of these issues? I don't know if anyone on the Intel Wired Lan
>>> team is testing with that enabled so if we can eliminate that as a
>>> possible cause that would be useful.
>>
>>
>> Sure and thank you for the suggestion!
>>
>> It looks disabling DECNET still triggers the vlan_device_event BUG.
>> However when looking at the dmesgs, I find another warning just before
>> the vlan_device_event BUG. Not sure if it's related one or independent
>> now-fixed issue.
>
> Those decnet symbols are probably noises.

Right. This is a 32-bit Kernel compiled with CONFIG_PREEMPT=y (I am
guessing that this has exposed some lock bug). Also, VLAN (8021q) is
compiled into the kernel, so it registers a vlan_device_event() callback
on boot. There may not be a VLAN device per-se.

Upon receiving NETDEV_DOWN event, we are calling

    vlan_vid_del(dev, htons(ETH_P_8021Q), 0);

which in turn calls call_rcu() to queue vlan_info_free_rcu() to be
called at some point. This free function frees the array[]
(vlan_info.vlan_grp.vn_devices_array).

My guess is that vlan_info_free_rcu() is being called first and then the
array[] is being accessed in vlan_device_event().

The netifd daemon in OpenWRT is marking the interface down and that is
why it is generating the NETDEV_DOWN event. And it uses
ioctl(SIOCSIFFLAGS, ~IFF_UP) on an AF_UNIX socket. This results in a call
to dev_ifsioc() in the kernel with only rtnl_lock() held and it is not
in an RCU read-side critical section.

~Girish

>
> How do you reproduce it? And what is your setup? Vlan device on
> top of your eth0 (e1000)?
>