Date: Fri, 05 May 2023 12:16:48 +0200
From: Bjørn Mork <bjorn@...k.no>
To: "Linux regression tracking (Thorsten Leemhuis)" <regressions@...mhuis.info>
Cc: Hayes Wang <hayeswang@...ltek.com>,
Linux regressions mailing list <regressions@...ts.linux.dev>,
netdev@...r.kernel.org, Paolo Abeni <pabeni@...hat.com>,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: [regression] Kernel OOPS on boot with Kernel 6.3(.1) and
RTL8153 Gigabit Ethernet Adapter
"Linux regression tracking (Thorsten Leemhuis)"
<regressions@...mhuis.info> writes:
>> Kernel OOPS on boot
>>
>> Hello,
>>
>> on my laptop with kernel 6.3.0 and 6.3.1 fails to correctly boot if the usb-c device "RTL8153 Gigabit Ethernet Adapter" is connected.
>>
>> If I unplug it, boot and the plug it in, everything works fine.
>>
>> This used to work fine with 6.2.10.
>>
>> HW:
>> - Dell Inc. Latitude 7410/0M5G57, BIOS 1.22.0 03/20/2023
>> - Realtek Semiconductor Corp. RTL8153 Gigabit Ethernet Adapter
>>
>>
>> Call Trace (manually typed from the image, typos maybe be included)
>> - bpf_dev_bound_netdev_unregister
>> - unregister_netdevice_many_notify
>> - unregister_netdevice_gueue
>> - unregister_netdev
>> - usbnet_disconnect
>> - usb_unbind_interface
>> - device_release_driver_internal
>> - bus_remove_device
>> - device_del
>> - ? kobject_put
>> - usb_disable_device
>> - usb_set_configuration
>> - rt18152_cfgselector_probe
>> - usb_probe_device
>> - really_probe
>> - ? driver_probe_device
>> - ...

Ouch. This is obviously related to the change I made to the RTL8153
driver, which you can see is in effect by the call to
rtl8152_cfgselector_probe above (compensating for the typo).

But to me it doesn't look like the bug is in that driver. It seems we
are triggering some latent bug in the unregister_netdev code?

The trace looks precise enough to me. The image also shows

RIP: 0010: __rhashtable_lookup.constprop.0+0x18/0x120

which I believe comes from bpf_dev_bound_netdev_unregister() calling
bpf_offload_find_netdev(), which does:

static struct bpf_offload_netdev *
bpf_offload_find_netdev(struct net_device *netdev)
{
	lockdep_assert_held(&bpf_devs_lock);

	return rhashtable_lookup_fast(&offdevs, &netdev, offdevs_params);
}

Maybe someone familiar with that code can explain why this fails if called
at boot instead of later?

AFAICS, we don't do anything out of the ordinary in that driver, with
respect to netdev registration at least. A similar device disconnect and
netdev unregister would also happen if you decided to pull the USB
device from the port during boot. In fact, most USB network devices
behave similarly when disconnected, and there is nothing preventing that
from happening while the system is booting.

Bjørn