lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z_yhVCu0UR5s6p19@pengutronix.de>
Date: Mon, 14 Apr 2025 07:47:00 +0200
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Stefan Wahren <wahrenst@....net>
Cc: Thangaraj Samynathan <Thangaraj.S@...rochip.com>,
	Rengarajan Sundararajan <Rengarajan.S@...rochip.com>,
	netdev <netdev@...r.kernel.org>,
	"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>
Subject: Re: lan78xx: Failed to sync IRQ enable register: -ENODEV

Hi Stefan,

On Sun, Apr 13, 2025 at 09:49:00PM +0200, Stefan Wahren wrote:
> Hi,
> i noticed that recent changes to lan78xx introduced error messages to
> the bootlog of Raspberry Pi 3 B Plus (arm/multi_v7_defconfig, 6.15.0-rc1).
> 
> [    8.715374] lan78xx 1-1.1.1:1.0 (unnamed net_device) (uninitialized):
> No External EEPROM. Setting MAC Speed
> [    9.313859] usbcore: registered new interface driver lan78xx
> [   10.132752] vchiq: module is from the staging directory, the quality
> is unknown, you have been warned.
> [   10.533613] usbcore: registered new device driver onboard-usb-dev
> [   10.533861] usb 1-1.1: USB disconnect, device number 3
> [   10.533880] usb 1-1.1.1: USB disconnect, device number 6
> [   10.656641] lan78xx 1-1.1.1:1.0 eth0 (unregistered): Failed to sync
> IRQ enable register: -ENODEV
> [   10.657440] lan78xx 1-1.1.1:1.0 eth0 (unregistered): Failed to sync
> IRQ enable register: -ENODEV
> [   10.658819] usb 1-1.1.2: USB disconnect, device number 5
> 
> Since this happend during only two times during boot, i added a
> WARN_ON() in this specific case in order to see what's going on:
...
> [   10.656092]  lan78xx_irq_bus_sync_unlock from  free_irq
> [   10.656110]  free_irq from phy_disconnect
> [   10.656131]  phy_disconnect from lan78xx_disconnect
> [   10.656143]  lan78xx_disconnect from usb_unbind_interface
...
> Maybe some has any idea, how to fix this properly.

Thanks for the detailed report and backtrace!

The warning you're seeing was introduced by this patch:
0da202e6a56f ("net: usb: lan78xx: Add error handling to lan78xx_irq_bus_sync_unlock")

It adds error handling to lan78xx_irq_bus_sync_unlock() to log failed
register access.

In your case, everything in the stack is actually doing what it's
supposed to:
- lan78xx_disconnect() notifies the PHY subsystem.
- PHY framework sees the attached IRQ and calls free_irq().
- free_irq() calls irq_chip_bus_sync_unlock() ->
  lan78xx_irq_bus_sync_unlock(), where we hit the -ENODEV because
  the USB device is already gone.

The issue is that the IRQ subsystem doesn’t currently support
hot-unpluggable IRQ controllers, so there's no mechanism to tell it
"the hardware is already gone, just clean up the software state."
Until such a mechanism exists, these benign warnings can show up in
valid disconnect paths.

I can imagine a few possible options:

- Silently ignore -ENODEV in irq_bus_sync_unlock() and similar paths  
  Pro: trivial to implement  
  Contra: completely hides real issues if they happen for other reasons

- Add a global flag to suppress lan78xxx errors in .disconnect path  
  Pro: simple and less intrusive  
  Contra: same as above — poor diagnostics

- Introduce irq_domain_mark_hardware_removed() and check it in relevant paths  
  Pro: makes the real hardware state explicit, allows IRQ framework to make
       better decisions, improves diagnostics  
  Contra: requires some non-trivial changes across IRQ and driver code

Personally, I’d prefer the last option. It's harder to implement, but it gives
us the right model for handling hot-unpluggable IRQ controllers in the long
term.

All of these changes are more or less cosmetic. So far, there is no real
problem — just the fact that software is attempting to access hardware that is
already gone.  My proposal would primarily make the disconnection path cleaner
and less noisy.

It won’t prevent other parts of the driver or subsystem from hitting -ENODEV
before we reach the disconnect path. So until the driver itself is aware that
the hardware is gone and begins cleanup, we may still get register access
errors from other paths. This is expected.

Best regards,  
Oleksij Rempel
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ