lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 8 Aug 2023 23:56:56 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Florian Fainelli <f.fainelli@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
	Heiner Kallweit <hkallweit1@...il.com>,
	Uwe Kleine-König <u.kleine-koenig@...gutronix.de>,
	Ioana Ciornei <ciorneiioana@...il.com>, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, Ioana Ciornei <ioana.ciornei@....com>,
	Alexandru Ardelean <alexandru.ardelean@...log.com>,
	Andre Edich <andre.edich@...rochip.com>,
	Antoine Tenart <atenart@...nel.org>,
	Baruch Siach <baruch@...s.co.il>,
	Christophe Leroy <christophe.leroy@....fr>,
	Divya Koppera <Divya.Koppera@...rochip.com>,
	Hauke Mehrtens <hauke@...ke-m.de>,
	Jerome Brunet <jbrunet@...libre.com>,
	Kavya Sree Kotagiri <kavyasree.kotagiri@...rochip.com>,
	Linus Walleij <linus.walleij@...aro.org>,
	Marco Felsch <m.felsch@...gutronix.de>, Marek Vasut <marex@...x.de>,
	Martin Blumenstingl <martin.blumenstingl@...glemail.com>,
	Mathias Kresin <dev@...sin.me>, Maxim Kochetkov <fido_max@...ox.ru>,
	Michael Walle <michael@...le.cc>,
	Neil Armstrong <narmstrong@...libre.com>,
	Nisar Sayed <Nisar.Sayed@...rochip.com>,
	Oleksij Rempel <o.rempel@...gutronix.de>,
	Philippe Schenker <philippe.schenker@...adex.com>,
	Willy Liu <willy.liu@...ltek.com>,
	Yuiko Oshino <yuiko.oshino@...rochip.com>
Subject: Re: [PATCH] net: phy: Don't disable irqs on shutdown if WoL is
 enabled

On Tue, Aug 08, 2023 at 02:59:41PM -0700, Florian Fainelli wrote:
> On 8/8/23 14:53, Jakub Kicinski wrote:
> > On Fri,  4 Aug 2023 09:17:57 +0200 Uwe Kleine-König wrote:
> > > Most PHYs signal WoL using an interrupt. So disabling interrupts breaks
> > > WoL at least on PHYs covered by the marvell driver. So skip disabling
> > > irqs on shutdown if WoL is enabled.
> > > 
> > > While at it also explain the motivation that irqs are disabled at all.
> > > 
> > > Fixes: e2f016cf7751 ("net: phy: add a shutdown procedure")
> > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@...gutronix.de>
> > 
> > What do we do with this one? It sounded like Russell was leaning
> > towards a revert?
> 
> Yes, though I believe this will create a different kind of regression for
> what Iona was addressing initially. Then it becomes a choice of which
> regression do we consider to be the worst to handle until something better
> comes up.
> 
> Russell what are your thoughts?

In this situation where a fix for a problem is provided which then
causes a regression by fixing that problem, I've seen Linus T state
that it means the fix was incorrect. That seems entirely sensible.

We are, of course, in the situation where reverting the commit
restores the old behaviour and thus fixes a regression, but causes
a regression for another user.

If it is possible to quickly come up with a fix that avoids any
regression to either use case, then that is obviously preferable.
However, if that's not possible, then it seems going back to the
original situation (i.o.w. reverting) is sensible.

Now, the fact is that many PHYs do use their interrupts to signal
that a wake-up happened, and disabling the IRQ from the PHY will
prevent WoL from working. Other PHYs have a separate pin for this.
Two recent examples are AR8035, which only has a single interrupt
pin which covers all interrupts from that PHY, and AR8031 or AR8033
which have a separate WOL_INT pin which might be used - or the
main interrupt pin.

If we hibernate the system, then people how have configured WoL
are going to expect it to work - but disabling the ability for
the PHY to raise an interrupt will prevent it.

So, clearly always disabling PHY interrupts can have a detrimental
effect on the ability to wake a system up using WoL - where the PHY
interrupt is used to signal WoL to the rest of the system.

Now, if waking the system up from hibernation using WoL involves
the PHY asserting its interrupt pin, then the system must be
capable of dealing with the PHY asserting its interrupt while the
system is booting. Remember that the way Linux hibernation works,
that boot is just the same as a regular boot right up through the
normal kernel initialisation. It is only towards the end that the
kernel detects the signature in swap space, and then does the
funky stuff to resume from the saved data.

So, during that boot, the system has to cope with that interrupt
having been asserted by the PHY hardware. Either system firmware
has to recognise that was the wake-up event and deal with it (e.g.
disabling the interrupt source) before passing control to the
kernel, or the kernel has to be able to cope with that interrupt
being stuck at active state until the PHY driver can deal with it.

Obviously, if WoL is not enabled or supported, then disabling the
PHY interrupt should be harmless - but that will have the effect
of masking any issues that a platform may have until PHY based
WoL has been enabled.

Also, don't forget that we have this kexec thing - and the
.shutdown methods will be called just before handing control to
the new kernel.

Uwe's patch solves the problem that he's experiencing - because
it makes the interrupt disabling dependent on the WoL configuration.

However, Ioana's problem would still remain - and enabling WoL on
that platform will make it reappear - and thus it still needs to be
properly fixed.

If that problem is properly fixed, then we don't need to disable
PHY interrupts, which means a revert would be the right approach.

Honestly, I don't know what would be best - and I don't believe we've
heard from Ioana about the problem that was trying to be addressed
(e.g. exactly when it happened and why.)

If I had to guess:
- the PHY in question may be sharing an interrupt with another device
- when that other device probes and claims its interrupt, an interrupt
  storm ensues
- the interrupt layer disables the interrupt input, rendering both the
  PHY and other device unusable.

I think I've covered all the possibilities, all the issues, outcomes,
and the politics as far as Linus T would state. I'm also quite sure
that there will be no way to satisfy everyone!

Bearing in mind that it is holiday season, and we're at -rc5, I
think we should give Ioana a bit more time to respond before we
make a decision. Maybe a little over a week? If we don't hear anything,
then I think following our established policy and reverting would be
the correct way forward.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ