lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 12 Feb 2019 07:51:05 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Andrew Lunn <andrew@...n.ch>,
        John David Anglin <dave.anglin@...l.net>
Cc:     Russell King <linux@....linux.org.uk>,
        Vivien Didelot <vivien.didelot@...oirfairelinux.com>,
        Florian Fainelli <f.fainelli@...il.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are
 handled prior to exit

On 12.02.2019 04:58, Andrew Lunn wrote:
>>> Hi David
>>>
>>> I just tested this on one of my boards. It loops endlessly:
>>>
>>> [   47.173396] mv88e6xxx_g1_irq_thread_work: c881 a8 80                         
>>> [   47.182108] mv88e6xxx_g1_irq_thread_work: c881 a8 80                         
>>> [   47.190820] mv88e6xxx_g1_irq_thread_work: c881 a8 80                         
>>> [   47.199535] mv88e6xxx_g1_irq_thread_work: c881 a8 80                         
>>> [   47.208254] mv88e6xxx_g1_irq_thread_work: c881 a8 80   
>>>
>>> These are reg, ctl1, reg & ctl1.
>>>
>>> So there is an unhandled device interrupt.
> 
> Hi Heiner
> 
> Your patch Fixes: 2b3e88ea6528 ("net: phy: improve phy state
> checking") is causing me problems with interrupts for the Marvell
> switches.
> 
Hi Andrew,

what kernel version is it?
And the PHY driver in use is "Marvell 88E6390" ?

> That change means we don't check the PHY device if it caused an
> interrupt when its state is less than UP.
> 
> What i'm seeing is that the PHY is interrupting pretty early on after
> a reboot when the previous boot had the interface up.
> 
So this means that when going down for reboot the interrupts are not
properly masked / disabled? Because (at least for net-next) we enable
interrupts in phy_start() only.


> [   10.125702] Marvell 88E6390 mv88e6xxx-0:02: phy_start_interrupts
> [   10.162798] Marvell 88E6390 mv88e6xxx-0:02: phy_enable_interrupts
> [   10.168931] Marvell 88E6390 mv88e6xxx-0:02: marvell_ack_interrupt
> [   10.180164] Marvell 88E6390 mv88e6xxx-0:02: marvell_config_intr 1
> 
> a little later it interrupts:
> 
> [   12.999717] mv88e6xxx_g1_irq_thread_fn
> [   13.007253] mv88e6xxx_g2_irq_thread_fn: 4 811c 4
> [   13.012015] libphy: __phy_is_started: phydev->state 1 PHY_UP 3
> [   13.017941] Marvell 88E6390 mv88e6xxx-0:02: phy_interrupt: phy_is_started(phydev) 0
> 
> The current code just causes it to be ignored. So the interrupts fires
> again, and again...
> 
I would have more expected the opposite. If the interrupt is ignored
(IRQ_NONE returned), then it doesn't get acked. And if it's not acked
new interrupts should be blocked. Or is it different with this chip?

> If i change to code to call into the PHY driver and let it handle the
> interrupts, things keep running. A little bit later the interface is
> configured up:
> 
> [   15.921326] mv88e6085 gpio-0:00 red: configuring for phy/gmii link mode
> [   15.928693] libphy: __phy_is_started: phydev->state 3 PHY_UP 3
> [   15.929442] IPv6: ADDRCONF(NETDEV_UP): red: link is not ready
> [   15.935596] Marvell 88E6390 mv88e6xxx-0:02: m88e6390_config_aneg
> [   15.935608] Marvell 88E6390 mv88e6xxx-0:02: m88e6390_errata
> 
> [   16.071364] Marvell 88E6390 mv88e6xxx-0:02: m88e1510_config_aneg
> [   16.112362] Marvell 88E6390 mv88e6xxx-0:02: m88e1318_config_aneg
> [   16.151245] Marvell 88E6390 mv88e6xxx-0:02: m88e1121_config_aneg
> [   16.368206] Marvell 88E6390 mv88e6xxx-0:02: PHY state change UP -> NOLINK
> 
> and after another interrupt the link goes up.
> 
> [   19.519840] mv88e6xxx_g1_irq_thread_fn
> [   19.528546] mv88e6xxx_g2_irq_thread_fn: 4 811c 4
> [   19.534152] libphy: __phy_is_started: phydev->state 5 PHY_UP 3
> [   19.540030] Marvell 88E6390 mv88e6xxx-0:02: phy_interrupt: phy_is_started(phydev) 1
> [   19.547721] Marvell 88E6390 mv88e6xxx-0:02: m88e1121_did_interrupt
> [   19.559829] Marvell 88E6390 mv88e6xxx-0:02: marvell_ack_interrupt
> [   19.590753] Marvell 88E6390 mv88e6xxx-0:02: marvell_read_status
> [   19.596712] Marvell 88E6390 mv88e6xxx-0:02: marvell_update_link
> [   19.628387] Marvell 88E6390 mv88e6xxx-0:02: PHY state change NOLINK -> RUNNING
> [   19.628453] mv88e6085 gpio-0:00 red: Link is Up - 1Gbps/Full - flow control off
> [   19.635920] IPv6: ADDRCONF(NETDEV_CHANGE): red: link becomes ready
> 
> I don't yet know why the first interrupt happens, before we configure
> auto-neg, etc. But it is not too unreasonable. We have configured
> interrupts, so it could be reporting link down etc.
> 
> So i think we might need to revert part of this change, call into the
> driver so long as the PHY is not in state PHY_HALTED.
> 
> What do you think?
> 
I will take a closer look later.

>      Andrew
> 
Heiner

Powered by blists - more mailing lists