linux-kernel - Re: [PATCH 4/5] net: macb: WoL support for GEM type of Ethernet controller

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8fcf4a8a-362c-a71f-c99e-be9500db7371@microchip.com>
Date:   Tue, 21 Apr 2020 08:21:48 +0000
From:   <Nicolas.Ferre@...rochip.com>
To:     <linux-arm-kernel@...ts.infradead.org>, <netdev@...r.kernel.org>,
        <Claudiu.Beznea@...rochip.com>, <harini.katakam@...inx.com>,
        <f.fainelli@...il.com>, <linux@...linux.org.uk>
CC:     <andrew@...n.ch>, <alexandre.belloni@...tlin.com>,
        <sergio.prado@...abworks.com>, <pthombar@...ence.com>,
        <antoine.tenart@...tlin.com>, <linux-kernel@...r.kernel.org>,
        <michal.simek@...inx.com>, <davem@...emloft.net>
Subject: Re: [PATCH 4/5] net: macb: WoL support for GEM type of Ethernet
 controller

On 17/04/2020 at 19:14, Nicolas Ferre wrote:
> On 16/04/2020 at 19:44, nicolas.ferre@...rochip.com wrote:
>> From: Nicolas Ferre <nicolas.ferre@...rochip.com>
>>
>> Adapt the Wake-on-Lan feature to the Cadence GEM Ethernet controller.
>> This controller has different register layout and cannot be handled by
>> previous code.
>> We disable completely interrupts on all the queues but the queue 0.
>> Handling of WoL interrupt is done in another interrupt handler
>> positioned depending on the controller version used, just between
>> suspend() and resume() calls.
>> It allows to lower pressure on the generic interrupt hot path by
>> removing the need to handle 2 tests for each IRQ: the first figuring out
>> the controller revision, the second for actually knowing if the WoL bit
>> is set.
>>
>> Queue management in suspend()/resume() functions inspired from RFC patch
>> by Harini Katakam <harinik@...inx.com>, thanks!
>>
>> Signed-off-by: Nicolas Ferre <nicolas.ferre@...rochip.com>
>> ---
>>    drivers/net/ethernet/cadence/macb.h      |   3 +
>>    drivers/net/ethernet/cadence/macb_main.c | 121 ++++++++++++++++++++---
>>    2 files changed, 109 insertions(+), 15 deletions(-)
> 
> [..]
> 
>> @@ -4534,23 +4564,56 @@ static int __maybe_unused macb_suspend(struct device *dev)
>>    	struct macb_queue *queue = bp->queues;
>>    	unsigned long flags;
>>    	unsigned int q;
>> +	int err;
>>    
>>    	if (!netif_running(netdev))
>>    		return 0;
>>    
>>    	if (bp->wol & MACB_WOL_ENABLED) {
>> -		macb_writel(bp, IER, MACB_BIT(WOL));
>> -		macb_writel(bp, WOL, MACB_BIT(MAG));
>> -		enable_irq_wake(bp->queues[0].irq);
>> -		netif_device_detach(netdev);
>> -	} else {
>> -		netif_device_detach(netdev);
>> +		spin_lock_irqsave(&bp->lock, flags);
>> +		/* Flush all status bits */
>> +		macb_writel(bp, TSR, -1);
>> +		macb_writel(bp, RSR, -1);
>>    		for (q = 0, queue = bp->queues; q < bp->num_queues;
>> -		     ++q, ++queue)
>> -			napi_disable(&queue->napi);
>> -		rtnl_lock();
>> -		phylink_stop(bp->phylink);
>> -		rtnl_unlock();
>> +		     ++q, ++queue) {
>> +			/* Disable all interrupts */
>> +			queue_writel(queue, IDR, -1);
>> +			queue_readl(queue, ISR);
>> +			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
>> +				queue_writel(queue, ISR, -1);
>> +		}
>> +		/* Change interrupt handler and
>> +		 * Enable WoL IRQ on queue 0
>> +		 */
>> +		if (macb_is_gem(bp)) {
>> +			devm_free_irq(dev, bp->queues[0].irq, bp->queues);
>> +			err = devm_request_irq(dev, bp->queues[0].irq, gem_wol_interrupt,
>> +					       IRQF_SHARED, netdev->name, bp->queues);
>> +			if (err) {
>> +				dev_err(dev,
>> +					"Unable to request IRQ %d (error %d)\n",
>> +					bp->queues[0].irq, err);
>> +				return err;
>> +			}
>> +			queue_writel(bp->queues, IER, GEM_BIT(WOL));
>> +			gem_writel(bp, WOL, MACB_BIT(MAG));
>> +		} else {
>> +			queue_writel(bp->queues, IER, MACB_BIT(WOL));
>> +			macb_writel(bp, WOL, MACB_BIT(MAG));
>> +		}
>> +		spin_unlock_irqrestore(&bp->lock, flags);
>> +
>> +		enable_irq_wake(bp->queues[0].irq);
>> +	}
>> +
>> +	netif_device_detach(netdev);
>> +	for (q = 0, queue = bp->queues; q < bp->num_queues;
>> +	     ++q, ++queue)
>> +		napi_disable(&queue->napi);
>> +
>> +	if (!(bp->wol & MACB_WOL_ENABLED)) {
>> +		phy_stop(netdev->phydev);
>> +		phy_suspend(netdev->phydev);
> 
> Bug here: you must read:
> 
> 		rtnl_lock();
> 		phylink_stop(bp->phylink);
> 		rtnl_unlock();
> 
> Instead of the 2 previous lines. I'll correct in v2.
> 
> Sorry for the regression.
> 
> 
>>    		spin_lock_irqsave(&bp->lock, flags);
>>    		macb_reset_hw(bp);
>>    		spin_unlock_irqrestore(&bp->lock, flags);
>> @@ -4575,20 +4638,48 @@ static int __maybe_unused macb_resume(struct device *dev)
> 
> [..]
> BTW: I have issue having a real resume event from the phy with this
> series. I'm investigating that but didn't find anything for now.
> 
> Observation #1: when the WoL is not enabled, I don't have link issue.
> But the path in suspend/resume is far more intrusive in phy state.
> 
> Observation #2: when WoL is enabled, I need to do a full ifdown/ifup
> sequence for gain access again to the link:
> 
> ip link show eth0
> 2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast
> state DOWN mode DEFAULT group default qlen 1000
>       link/ether 54:10:ec:be:50:b0 brd ff:ff:ff:ff:ff:ff
> 
> ifdown eth0 && ifup eth0
> 
> ip link show eth0
> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
> state UP mode DEFAULT group default qlen 1000
>       link/ether 54:10:ec:be:50:b0 brd ff:ff:ff:ff:ff:ff
> 
> Observation #3: I didn't experience this behavior while playing with the
> WoL on my 4.19 kernel before porting to Mainline.

I've reviewed this series to fix this last issues. It's was a 
combination of runtime_pm not handled properly and a mix of 
netif_carrier_* call with phylink calls not well positioned nor balanced 
between suspend and resume.

I have a v2 series that I'm preparing for today: Harini, I prefer to 
post it now so it could avoid that you hit the same issues as me while 
testing on your platform.

Best regards,
    Nicolas


-- 
Nicolas Ferre