lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6a815a39-0a0b-b3b2-443d-11370ed7d091@gmail.com>
Date:   Wed, 23 Feb 2022 14:58:13 -0800
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>,
        Florian Fainelli <f.fainelli@...il.com>
Cc:     Peter Robinson <pbrobinson@...il.com>,
        Doug Berger <opendmb@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        bcm-kernel-feedback-list@...adcom.com, netdev@...r.kernel.org,
        Javier Martinez Canillas <javierm@...hat.com>
Subject: Re: [PATCH] net: bcmgenet: Return not supported if we don't have a
 WoL IRQ



On 2/23/2022 2:48 PM, Jakub Kicinski wrote:
> On Wed, 23 Feb 2022 09:54:26 -0800 Florian Fainelli wrote:
>>> I have no problems working with you to improve the driver, the problem
>>> I have is this is currently a regression in 5.17 so I would like to
>>> see something land, whether it's reverting the other patch, landing
>>> thing one or another straight forward fix and then maybe revisit as
>>> whole in 5.18.
>>
>> Understood and I won't require you or me to complete this investigating
>> before fixing the regression, this is just so we understand where it
>> stemmed from and possibly fix the IRQ layer if need be. Given what I
>> just wrote, do you think you can sprinkle debug prints throughout the
>> kernel to figure out whether enable_irq_wake() somehow messes up the
>> interrupt descriptor of interrupt and test that theory? We can do that
>> offline if you want.
> 
> Let me mark v2 as Deferred for now, then. I'm not really sure if that's
> what's intended but we have 3 weeks or so until 5.17 is cut so we can
> afford a few days of investigating.
> 
> I'm likely missing the point but sounds like the IRQ subsystem treats
> IRQ numbers as unsigned so if we pass a negative value "fun" is sort
> of expected. Isn't the problem that device somehow comes with wakeup
> capable being set already? Isn't it better to make sure device is not
> wake capable if there's no WoL irq instead of adding second check?
> 
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index cfe09117fe6c..7dea44803beb 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> @@ -4020,12 +4020,12 @@ static int bcmgenet_probe(struct platform_device *pdev)
>   
>   	/* Request the WOL interrupt and advertise suspend if available */
>   	priv->wol_irq_disabled = true;
> -	if (priv->wol_irq > 0) {
> +	if (priv->wol_irq > 0)
>   		err = devm_request_irq(&pdev->dev, priv->wol_irq,
>   				       bcmgenet_wol_isr, 0, dev->name, priv);
> -		if (!err)
> -			device_set_wakeup_capable(&pdev->dev, 1);
> -	}
> +	else
> +		err = -ENOENT;
> +	device_set_wakeup_capable(&pdev->dev, !err);

Yes, this looks more elegant and is certainly more correct.

However there must have been something else going on with Peter's 
provided information.

We clearly did not have an interrupt registered for the Wake-on-LAN 
interrupt line as witnessed by the outputs of /proc/interrupts, however 
if we managed to go past the device_can_wakeup() check in 
bcmgenet_set_wol(), then we must have had devm_request_irq() return 
success on an invalid interrupt number or worse, botch the interrupt 
number in priv->irq1 to the point where the handler got re-installed 
maybe and we only end-up calling bcmgenet_wol_isr but no longer 
bcmgenet_isr1.. Hummm.
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ