netdev - RE: [PATCH net-next] Add Mellanox BlueField Gigabit Ethernet driver


Message-ID: <CH2PR12MB3895E054D1E00168D9FFB2F0D7450@CH2PR12MB3895.namprd12.prod.outlook.com>
Date:   Tue, 11 Aug 2020 19:53:35 +0000
From:   Asmaa Mnebhi <asmaa@...dia.com>
To:     Andrew Lunn <andrew@...n.ch>,
        David Thompson <dthompson@...lanox.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        Jiri Pirko <jiri@...lanox.com>,
        Asmaa Mnebhi <Asmaa@...lanox.com>
Subject: RE: [PATCH net-next] Add Mellanox BlueField Gigabit Ethernet driver

Hi Andrew,

Thanks again for your feedback.

> > +	/* Finally check if this interrupt is from PHY device.
> > +	 * Return if it is not.
> > +	 */
> > +	val = readl(priv->gpio_io +
> > +			MLXBF_GIGE_GPIO_CAUSE_OR_CAUSE_EVTEN0);
> > +	if (!(val & priv->phy_int_gpio_mask))
> > +		return IRQ_NONE;
> > +
> > +	/* Clear interrupt when done, otherwise, no further interrupt
> > +	 * will be triggered.
> > +	 * Writing 0x1 to the clear cause register also clears the
> > +	 * following registers:
> > +	 * cause_gpio_arm_coalesce0
> > +	 * cause_rsh_coalesce0
> > +	 */
> > +	val = readl(priv->gpio_io +
> > +			MLXBF_GIGE_GPIO_CAUSE_OR_CLRCAUSE);
> > +	val |= priv->phy_int_gpio_mask;
> > +	writel(val, priv->gpio_io +
> > +			MLXBF_GIGE_GPIO_CAUSE_OR_CLRCAUSE);
> 
> Shouldn't there be a call into the PHY driver at this point?
> 
> > +
> > +	return IRQ_HANDLED;
> > +}
> 
> So these last three functions seem to be an interrupt controller?  So why not
> model it as a Linux interrupt controller?

Apologies for the confusion. The plan is to remove polling support and instead use the HW interrupt, set up in the probe function as follows:
	irq = platform_get_irq(pdev, MLXBF_GIGE_PHY_INT_N);
	if (irq < 0) {
		dev_err(dev, "Failed to retrieve irq %d\n", irq);
		return -ENODEV;
	}
	priv->mdiobus->irq[phy_addr] = irq;

This HW interrupt is the PHY interrupt, which indicates link up/link down.
The MAC driver calls phy_connect_direct(), which I thought was sufficient to handle the interrupt since it calls phy_request_interrupt().
phy_request_interrupt() calls request_threaded_irq(), which registers phy_interrupt() as the handler.
phy_interrupt() triggers the PHY state machine, which checks the link status: the state machine goes into phy_check_link_status(), which eventually calls mlxbf_gige_handle_link_change().

I guess my question is: should we model it as a Linux interrupt controller rather than use phy_connect_direct()?

Using phy_connect_direct() to register the interrupt handler, I have hit an issue where the PHY interrupt fires before the PHY link status bit (in reg 0x1 of the PHY device) is set to 1 (indicating link is up).
So the PHY interrupt triggers the PHY state machine, which checks the link status, sees that it is still 0, and keeps the link state as DOWN.
Adding a delay to wait for the register to be updated works around this "race condition", but it doesn't look nice.
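
To illustrate the race and one possible alternative to a fixed delay, here is a minimal userspace sketch (not driver code) that polls the link status bit a bounded number of times and stops as soon as it is set. The names fake_phy, fake_phy_read and wait_for_link are hypothetical and exist only for this illustration; the register number and bit mask follow the standard MII definitions (MII_BMSR, BMSR_LSTATUS). A real driver would read the register over MDIO and sleep between reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MII_BMSR     0x01   /* basic mode status register (reg 0x1) */
#define BMSR_LSTATUS 0x0004 /* link status bit within BMSR */

/* Hypothetical stand-in for a PHY whose link bit lags its interrupt. */
struct fake_phy {
	unsigned int lag_reads;  /* BMSR reads that still report link down */
	unsigned int reads_done; /* BMSR reads performed so far */
};

/* Simulated register read: link bit appears only after lag_reads reads. */
static uint16_t fake_phy_read(struct fake_phy *phy, int reg)
{
	if (reg != MII_BMSR)
		return 0;
	return (phy->reads_done++ >= phy->lag_reads) ? BMSR_LSTATUS : 0;
}

/* Poll BMSR at most max_polls times; return true once the link bit is set. */
static bool wait_for_link(struct fake_phy *phy, unsigned int max_polls)
{
	assert(max_polls > 0);
	for (unsigned int i = 0; i < max_polls; i++) {
		if (fake_phy_read(phy, MII_BMSR) & BMSR_LSTATUS)
			return true;
		/* a real driver would sleep briefly here between reads */
	}
	return false; /* link bit never appeared within the poll budget */
}
```

Compared with a fixed delay, a bounded poll returns as soon as the bit is set and still gives up after a known number of attempts. Whether this is acceptable inside the PHY state machine path is a separate question.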

Thank you.
Asmaa