linux-kernel - RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID:
 <CH3PR12MB773870BA2AA47223FF9A72D7D745A@CH3PR12MB7738.namprd12.prod.outlook.com>
Date: Fri, 27 Jun 2025 19:00:50 +0000
From: Asmaa Mnebhi <asmaa@...dia.com>
To: Andrew Lunn <andrew@...n.ch>
CC: "davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
	<edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>,
	"pabeni@...hat.com" <pabeni@...hat.com>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, David Thompson <davthompson@...dia.com>
Subject: RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO
 degradation bug



> -----Original Message-----
> From: Asmaa Mnebhi <asmaa@...dia.com>
> Sent: Thursday, May 8, 2025 10:53 AM
> To: Andrew Lunn <andrew@...n.ch>
> Cc: davem@...emloft.net; edumazet@...gle.com; kuba@...nel.org;
> pabeni@...hat.com; netdev@...r.kernel.org; linux-kernel@...r.kernel.org;
> David Thompson <davthompson@...dia.com>
> Subject: RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO
> degradation bug
> 
> > > > My reading of this is that you can stop the clock when it is not
> > > > needed. Maybe tie into the Linux runtime power management
> framework.
> > > > It can keep track of how long a device has been idle, and if a
> > > > timer is exceeded, make a callback to power it down.
> > > >
> > > > If you have an MDIO bus with one PHY on it, the access pattern is
> > > > likely to be a small bunch of reads followed by about one second
> > > > of idle time. I would of thought that stopping the clock increases
> > > > the life expectancy of you hardware more than just slowing it down.
> > >
> > > Hi Andrew,
> > >
> >
> > > Thank you for your answer and apologies for the very late response.
> > > My concern with completely stopping the clock is the case we are
> > > using the PHY polling mode for the link status? We would need MDIO
> > > to always be operational for polling to work, wouldn't we?
> >
> > You should look at how power management work. For example, in the FEC
> > driver:
> >
> > https://elixir.bootlin.com/linux/v6.14.5/source/drivers/net/ethernet/f
> > reescale/f
> > ec_main.c#L2180
> >
> > static int fec_enet_mdio_read_c22(struct mii_bus *bus, int mii_id, int
> > regnum) {
> > 	struct fec_enet_private *fep = bus->priv;
> > 	struct device *dev = &fep->pdev->dev;
> > 	int ret = 0, frame_start, frame_addr, frame_op;
> >
> > 	ret = pm_runtime_resume_and_get(dev);
> > 	if (ret < 0)
> > 		return ret;
> >
> > This will use runtime PM to get the clocks ticking.
> >
> >
> > 	/* C22 read */
> > 	frame_op = FEC_MMFR_OP_READ;
> > 	frame_start = FEC_MMFR_ST;
> > 	frame_addr = regnum;
> >
> > 	/* start a read op */
> > 	writel(frame_start | frame_op |
> > 	       FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(frame_addr) |
> > 	       FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
> >
> > 	/* wait for end of transfer */
> > 	ret = fec_enet_mdio_wait(fep);
> > 	if (ret) {
> > 		netdev_err(fep->netdev, "MDIO read timeout\n");
> > 		goto out;
> > 	}
> >
> > 	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
> >
> > This all does the MDIO bus transaction
> >
> > out:
> > 	pm_runtime_mark_last_busy(dev);
> > 	pm_runtime_put_autosuspend(dev);
> >
> > And then tell PM that we are done. In this case, i _think_ it starts a
> > timer, and if there is no more MDIO activity for a while, the clocks
> > get disabled.
> >
> > The same is done for write.
> >
> > PHY polling happens once per second, using these methods, nothing
> > special. So the clock will get enabled on the first read, polling can
> > need to read a few registers, so due to the timer, the clock is left
> > ticking between these reads, and then after a while the clock is
> > disabled.
> >
> > My guess is, you can have the clock disabled 80% of the time, which is
> > probably going to be a better way to stop the magic smoke escaping
> > from your hardware than slowing down the clock.
> 
> Sweet! Thank you very much Andrew! I will make the changes and send a new
> patch soon.
> 

Hi Andrew, 

I implemented and tested the changes you suggested and the runtime_resume/suspend work smoothly for MDIO.
However, we have another issue. I noticed that even if mdio_read/write() functions are not being called, runtime_resume/suspend() are still called regularly. After investigation, I found out that this is due to ethtool being called regularly. Ethtool automatically triggers the resume/suspend even if we do no MDIO access. A different team wrote a script which monitors "ethtool -S eth0" every 60 seconds. So every minute, we are running resume/suspend and enabling/disabling the MDIO clock. Seems counter productive. That team said that it is a requirement that they collect these statistics about the mlxbf_gige interface.
Is there any way to prevent ethtool from calling resume/suspend without changing core kernel code?
If not, what would you suggest? 

Thanks.
Asmaa