linux-kernel - RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID:
 <CH3PR12MB7738C25C6403C3C29538DA4BD78BA@CH3PR12MB7738.namprd12.prod.outlook.com>
Date: Thu, 8 May 2025 14:53:19 +0000
From: Asmaa Mnebhi <asmaa@...dia.com>
To: Andrew Lunn <andrew@...n.ch>
CC: "davem@...emloft.net" <davem@...emloft.net>, "edumazet@...gle.com"
	<edumazet@...gle.com>, "kuba@...nel.org" <kuba@...nel.org>,
	"pabeni@...hat.com" <pabeni@...hat.com>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, David Thompson <davthompson@...dia.com>
Subject: RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO
 degradation bug

> > > My reading of this is that you can stop the clock when it is not
> > > needed. Maybe tie into the Linux runtime power management framework.
> > > It can keep track of how long a device has been idle, and if a timer
> > > is exceeded, make a callback to power it down.
> > >
> > > If you have an MDIO bus with one PHY on it, the access pattern is
> > > likely to be a small bunch of reads followed by about one second of
> > > idle time. I would of thought that stopping the clock increases the
> > > life expectancy of you hardware more than just slowing it down.
> >
> > Hi Andrew,
> >
> 
> > Thank you for your answer and apologies for the very late
> > response. My concern with completely stopping the clock is the case
> > we are using the PHY polling mode for the link status? We would need
> > MDIO to always be operational for polling to work, wouldn't we?
> 
> You should look at how power management work. For example, in the FEC
> driver:
> 
> https://elixir.bootlin.com/linux/v6.14.5/source/drivers/net/ethernet/freescale/f
> ec_main.c#L2180
> 
> static int fec_enet_mdio_read_c22(struct mii_bus *bus, int mii_id, int regnum)
> {
> 	struct fec_enet_private *fep = bus->priv;
> 	struct device *dev = &fep->pdev->dev;
> 	int ret = 0, frame_start, frame_addr, frame_op;
> 
> 	ret = pm_runtime_resume_and_get(dev);
> 	if (ret < 0)
> 		return ret;
> 
> This will use runtime PM to get the clocks ticking.
> 
> 
> 	/* C22 read */
> 	frame_op = FEC_MMFR_OP_READ;
> 	frame_start = FEC_MMFR_ST;
> 	frame_addr = regnum;
> 
> 	/* start a read op */
> 	writel(frame_start | frame_op |
> 	       FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(frame_addr) |
> 	       FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
> 
> 	/* wait for end of transfer */
> 	ret = fec_enet_mdio_wait(fep);
> 	if (ret) {
> 		netdev_err(fep->netdev, "MDIO read timeout\n");
> 		goto out;
> 	}
> 
> 	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
> 
> This all does the MDIO bus transaction
> 
> out:
> 	pm_runtime_mark_last_busy(dev);
> 	pm_runtime_put_autosuspend(dev);
> 
> And then tell PM that we are done. In this case, i _think_ it starts a
> timer, and if there is no more MDIO activity for a while, the clocks
> get disabled.
> 
> The same is done for write.
> 
> PHY polling happens once per second, using these methods, nothing
> special. So the clock will get enabled on the first read, polling can
> need to read a few registers, so due to the timer, the clock is left
> ticking between these reads, and then after a while the clock is
> disabled.
> 
> My guess is, you can have the clock disabled 80% of the time, which is
> probably going to be a better way to stop the magic smoke escaping
> from your hardware than slowing down the clock.

Sweet! Thank you very much Andrew! I will make the changes and send a new patch soon.

Asmaa