linux-kernel - Re: [PATCH 5/5] net: ch9200: avoid triggering NWay restart on non-zero PHY ID

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aAE5wmi6MoYMzui7@gmail.com>
Date: Thu, 17 Apr 2025 18:27:04 +0100
From: Qasim Ijaz <qasdev00@...il.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net,
	edumazet@...gle.com, pabeni@...hat.com, horms@...nel.org,
	linux-usb@...r.kernel.org, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	syzbot+3361c2d6f78a3e0892f9@...kaller.appspotmail.com,
	stable@...r.kernel.org
Subject: Re: [PATCH 5/5] net: ch9200: avoid triggering NWay restart on
 non-zero PHY ID

On Thu, Apr 17, 2025 at 04:08:08PM +0200, Andrew Lunn wrote:
> On Thu, Apr 17, 2025 at 02:12:36PM +0100, Qasim Ijaz wrote:
> > On Tue, Apr 15, 2025 at 08:56:48PM -0700, Jakub Kicinski wrote:
> > > On Tue, 15 Apr 2025 20:52:30 -0700 Jakub Kicinski wrote:
> > > > On Tue, 15 Apr 2025 03:35:07 +0200 Andrew Lunn wrote:
> > > > > > @@ -182,7 +182,7 @@ static int ch9200_mdio_read(struct net_device *netdev, int phy_id, int loc)
> > > > > >  		   __func__, phy_id, loc);
> > > > > >  
> > > > > >  	if (phy_id != 0)
> > > > > > -		return -ENODEV;
> > > > > > +		return 0;    
> > > > > 
> > > > > An actually MDIO bus would return 0xffff is asked to read from a PHY
> > > > > which is not on the bus. But i've no idea how the ancient mii code
> > > > > handles this.
> > > > > 
> > > > > If this code every gets updated to using phylib, many of the changes
> > > > > you are making will need reverting because phylib actually wants to
> > > > > see the errors. So i'm somewhat reluctant to make changes like this.  
> > > > 
> > > > Right.
> > > > 
> > > > I mean most of the patches seem to be adding error checking, unlike
> > > > this one, but since Qasim doesn't have access to this HW they are
> > > > more likely to break stuff than fix. I'm going to apply the first
> > > > patch, Qasim if you'd like to clean up the rest I think it should
> > > > be done separately without the Fixes tags, if at all.
> > > 
> > > Ah, no, patch 1 also does return 0. Hm. Maybe let's propagate the real
> > > error to silence the syzbot error and if someone with access to the HW
> > 
> > Hi Andrew and Jakub
> > 
> > Since there is uncertainty on whether these patches would break things 
> > how about I refactor the patches to instead return what the function 
> > already returns, this way we include error handling but maintain consistency 
> > with what the function already returns and does so there is no chance of 
> > breaking stuff. I think including the error handling would be a good idea
> > overall because we have already seen 1 bug where the root cause is insufficient 
> > error handling right? Furthermore this driver has not been updated in 4 years, 
> > so for the near‑term surely improving these aspects can only be a good thing.
> 
> It is not a simple thing to decided if we should make changes or not,
> if we don't have the hardware. The test robot is saying things are
> potentially wrong, but we don't have any users complaining it is
> broken. If we make the test robot happy, without testing the changes,
> we can make users unhappy by breaking it. And that is the opposite of
> what we want.

For patch 2 the kernel test robot said it is missing a Cc stable tag in
the sign-off area, it didnt highlight any build or functional errors so
I don't understand what you mean there.

> 
> We also need to think about "return on investment". Is anybody
> actually using this device still? Would it be better to spend our time
> on other devices we know are actually used?
> 
> If you can find a board which actually has this device, or can find
> somebody to run tests, then great, we are likely to accept them. But
> otherwise please focus on minimum low risk changes which are obviously

So going forward what should we do? I gave my thoughts for each
patch above and how I think we should change it to minimise
breaking things while adding error handling, which ones do you 
agree/ don't agree with?

Thanks
Qasim
> correct, or just leave the test robot unhappy.
> 
> 	Andrew