Message-ID: <ZHoWN0uO30P/y9hv@shell.armlinux.org.uk>
Date: Fri, 2 Jun 2023 17:17:59 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Dan Carpenter <dan.carpenter@...aro.org>,
Oleksij Rempel <linux@...pel-privat.de>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] net: phylib: fix phy_read*_poll_timeout()
On Fri, Jun 02, 2023 at 09:05:39AM -0700, Jakub Kicinski wrote:
> On Fri, 2 Jun 2023 09:53:09 +0100 Russell King (Oracle) wrote:
> > > Yes it is :) All this to save the single line of assignment
> > > after the read_poll_timeout() "call" ?
> >
> > Okay, so it seems you don't like it. We can't fix it then, and we'll
> > have to go with the BUILD_BUG_ON() forcing all users to use a signed
> > variable (which better be larger than a s8 so negative errnos can fit)
> > or we just rely on Dan to report the problems.
>
> Wait, did the version I proposed not work?
>
> https://lore.kernel.org/all/20230530121910.05b9f837@kernel.org/
If we're in the business of throwing web URLs at each other for
messages we've already read, here's mine for you, which contains
the explanation of why yours is broken and proposes my solution.
https://lore.kernel.org/all/ZHZmBBDSVMf1WQWI@shell.armlinux.org.uk/
To see exactly why yours is broken, see the paragraph starting
"The elephant in the room..."
If it needs yet more explanation, which clearly it does, then let's
look at what genphy_loopback is doing:
	ret = phy_read_poll_timeout(phydev, MII_BMSR, val,
				    val & BMSR_LSTATUS,
				    5000, 500000, true);
Now, with your supposed "fix" of:
+ int __ret, __val; \
+ \
+ __ret = read_poll_timeout(phy_read, __val, __val < 0 || (cond), \
sleep_us, timeout_us, sleep_before_read, phydev, regnum); \
This ends up being:
	int __ret, __val;
	__ret = read_poll_timeout(phy_read, __val, __val < 0 || (val & BMSR_LSTATUS),
		sleep_us, timeout_us, sleep_before_read, phydev, regnum);
and that expands to something that does this:
	__val = phy_read(phydev, regnum);
	if (__val < 0 || (val & BMSR_LSTATUS))
		break;
Can you spot the bug yet? Where does "val" for the test "val & BMSR_LSTATUS"
come from?
A bigger hint. With the existing code, this would have been:
	val = phy_read(phydev, regnum);
	if (val < 0 || (val & BMSR_LSTATUS))
		break;
See the difference? val & BMSR_LSTATUS is checking the value that was
returned from phy_read() here, but in yours, it's checking an
uninitialised variable.
With my proposal, this becomes:
	val = __val = phy_read(phydev, regnum);
	if (__val < 0 || (val & BMSR_LSTATUS))
		break;
where "val" is whatever type the user chose, which has absolutely _no_
bearing whatsoever on whether the test for __val < 0 can be correctly
evaluated, and makes that test totally independent of whatever type the
user chose.
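
For anyone who wants to poke at the three variants outside the kernel,
here is a minimal userspace sketch. It is not the kernel macro: the
POLL_* macros, fake_phy_read(), fake_result, the FAKE_* error numbers
and the demo_*() helpers are all made up for illustration, the loop
counts tries instead of sleeping, and val_ plays the role of __val.
The caller's variable is a uint16_t, which is the problematic case.

```c
#include <stdint.h>

/* Hypothetical stand-ins so the sketch builds outside the kernel. */
#define FAKE_EIO	5
#define FAKE_ETIMEDOUT	110
#define BMSR_LSTATUS	0x0004	/* link status bit, as in <linux/mii.h> */

static int fake_result;		/* value the fake read returns */

static int fake_phy_read(void)
{
	return fake_result;
}

/* Existing behaviour: the error test is done on the caller's variable. */
#define POLL_USER_VAR(ret, val, cond, tries) do {		\
	int tries_ = (tries);					\
	(ret) = -FAKE_ETIMEDOUT;				\
	while (tries_-- > 0) {					\
		(val) = fake_phy_read();			\
		if ((val) < 0 || (cond)) {			\
			(ret) = (val) < 0 ? (int)(val) : 0;	\
			break;					\
		}						\
	}							\
} while (0)

/* Local-only variant: reads into val_, never assigns the caller's val. */
#define POLL_LOCAL_ONLY(ret, val, cond, tries) do {		\
	int tries_ = (tries);					\
	int val_;						\
	(ret) = -FAKE_ETIMEDOUT;				\
	while (tries_-- > 0) {					\
		val_ = fake_phy_read();				\
		if (val_ < 0 || (cond)) {			\
			(ret) = val_ < 0 ? val_ : 0;		\
			break;					\
		}						\
	}							\
} while (0)

/* Assign both: error test on the int, cond on the caller's variable. */
#define POLL_BOTH(ret, val, cond, tries) do {			\
	int tries_ = (tries);					\
	int val_;						\
	(ret) = -FAKE_ETIMEDOUT;				\
	while (tries_-- > 0) {					\
		(val) = val_ = fake_phy_read();			\
		if (val_ < 0 || (cond)) {			\
			(ret) = val_ < 0 ? val_ : 0;		\
			break;					\
		}						\
	}							\
} while (0)

/* Read fails, caller's variable is unsigned, existing behaviour. */
static int demo_user_var_error(void)
{
	uint16_t val = 0;
	int ret;

	fake_result = -FAKE_EIO;
	POLL_USER_VAR(ret, val, val & BMSR_LSTATUS, 3);
	return ret;
}

/* Link is up, but cond tests a variable the macro never writes. */
static int demo_local_only_link_up(void)
{
	uint16_t val = 0;	/* initialised only to keep the demo defined */
	int ret;

	fake_result = BMSR_LSTATUS;
	POLL_LOCAL_ONLY(ret, val, val & BMSR_LSTATUS, 3);
	return ret;
}

/* Read fails, caller's variable is unsigned, assign-both variant. */
static int demo_both_error(void)
{
	uint16_t val = 0;
	int ret;

	fake_result = -FAKE_EIO;
	POLL_BOTH(ret, val, val & BMSR_LSTATUS, 3);
	return ret;
}

/* Link is up, assign-both variant. */
static int demo_both_link_up(void)
{
	uint16_t val = 0;
	int ret;

	fake_result = BMSR_LSTATUS;
	POLL_BOTH(ret, val, val & BMSR_LSTATUS, 3);
	return ret;
}
```

In this sketch demo_user_var_error() returns -FAKE_ETIMEDOUT rather
than -FAKE_EIO, because the negative value stored in the uint16_t can
never compare below zero. demo_local_only_link_up() also times out even
though the link bit is set, because cond never sees the value that was
read. Only the assign-both form reports -FAKE_EIO for the failed read
and 0 for the link-up case.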
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!