Message-ID: <c63033ff-ff47-4008-b7d3-f07d016496fa@lunn.ch>
Date: Tue, 17 Dec 2024 17:59:54 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
Maxime Chevallier <maxime.chevallier@...tlin.com>,
TRINH THAI Florent <florent.trinh-thai@...soprasteria.com>,
CASAUBON Jean Michel <jean-michel.casaubon@...soprasteria.com>
Subject: Re: [PATCH net] net: sysfs: Fix deadlock situation in sysfs accesses
On Tue, Dec 17, 2024 at 05:18:40PM +0100, Christophe Leroy wrote:
>
>
> On 17/12/2024 at 16:30, Andrew Lunn wrote:
> > On Tue, Dec 17, 2024 at 08:18:25AM +0100, Christophe Leroy wrote:
> > > The following problem is encountered on kernel built with
> > > CONFIG_PREEMPT. An snmp daemon running with normal priority is
> > > regularly calling ioctl(SIOCGMIIPHY).
> >
> > Why is an SNMP daemon using that IOCTL? What MAC driver is this? Is it
> > using phylib? For phylib, that IOCTL is supposed to be for debug only,
> > and is a bit of a foot gun. So I would not recommend it.
> >
>
> That's the well-known Net-SNMP package.
>
> See for instance https://github.com/net-snmp/net-snmp/blob/master/agent/mibgroup/if-mib/data_access/interface_linux.c#L954

That is pretty broken:

It assumes the PHY is using C22. Many PHYs nowadays are C45.

It assumes the PHY only supports up to 1G, whereas many PHYs nowadays
are > 1G.

It assumes the PHY is not an automotive PHY, which has its registers in
a different place.

Reading the BMSR can change the BMSR, so phylib is going to get
confused and miss link up/link down events.

There is no locking going on, so the PHY might be on a different page,
e.g. to read the temperature sensors, blink the LEDs, etc. The SNMP
daemon has no way to detect this, so it will be applying BMSR, BMCR,
etc. meaning to registers which are in fact not BMSR, BMCR, etc.

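To illustrate the BMSR point, here is a toy model in Python. It assumes the usual C22 behaviour where the link status bit latches low and a read clears the latch. The ToyPhy class and its methods are hypothetical and exist only for this sketch; this is not phylib code.

```python
BMSR_LSTATUS = 0x0004  # link status bit in the C22 BMSR (register 1)


class ToyPhy:
    """Toy model of a latched-low link status bit; not a real PHY driver."""

    def __init__(self):
        self.link_up = True
        self.latched_down = False

    def set_link(self, up):
        self.link_up = up
        if not up:
            self.latched_down = True  # remember the drop until the next read

    def read_bmsr(self):
        # A latched drop is reported once, then the read clears the latch.
        up = self.link_up and not self.latched_down
        self.latched_down = False
        return BMSR_LSTATUS if up else 0


phy = ToyPhy()
phy.set_link(False)  # link goes down ...
phy.set_link(True)   # ... and comes back before anyone polls

snmp_view = phy.read_bmsr()    # the outside reader consumes the latched drop
phylib_view = phy.read_bmsr()  # the next reader sees "link up", the flap is lost
```

In this model the first reader sees the link as down and the second sees it as up, so whichever reader comes second never learns that the link flapped.
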
This code needs throwing away and replacing with netlink sockets,
which is a much more abstract API: PHY independent, speed independent,
media independent, etc. That would also solve your deadlock.

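As a rough sketch of what a netlink query can look like, the Python below asks rtnetlink (RTM_GETLINK) for an interface's carrier and operstate without touching any PHY register. This is only one possible approach: speed and duplex would come from the ethtool netlink family, which is not shown here. The function names parse_link_attrs and query_link are made up for this example, and error handling is minimal.

```python
import socket
import struct

RTM_NEWLINK, RTM_GETLINK = 16, 18
NLM_F_REQUEST = 0x1
IFLA_OPERSTATE, IFLA_CARRIER = 16, 33
NLMSG_HDR = struct.Struct("=IHHII")   # struct nlmsghdr
IFINFOMSG = struct.Struct("=BBHiII")  # struct ifinfomsg
RTATTR = struct.Struct("=HH")         # struct rtattr
OPERSTATE = ["unknown", "notpresent", "down", "lowerlayerdown",
             "testing", "dormant", "up"]


def parse_link_attrs(payload):
    """Decode carrier and operstate from ifinfomsg plus its attributes."""
    info = {}
    off = IFINFOMSG.size
    while off + RTATTR.size <= len(payload):
        rta_len, rta_type = RTATTR.unpack_from(payload, off)
        if rta_len < RTATTR.size:
            break  # malformed attribute, stop parsing
        data = payload[off + RTATTR.size:off + rta_len]
        if rta_type == IFLA_CARRIER and data:
            info["carrier"] = bool(data[0])
        elif rta_type == IFLA_OPERSTATE and data and data[0] < len(OPERSTATE):
            info["operstate"] = OPERSTATE[data[0]]
        off += (rta_len + 3) & ~3  # attributes are padded to 4 bytes
    return info


def query_link(ifname):
    """Send one RTM_GETLINK request for ifname and parse the reply (Linux only)."""
    index = socket.if_nametoindex(ifname)
    body = IFINFOMSG.pack(socket.AF_UNSPEC, 0, 0, index, 0, 0)
    req = NLMSG_HDR.pack(NLMSG_HDR.size + len(body), RTM_GETLINK,
                         NLM_F_REQUEST, 1, 0) + body
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW,
                       socket.NETLINK_ROUTE) as s:
        s.bind((0, 0))
        s.send(req)
        reply = s.recv(65536)
    msg_len, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(reply)
    if msg_type != RTM_NEWLINK:
        raise OSError("unexpected netlink reply type %d" % msg_type)
    return parse_link_attrs(reply[NLMSG_HDR.size:msg_len])
```

Calling query_link("eth0") (the interface name is just an example) would return a small dict with "carrier" and "operstate" keys, as reported by the kernel rather than inferred from raw MII registers.
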
Andrew