lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YofidJtb+kVtFr6L@lunn.ch>
Date:   Fri, 20 May 2022 20:48:20 +0200
From:   Andrew Lunn <andrew@...n.ch>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     netdev@...r.kernel.org, linux@...linux.org.uk, olteanv@...il.com,
        hkallweit1@...il.com, f.fainelli@...il.com, saeedm@...dia.com,
        michael.chan@...adcom.com
Subject: Re: [RFC net-next] net: track locally triggered link loss

> I was looking at bnxt because it's relatively standard for DC NICs and
> doesn't have 10M lines of code.. then again I could be misinterpreting
> the code, I haven't tested this theory:
> 
> In bnxt_set_pauseparam() for example the driver will send a request to
> the FW which will result in the link coming down and back up with
> different settings (e.g. when pause autoneg was changed). Since the
> driver doesn't call netif_carrier_off() explicitly as part of sending
> the FW message but the link down gets reported thru the usual interrupt
> (as if someone yanked the cable out) - we need to wrap the FW call with
> the __LINK_STATE_NOCARRIER_LOCAL

I'm not sure this is a good example. If the PHY is doing an autoneg,
the link really is down for around a second. The link peer will also
so the link go down and come back up. So this seems like a legitimate
time to set the carrier off and then back on again.

> > The driver has a few netif_carrier_off() calls changed to
> > netif_carrier_admin_off(). It is then unclear looking at the code
> > which of the calls to netif_carrier_on() match the off.
> 
> Right, for bnxt again the carrier_off in bnxt_tx_disable() would become
> an admin_carrier_off, since it's basically part of closing the netdev.

> > Maybe include a driver which makes use of phylib, which should be
> > doing control of the carrier based on the actual link status.
> 
> For phylib I was thinking of modifying phy_stop()... but I can't
> grep out where carrier_off gets called. I'll take a closer look.

If the driver is calling phy_stop() the link will go down. So again, i
would say setting the carrier off is correct. If the driver calls
phy_start() an auto neg is likely to happen and 1 second later the
link will come up.

Maybe i'm not understanding what you are trying to count here. If the
MAC driver needs to stop the MAC in order to reallocate buffers with
different MTU, or more rings etc, then i can understand not wanting to
count that as a carrier off, because the carrier does not actually go
off. But if it is in fact marking the carrier off, it sounds like a
MAC driver bug, or a firmware bug.

    Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ