[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7a5a9201-4c26-42f8-94f2-02763f26e8c1@lunn.ch>
Date: Thu, 27 Nov 2025 16:07:19 +0100
From: Andrew Lunn <andrew@...n.ch>
To: Oleksij Rempel <o.rempel@...gutronix.de>
Cc: Jakub Kicinski <kuba@...nel.org>,
Vladimir Oltean <vladimir.oltean@....com>,
Alexei Starovoitov <ast@...nel.org>,
Russell King <linux@...linux.org.uk>,
Eric Dumazet <edumazet@...gle.com>, Rob Herring <robh@...nel.org>,
Florian Fainelli <f.fainelli@...il.com>,
Donald Hunter <donald.hunter@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Jonathan Corbet <corbet@....net>,
John Fastabend <john.fastabend@...il.com>,
Lukasz Majewski <lukma@...x.de>,
Maxime Chevallier <maxime.chevallier@...tlin.com>,
Stanislav Fomichev <sdf@...ichev.me>,
Paolo Abeni <pabeni@...hat.com>, Jiri Pirko <jiri@...nulli.us>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Divya.Koppera@...rochip.com,
Kory Maincent <kory.maincent@...tlin.com>,
Vadim Fedorenko <vadim.fedorenko@...ux.dev>, netdev@...r.kernel.org,
Sabrina Dubroca <sd@...asysnail.net>, linux-kernel@...r.kernel.org,
kernel@...gutronix.de, Krzysztof Kozlowski <krzk+dt@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Heiner Kallweit <hkallweit1@...il.com>
Subject: Re: [PATCH net-next v8 1/1] Documentation: net: add flow control
guide and document ethtool API
> Haw about following wording:
> Kernel Policy: Administrative vs. Operational State
> ===================================================
>
> The ethtool pause API configures the **administrative state** of the network
> device. The **operational state** (the actual pause behavior active on the
> wire) depends on the active link mode and the link partner.
>
> The semantics of the configuration depend on the ``autoneg`` parameter:
>
> 1. **Autonegotiation Mode** (``autoneg on``)
> In this mode, the ``rx`` and ``tx`` parameters specify the **advertisement**
> (the "wish").
>
> - The driver configures the PHY to advertise these capabilities.
> - The actual Flow Control mode is determined by the standard resolution
> truth table (see "Link-wide PAUSE Autonegotiation Details") based on the
> link partner's advertisement.
> - ``get_pauseparam`` reports the advertisement policy, not the resolved
> outcome.
>
> 2. **Forced Mode** (``autoneg off``)
> In this mode, the ``rx`` and ``tx`` parameters constitute a direct
> **command** to the interface.
>
> - The system bypasses advertisement and forces the MAC into the specified
> configuration.
> - Drivers should reject configurations that the hardware cannot support in
> forced mode.
> - ``get_pauseparam`` reports the forced configuration.
There is one additional thing which plays into this, link
autonegotiation, ethtool -s autoneg on|off.
If link auto negotiation is on, you can then have both of the two
cases above, negotiated pause, or forced pause. If link auto
negotiation is off, you can only have forced mode. The text you have
below does however cover this. But this is one of the areas developers
get wrong, they don't consider how the link autoneg affects the pause
autoneg.
But i do agree that get_pauseparam is rather odd. It returns the
current configuration, not necessarily how the MAC hardware has been
programmed.
> **Common Constraints**
> Regardless of the mode, the following constraints apply:
>
> - Link-wide PAUSE is not valid on half-duplex links.
> - Link-wide PAUSE cannot be used together with Priority-based Flow Control
> (PFC).
>
>
> /**
> * ...
> * @get_pauseparam: Report the configured administrative policy for link-wide
> * PAUSE (IEEE 802.3 Annex 31B). Drivers must fill struct ethtool_pauseparam
> * such that:
> * @autoneg:
> * This refers to **Pause Autoneg** (IEEE 802.3 Annex 31B) only
> * and is part of the link autonegotiation process.
> * true -> the device follows the negotiated result of pause
> * autonegotiation (Pause/Asym);
> * false -> the device uses a forced configuration independent
> * of negotiation.
> * @rx_pause/@...pause:
> * represent the desired policy (administrative state).
> * In autoneg mode they describe what is to be advertised;
> * in forced mode they describe the MAC configuration to be forced.
> *
> * @set_pauseparam: Apply a policy for link-wide PAUSE (IEEE 802.3 Annex 31B).
> * @rx_pause/@...pause:
> * Desired state. If @autoneg is true, these define the
> * advertisement. If @autoneg is false, these define the
> * forced MAC configuration.
> * @autoneg:
> * Select autonegotiation or forced mode.
> *
> * **Constraint Checking:**
> * Drivers should reject a non-zero setting of @autoneg when
> * autonegotiation is disabled (or not supported) for the link.
> * Drivers should reject unsupported rx/tx combinations with -EINVAL.
I'm not so keen on this last little section. What we actually want is
the drivers use phylink, and let phylink implement all the 'business
logic'. phylink will then tell the MAC driver the two bits it needs to
program the hardware. phylink does all the validation, so all a MAC
driver needs to do is call phylink_ethtool_get_pauseparam() and
phylink_ethtool_set_pauseparam(). If we say the driver reject some
combinations, we might have developers implementing that before
calling phylink_ethtool_set_pauseparam(), which is pointless, and
maybe getting it wrong.
So i would prefer something more like:
* **Constraint Checking:**
* Ideally, drivers should simply call phylink_ethtool_get_pauseparam()
* and phylink_ethtool_set_pauseparam(). phylink will then perform
* all the needed validation, and perform all the actions based on
* the current **Pause Autoneg** and link Autoneg.
*
* If phylink is not being used, the driver most perform validation,
* reject a non-zero setting of @autoneg when autonegotiation is disabled
* (or not supported) for the link. Drivers should reject unsupported rx/tx
* combinations with -EINVAL.
> Open Questions:
>
> Pre-link Configuration (Administrative UP, Physical DOWN) How should drivers
> handle set_pauseparam when the link is physically down?
You can program the PHY/PCS with what you want it to negotiate. Once
the link comes up, you can then look if you are in a half duplex mode
when determining how to program the MAC hardware.
> Parallel Detection: If the link comes up later (e.g., as Half Duplex via
> parallel detection), a previously accepted "forced pause" configuration might
> become invalid. Should we block forced pause settings until the link is
> physically up?
Forced is forced. Forced is always a potential foot gun, since you can
end up with the link peers having different ideas about what is being
used on the link. autoneg of half duplex link is just one of the
scenarios where you gain a hole in your foot.
> State Persistence and Toggling When toggling autoneg (e.g., autoneg on -> off
> -> on), should the kernel or driver cache the previous advertisement?
This has been discussed in the past, and i _think_ phylink does.
But before we go too far into edge causes, my review experience is
that MAC drivers get the basics wrong. What we really want to do here
is:
1) Push driver developers towards phylink
2) For those who don't use phylink give clear documentation of the
basics.
We can look at edge cases, but i would only do it in the context of
phylink. Its one central implementation means we can add complexity
there and not overload developers who get the basics wrong.
Andrew
Powered by blists - more mailing lists