[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0186263e-bcb6-d144-5e6d-23400179ca38@seco.com>
Date: Wed, 7 Sep 2022 18:39:34 -0400
From: Sean Anderson <sean.anderson@...o.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew@...n.ch>,
Heiner Kallweit <hkallweit1@...il.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Vladimir Oltean <olteanv@...il.com>,
Alexandru Marginean <alexandru.marginean@....com>,
linux-kernel@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next v5 1/8] net: phylink: Document MAC_(A)SYM_PAUSE
On 9/7/22 17:01, Russell King (Oracle) wrote:
> On Wed, Sep 07, 2022 at 04:11:14PM -0400, Sean Anderson wrote:
>> On 9/7/22 2:04 PM, Russell King (Oracle) wrote:
>>> MAC_SYM_PAUSE and MAC_ASYM_PAUSE are used when configuring our autonegotiation
>>> advertisement. They correspond to the PAUSE and ASM_DIR bits defined by 802.3,
>>> respectively.
>>
>> My intention here is to clarify the relationship between the terminology. Your
>> proposed modification has "our autonegotiation advertisement" apply to PAUSE/ASM_DIR
>> instead of MAC_*_PAUSE, which is also confusing, since those bits can apply to either
>> party's advertisement.
>
> Please amend to make it clearer.
Does what I proposed work?
>>>>>> + * MAC_SYM_PAUSE MAC_ASYM_PAUSE Valid pause modes
>>>>>> + * ============= ============== ==============================
>>>>>> + * 0 0 MLO_PAUSE_NONE
>>>>>> + * 0 1 MLO_PAUSE_NONE, MLO_PAUSE_TX
>>>>>> + * 1 0 MLO_PAUSE_NONE, MLO_PAUSE_TXRX
>>>>>> + * 1 1 MLO_PAUSE_NONE, MLO_PAUSE_TXRX,
>>>>>> + * MLO_PAUSE_RX
>>>>>
>>>>> Any of none, tx, txrx and rx can occur with both bits set in the last
>>>>> case, the tx-only case will be due to user configuration.
>>>>
>>>> What flow did you have in mind? According to the comment on linkmode_set_pause,
>>>> if ethtool requests tx-only, we will use MAC_ASYM_PAUSE for the advertising,
>>>> which is the second row above.
>>>
>>> I think you're missing some points on the ethtool interface. Let me
>>> go through it:
>>>
>>> First, let's go through the man page:
>>>
>>> autoneg on|off
>>> Specifies whether pause autonegotiation should be enabled.
>>>
>>> rx on|off
>>> Specifies whether RX pause should be enabled.
>>>
>>> tx on|off
>>> Specifies whether TX pause should be enabled.
>>>
>>> This is way too vague and doesn't convey very much inforamtion about
>>> the function of these options. One can rightfully claim that it is
>>> actually wrong and misleading, especially the first option, because
>>> there is no way to control whether "pause autonegotiation should be
>>> enabled." Either autonegotiation with the link partner is enabled
>>> or it isn't.
>>> Thankfully, the documentation of the field in struct
>>> ethtool_pauseparam documents this more fully:
>>>
>>> * If @autoneg is non-zero, the MAC is configured to send and/or
>>> * receive pause frames according to the result of autonegotiation.
>>> * Otherwise, it is configured directly based on the @rx_pause and
>>> * @tx_pause flags.
>>>
>>> So, autoneg controls whether the result of autonegotiation is used, or
>>> we override the result of autonegotiation with the specified transmit
>>> and receive settings.
>>>
>>> The next issue with the man page is that it doesn't specify that tx
>>> and rx control the advertisement of pause modes - and it doesn't
>>> specify how. Again, the documentation of struct ethtool_pauseparam
>>> helps somewhat:
>>>
>>> * If the link is autonegotiated, drivers should use
>>> * mii_advertise_flowctrl() or similar code to set the advertised
>>> * pause frame capabilities based on the @rx_pause and @tx_pause flags,
>>> * even if @autoneg is zero. They should also allow the advertised
>>> * pause frame capabilities to be controlled directly through the
>>> * advertising field of &struct ethtool_cmd.
>>>
>>> So:
>>>
>>> 1. in the case of autoneg=0:
>>> 1a. local end's enablement of tx and rx pause frames depends solely
>>> on the capabilities of the network adapter and the tx and rx
>>> parameters, ignoring the results of any autonegotiation
>>> resolution.
>>> 1b. the behaviour in mii_advertise_flowctrl() or similar code shall
>>> be used to derive the advertisement, which results in the
>>> tx=1 rx=0 case advertising ASYM_DIR only which does not tie up
>>> with what we actually end up configuring on the local end.
>>>
>>> 2. in the case of autoneg=1, the tx and rx parameters are used to
>>> derive the advertisement as in 1b and the results of
>>> autonegotiation resolution are used.
>>>
>>> The full behaviour of mii_advertise_flowctrl() is:
>>>
>>> ethtool local advertisement possible autoneg resolutions
>>> rx tx Pause AsymDir
>>> 0 0 0 0 !tx !rx
>>> 1 0 1 1 !tx !rx, !tx rx, tx rx
>>> 0 1 0 1 !tx !rx, tx !rx
>>> 1 1 1 0 !tx !rx, tx rx
>>>
>>> but as I say, the "possible autoneg resolutions" and table 28B-3
>>> is utterly meaningless when ethtool specifies "autoneg off" for
>>> the pause settings.
>>>
>>> So, "ethtool -A autoneg off tx on rx off" will result in an
>>> advertisement with PAUSE=0 ASYM_DIR=1 and we force the local side
>>> to enable transmit pause and disabel receive pause no matter what
>>> the remote side's advertisement is.
>>>
>>> I hope this clears the point up.
>>
>> My intent here is to provide some help for driver authors when they
>> need to fill in their mac capabilities. The driver author probably
>> knows things like "My device supports MLO_PAUSE_TX and MLO_PAUSE_TXRX
>> but not MLO_PAUSE_RX." They have to translate that into the correct
>> values for MAC_*_PAUSE. When the user starts messing with this process,
>> it's no longer the driver author's problem whether the result is sane
>> or not.
>
> Given that going from tx/rx back to pause/asym_dir bits is not trivial
> (because the translation depends on the remote advertisement) it is
> highly unlikely that the description would frame the support in terms
> of whether the hardware can transmit and/or receive pause frames.
I think it is? Usually if both symmetric and asymmetric pause is
possible then there are two PAUSE_TX and PAUSE_RX fields in a register
somewhere. Similarly, if there is only symmetric pause, then there is a
PAUSE_EN bit in a register. And if only one of TX and RX is possible,
then there will only be one field. There are a few drivers where you
program the advertisement and let the hardware do the rest, but even
then there's usually a manual mode (which should be enabled by the
poorly-documented permit_pause_to_mac parameter).
However, it is not obvious (at least it wasn't to me)
- That MAC_SYM_PAUSE/MAC_ASYM_PAUSE control to the PAUSE and ASYM_DIR
bits (when MLO_PAUSE_AN is set).
- How MAC_*_PAUSE related to the resolved pause mode in mac_link_up.
> Note from the table above, it is not possible to advertise that you
> do not support transmission of pause frames.
Just don't set either of MAC_*_PAUSE :)
Of course, hardware manufacturers are hopefully aware that only half of
the possible combinations are supported and don't produce hardware with
capabilities that can't be advertised.
>>
>> How about
>>
>>> The following table lists the values of tx_pause and rx_pause which
>>> might be requested in mac_link_up depending on the results of> autonegotiation (when MLO_PAUSE_AN is set):>
>>> MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
>>> ============= ============== ======== ========
>>> 0 0 0 0
>>> 0 1 0 0> 1 0
>>> 1 0 0 0
>>> 1 1> 1 1 0 0
>>> 0 1
>>> 1 1
>>>
>>> When MLO_PAUSE_AN is not set, any combination of tx_pause and> rx_pause may be requested. This depends on user configuration,
>>> without regard to the values of MAC_SYM_PAUSE and MAC_ASYM_PAUSE.
>
> The above is how I'm viewing this, and because of the broken formatting,
> it's impossible to make sense of, sorry.
Sorry, my mail client mangled it. Second attempt:
> MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
> ============= ============== ======== ========
> 0 0 0 0
> 0 1 0 0
> 1 0
> 1 0 0 0
> 1 1
> 1 1 0 0
> 0 1
> 1 1
>> Perhaps there should be a note either here or in mac_link_up documenting
>> what to do if e.g. the user requests just MLO_PAUSE_TX but only symmetric
>> pause is supported. In mvneta_mac_link_up we enable symmetric pause if
>> either tx_pause or rx_pause is requested.
>
> If the MAC only supports symmetric pause, the logic in phylink ensures
> that the MAC will always be called with tx_pause == rx_pause:
> - it will fail attempts by ethtool to set autoneg off with different rx
> and tx settings.
> - we will only advertise support for symmetric pause, for which there
> are only two autonegotiation outcomes, both of which satisfy the
> requirement that tx_pause == rx_pause.
>
> So, if a MAC only supports symmetric pause, it can key off either of
> these two flags as it is guaranteed that they will be identical in
> for a MAC that only supports symmetric pause.
OK, so taking that into account then perhaps the post-table explanation
should be reworded to
> When MLO_PAUSE_AN is not set and MAC_ASYM_PAUSE is set, any
> combination of tx_pause and rx_pause may be requested. This depends on
> user configuration, without regard to the value of MAC_SYM_PAUSE. When
> When MLO_PAUSE_AN is not set and MAC_ASYM_PAUSE is also unset, then
> tx_pause and rx_pause will still depend on user configuration, but
> will always equal each other.
Or maybe the above table should be extended to be
> MLO_PAUSE_AN MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
> ============ ============= ============== ======== ========
> 0 0 0 0 0
> 0 0 1 0 0
> 1 0
> 0 1 0 0 0
> 1 1
> 0 1 1 0 0
> 0 1
> 1 1
> 1 0 0 0 0
> 1 X 1 X X
> 1 1 0 0 0
> 1 1
With a note like
> When MLO_PAUSE_AN is not set, the values of tx_pause and rx_pause
> depend on user configuration. When MAC_ASYM_PAUSE is not set, tx_pause
> and rx_pause will be restricted to be either both enabled or both
> disabled. Otherwise, no restrictions are placed on their values,
> allowing configurations which would not be attainable as a result of
> autonegotiation.
IMO we should really switch to something like MAX_RX_PAUSE,
MAC_TX_PAUSE, MAC_RXTX_PAUSE and let phylink handle all the details of
turning that into sane advertisement. This would also let us return
-EINVAL in phylink_ethtool_set_pauseparam when the user requests e.g.
TX-only pause when the MAC only supports RX and RXTX.
> Adding in the issue of rate adaption (sorry, I use "adaption" not
> "adaptation" which I find rather irksome as in my - and apparently
> a subsection of English speakers, the two have slightly different
> meanings)
802.3 calls it "rate adaptation" in clause 49 (10GBASE-R) and "rate
matching" in clause 61 (10PASS-TL and 2BASE-TS). If you are opposed to
the former, then I think the latter could also work. It's also shorter,
which is definitely a plus.
Interestingly, wiktionary (with which I attempted to determine what that
slightly-different meaning was) labels "adaption" as "rare" :)
> brings with it the problem that when using pause frames,
> we need RX pause enabled, but on a MAC which only supports symmetric
> pause, we can't enable RX pause without also transmitting pause frames.
> So I would say such a setup was fundamentally mis-designed, and there's
> little we can do to correct such a stupidity. Should we detect such
> stupidities? Maybe, but what then? Refuse to function?
Previous discussion [1]. Right now we refuse to turn on rate adaptation
if the MAC only supports symmetric pause. The maximally-robust solution
would be to first try and autonegotiate with rate adaptation enabled and
using symmetric pause, and then renegotiate without either enabled. I
think that's significantly more complex, so I propose deferring such an
implementation to whoever first complains about their link not being
rate-adapted.
--Sean
[1] https://lore.kernel.org/netdev/4172fd87-8e51-e67d-bf86-fdc6829fa9b3@seco.com/
Powered by blists - more mailing lists