lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 18 Nov 2022 13:16:00 -0500
From:   Sean Anderson <sean.anderson@...o.com>
To:     "Russell King (Oracle)" <linux@...linux.org.uk>
Cc:     Tim Harvey <tharvey@...eworks.com>,
        netdev <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>
Subject: Re: status of rate adaptation

On 11/14/22 11:13, Russell King (Oracle) wrote:
> On Mon, Nov 14, 2022 at 10:33:52AM -0500, Sean Anderson wrote:
>> On 11/12/22 08:15, Russell King (Oracle) wrote:
>> > On Fri, Nov 11, 2022 at 04:54:40PM -0500, Sean Anderson wrote:
>> >> > [    8.911932] mvpp2 f2000000.ethernet eth0: PHY
>> >> > [f212a600.mdio-mii:08] driver [Aquantia AQR113C] (irq=POLL)
>> >> > [    8.921577] mvpp2 f2000000.ethernet eth0: phy: 10gbase-r setting
>> >> > supported 00000000,00018000,000e706f advertising
>> >> > 00000000,00018000,000e706f
>> > 
>> >> > # ethtool eth0
>> >> > Settings for eth0:
>> >> >         Supported ports: [ ]
>> >> >         Supported link modes:   10baseT/Half 10baseT/Full
>> >> >                                 100baseT/Half 100baseT/Full
>> >> 
>> >> 10/100 half duplex aren't achievable with rate matching (and we avoid
>> >> turning them on), so they must be coming from somewhere else. I wonder
>> >> if this is because PHY_INTERFACE_MODE_SGMII is set in
>> >> supported_interfaces.
>> > 
>> > The reason is due to the way phylink_bringup_phy() works. This is
>> > being called with interface = 10GBASE-R, and the PHY is a C45 PHY,
>> > which means we call phy_get_rate_matching() with 
>> > PHY_INTERFACE_MODE_NA as we don't know whether the PHY will be
>> > switching its interface or not.
>> > 
>> > Looking at the Aquanta PHY driver, this will return that pause mode
>> > rate matching will be used, so config.rate_matching will be
>> > RATE_MATCH_PAUSE.
>> > 
>> > phylink_validate() will be called for PHY_INTERFACE_MODE_NA, which
>> > causes it to scan all supported interface modes (as again, we don't
>> > know which will be used by the PHY [*]) and the union of those
>> > results will be used.
>> > 
>> > So when we e.g. try SGMII mode, caps & mac_capabilities will allow
>> > the half duplex modes through.
>> > 
>> > Now for the bit marked with [*] - at this point, if rate matching is
>> > will be used, we in fact know which interface mode is going to be in
>> > operation, and it isn't going to change. So maybe we need this instead
>> > in phylink_bringup_phy():
>> > 
>> > -	if (phy->is_c45 &&
>> > +	config.rate_matching = phy_get_rate_matching(phy, interface);
>> > +	if (phy->is_c45 && config.rate_matching == RATE_MATCH_NONE &&
>> >             interface != PHY_INTERFACE_MODE_RXAUI &&
>> >             interface != PHY_INTERFACE_MODE_XAUI &&
>> >             interface != PHY_INTERFACE_MODE_USXGMII)
>> >                 config.interface = PHY_INTERFACE_MODE_NA;
>> >         else
>> >                 config.interface = interface;
>> > -	config.rate_matching = phy_get_rate_matching(phy, config.interface);
>> > 
>> >         ret = phylink_validate(pl, supported, &config);
>> > 
>> > ?
>> 
>> Yeah, that sounds reasonable. Actually, this was the logic I was
>> thinking of when I asked Tim to try USXGMII earlier. The funny thing is
>> that the comment above this implies that the link mode is never actually
>> (R)XAUI or USXGMII.
> 
> I think you're misunderstanding the comment...
> 
> If a clause 45 PHY is using USXGMII, then it is highly likely that the
> PHY will not switch between different interface modes depending on the
> media side negotiation.
> 
> If a clause 45 PHY is using RXAUI or XAUI, then I believe according to
> the information available to me at the time, that there is no
> possibility of different interface modes being used.
> 
> If any other interface type is specified (e.g. 10GBASE-R etc) then there
> is the possibility that the PHY will be switching between different
> interface modes, and we have no idea what so ever at this point what
> modes the PHY will be making use of - so the best we can do is to
> validate _all_ possible modes. This is what is done by setting the
> interface mode to _NA.
> 
> Obviously, if we are using rate matching with a particular interface
> mode (e.g. 10GBASE-R) then we know that we are only going to be using
> 10GBASE-R, so we can validate just the single interface mode.

Ah, you're right, I was reading this backwards.

>> On another subject, if setting the SERDES mode field above fixes the
>> issue, then the Aquantia driver should be modified to set that field to
>> use a supported interface. Will host_interfaces work for this? It seems
>> to be set only when there's an SFP module.
> 
> The reason I didn't push host_interfaces upstream myself is that I was
> unconvinced that it was the proper approach - and I still have my
> reservations with it. This can only tell the PHY driver what the MAC
> driver supports, and it means the PHY driver is then free to do its
> own choosing of what group of interface modes it wants to use.

Well, this is what we have already. The Aquantia phys are initialized by
the firmware to use particular interface for a particular link speed.
Rate adaptation may or may not be involved. If it picks an interface the
MAC doesn't support, you're SOL (until you get a new firmware). Ideally,
we'd be able to configure the phy to always select an interface the MAC
supports.

> However, think about what I've said above about phylink not having any
> clue about what interface modes the PHY is going to be using - having
> the PHY driver decide on its own which group of interface modes should
> be used adds even more complexity in a completely different chunk of
> code, one where driver authors are free to make whatever decisions
> they deem (and we know that wildly different solutions will happen.)
> 
> I had been toying with the idea of doing this differently, and had
> dropped most of the host_interfaces approach from my git tree, instead
> having PHYs provide a bitmap of the interface modes they support and
> having them initialise in their config_init function which interface
> modes they're going to be making use of given their resulting
> configuration. I never properly finished this though.

That sounds like a good start.

>> That said, imagine if Tim was using a MAC without pause support, but
>> which supported SGMII and 10GBASE-R. Currently, we would just advertise
>> 10G modes. But 1G could be supported by switching the phy interface.
> 
> Note that we already have boards that make use of interface switching.
> Macchiatobin has switched between 10GBASE-R, 5GBASE-R, 2500BASE-X and
> SGMII depending on the negotiated media speed. In fact, that switching
> is rather enforced by the 3310 PHY firmware.
> 
> We could force 10GBASE-R and enable rate matching, but then we get
> into the problems that the 3310 on these boards does not have MACSEC
> therefore can't send pause frames to the host MAC (and the host MAC
> doesn't support pause frames - eek) and we have not come up with an
> implementation for extending the IPG, although I believe mvpp2
> hardware is capable of it.

The DPAA hardware is as well. As I understand it, the 10GBASE-W standard
specifies linear scaling of the IPG with the packet size, not A simple
implementation could be to have MACs expose something like

	mac_rate_match_ipg_numerator,
	mac_rate_match_ipg_denominator,

With the intention that they'd do something like

	numerator = SPEED_10000,
	denominator = 9500,

and then we could multiply the speed to adjust. We could also do
something like

	int mac_minimum_speed(int base_speed);

But AIUI we are trying to move away from these sorts of things.

FWIW I don't have any 10GBASE-W hardware.

--Sean

> However, there's also the BCM84881 PHY which does the same dynamic
> switching which we can't prevent (we don't know how to!)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ