Message-ID: <aAfSMh_kNre5mxyT@shell.armlinux.org.uk>
Date: Tue, 22 Apr 2025 18:30:26 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Andrew Lunn <andrew@...n.ch>
Cc: Jakub Kicinski <kuba@...nel.org>,
Alexander Duyck <alexander.duyck@...il.com>, netdev@...r.kernel.org,
hkallweit1@...il.com, davem@...emloft.net, pabeni@...hat.com
Subject: Re: [net-next PATCH 0/2] net: phylink: Fix issue w/ BMC link flap
On Tue, Apr 22, 2025 at 06:49:54PM +0200, Andrew Lunn wrote:
> > > The whole concept of a multi-host NIC is new to me. So i at least need
> > > to get up to speed with it. I've no idea if Russell has come across it
> > > before, since it is not a SoC concept.
> > >
> > > I don't really want to agree to anything until i do have that concept
> > > understood. That is part of why i asked about a standard. It is a
> > > dense document answering a lot of questions. Without a standard, i
> > > need to ask a lot of questions.
> >
> > Don't hesitate to ask the questions, your last reply contains no
> > question marks :)
>
> O.K. Let's start with the basics. I assume the NIC has a PCIe connector
> something like a 4.0 x4? Each of the four hosts in the system
> contribute one PCIe lane. So from the host side it looks like a 4.0 x1
> NIC?
>
> There are not 4 host MACs connected to a 5 port switch. Rather, each
> host gets its own subset of queues, DMA engines etc, for one shared
> MAC. Below the MAC you have all the usual PCS, SFP cage, gpios, I2C
> bus, and blinky LEDs. Plus you have the BMC connected via an RMII like
> interface.
>
> You must have a minimum of firmware on the NIC to get the MAC into a
> state the BMC can inject/receive frames, configure the PCS, gpios to
> the SFP, enough I2C to figure out what the module is, what quirks are
> needed etc.
This all makes sense, but at this point, I have to ask something that
seems to be fundamental to me:

Should any of the hosts accessing the NIC through those PCIe x1
interfaces have any knowledge or control of anything behind "their"
view of the MAC?

I would say no, they should not, because if they do, they can interfere
with other hosts. Surely only the BMC should have permission to access
the layers of hardware behind the MAC?

What should a host know about the setup? Maybe the speed of their
network connection through the MAC. I state it that way rather than
"the speed of the media" because if there is some control over the
traffic from each "host" then the media speed is irrelevant.
> NC-SI, with Linux controlling the hardware, implies you need to be
> able to hand off control of the GPIOs, I2C, PCS to Linux. But with
> multi-host, it makes no sense for all 4 hosts to be trying to control
> the GPIOs, I2C, PCS, perform SFP firmware upgrade. So it seems more
> likely to me, one host gets put in charge of everything below the
> queues to the MAC. The others just know there is link, nothing more.
Ouch. Yes - if we have four independent hosts all trying to access the
same I2C hardware, then that sounds like a recipe for a trainwreck.
> This actually circles back to the discussion about fixed-link. The one
> host in control of all the lower hardware has the complete
> picture. The other 3 maybe just need a fixed link. They don't get to
> see what is going on below the MAC, and as a result there is no
> ethtool support to change anything, and so no conflicting
> configuration? And since they cannot control any of that, they cannot
> put the link down. So 3/4 of the problem is solved.
Should one host have control, or should the BMC have control? I don't
actually know what you're talking about w.r.t. DSP0222 or whatever it
was, nor NC-SI - I don't have these documents.
> phylink is however not expecting that when phylink_start() is called,
> it might or might not have to drive the hardware depending on if it
> wins an election to control the hardware. And if it loses, it needs
> to ditch all its configuration for a PCS, SFP, etc and swap to a
> fixed-link. Do we want to teach phylink all this, or put all phylink
> stuff into open(), rather than spread across probe() and open()? Being
> in open(), you basically construct a different phylink configuration
> depending on if you win the election or not.
That sounds very complicated and all very new stuff.
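
To make the proposal quoted above concrete, here is a rough model of
"construct a different configuration in open() depending on the election
result". This is only a sketch: LinkConfig, build_link_config() and the
"managed"/"fixed-link" labels are hypothetical names for illustration and
are not phylink API, and how the election itself is decided is not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkConfig:
    """Hypothetical description of what one host configures at open() time."""
    mode: str                  # "managed" or "fixed-link" (illustrative labels)
    speed_mbps: Optional[int]  # only set for the fixed-link case
    controls_pcs: bool         # may this host touch the PCS?
    controls_sfp: bool         # may this host touch the SFP (GPIOs, I2C)?


def build_link_config(won_election: bool, fixed_speed_mbps: int) -> LinkConfig:
    """Pick the configuration for this host based on the election result."""
    if won_election:
        # The winning host drives everything below the MAC.
        return LinkConfig("managed", None, True, True)
    # Losing hosts only see a fixed link and control nothing below the MAC.
    return LinkConfig("fixed-link", fixed_speed_mbps, False, False)
```

With four hosts, exactly one would get the "managed" configuration and
the other three the "fixed-link" one, so only one host can ever change
the PCS or SFP settings.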
> Is one host in the position to control the complete media
> configuration? Could you split the QSFP into four, each host gets its
> own channel, and it gets to choose how to use that channel, different
> FEC schemes, bit rates?
Yes, each channel in a QSFP has separate LOS status bits accessible
over I2C. It's been a while since I looked at this, but I seem to
remember there aren't hardware pins for LOS, TX_DISABLE etc - that's
all over I2C.
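
For illustration, a minimal sketch of splitting one LOS status byte into
per-channel flags. The layout used here (lower page byte 3, RX LOS in
bits 0-3 and TX LOS in bits 4-7, one bit per channel) is recalled from
SFF-8636 and should be treated as an assumption to check against the
spec. decode_qsfp_los() is a hypothetical helper name, and the I2C read
that would fetch the byte is not shown.

```python
# Assumed offset of the latched LOS indicators in the QSFP lower page;
# verify against SFF-8636 before relying on it.
QSFP_LOS_BYTE = 3


def decode_qsfp_los(los_byte: int) -> dict:
    """Split one LOS status byte into RX/TX flags for channels 1-4."""
    if not 0 <= los_byte <= 0xFF:
        raise ValueError("expected a single byte")
    return {
        ch + 1: {
            "rx_los": bool(los_byte & (1 << ch)),        # bits 0-3 (assumed)
            "tx_los": bool(los_byte & (1 << (ch + 4))),  # bits 4-7 (assumed)
        }
        for ch in range(4)
    }


# Example: RX LOS asserted on channels 1 and 3, nothing on TX.
status = decode_qsfp_los(0b0000_0101)
```

Because each channel has its own bit, a host that owns a single channel
would only need to look at its own entry in the result.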
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!