[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aAACyQ494eO4LFQD@shell.armlinux.org.uk>
Date: Wed, 16 Apr 2025 20:19:37 +0100
From: "Russell King (Oracle)" <linux@...linux.org.uk>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: netdev@...r.kernel.org, andrew@...n.ch, hkallweit1@...il.com,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com
Subject: Re: [net-next PATCH 2/2] net: phylink: Fix issues with link
balancing w/ BMC present
On Wed, Apr 16, 2025 at 12:03:05PM -0700, Alexander Duyck wrote:
> On Wed, Apr 16, 2025 at 9:01 AM Russell King (Oracle)
> <linux@...linux.org.uk> wrote:
> >
> > On Wed, Apr 16, 2025 at 08:29:00AM -0700, Alexander Duyck wrote:
> > > From: Alexander Duyck <alexanderduyck@...com>
> > >
> > > This change is meant to address the fact that there are link imbalances
> > > introduced when using phylink on a system with a BMC. Specifically there
> > > are two issues.
> > >
> > > The first issue is that if we lose link after the first call to
> > > phylink_start but before it gets to the phylink_resolve we will end up with
> > > the phylink interface assuming the link was always down and not calling
> > > phylink_link_down resulting in a stuck interface.
> >
> > That is intentional.
> >
> > phylink strictly orders .mac_link_down and .mac_link_up, and starts from
> > an initial position that the link _will_ be considered to be down. So,
> > it is intentional that .mac_link_down will _never_ be called after
> > phylink_start().
>
> Well the issue is that with a BMC present the link may be up before we
> even start using phylink. So if the link is lost while we are going
> through phylink_start we will end up in an in-between state where the
> link is physically down, but the MAC is still configured as though the
> link is up. This will be problematic as the MAC should essentially be
> discarding frames for transmit if the link is down to avoid blocking
> internal Tx FIFOs.
So, when a Linux network driver probes, it starts out in administrative
state *DOWN*. When the administrator configures the network driver,
.ndo_open is called, which is expected to configure the network adapter.
Part of that process is to call phylink_start() as one of the last
steps, which detects whether the link is up or not. If the link is up,
then phylink will call netif_carrier_on() and .mac_link_up(). This
tells the core networking layers that the network interface is now
ready to start sending packets, and it will begin queueing packets for
the network driver to process - not before.
Prior to .ndo_open being called, the networking layers do not expect
traffic from the network device no matter what physical state the
media link is in. If .ndo_open fails, the same applies - no traffic is
expected to be passed to the core network layers from the network
layers because as far as the network stack is concerned, the interface
is still administratively down.
Thus, the fact that your BMC thinks that the link is up is irrelevant.
So, start off in a state that packet activity is suspended even if the
link is up at probe time. Only start packet activity (reception and
transmission) once .mac_link_up() has been called. Stop that activity
when .mac_link_down() is subsequently called.
There have been lots of NICs out there where the physical link doesn't
follow the adminstrative state of the network interface. This is not a
problem. It may be desirable that it does, but a desire is not the same
as "it must".
> > > The second issue is that when a BMC is present we are currently forcing the
> > > link down. This results in us bouncing the link for a fraction of a second
> > > and that will result in dropped packets for the BMC.
> >
> > ... but you don't explain how that happens.
>
> It was right there in the patch. It was the lines I removed:
... thus further breaking phylink guarantees.
Sorry, but no.
> The issue, even with your recent patch, is that it will still force
> the link down if the link was previously up. That is the piece I need
> to avoid to prevent the BMC from losing link. Ideally what I need is
> to have a check of the current link state and then sync back up rather
> than force the phylink state on the MAC and then clean things up after
> the fact.
So don't force the link, just stop packet activity. As stated above,
nothing requires that the physical link is forced down just because
.mac_link_down() has been called.
This makes me wonder what happens to your BMC with your ideas if
userspace takes down the network interface. It sounds like all hell
breaks loose because you've taken the link down, and that's seen as
a critical failure... So taking down a network interface becomes a
critical failure - yet it's a *normal* userspace operation.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Powered by blists - more mailing lists