netdev - Re: [PATCH net] net: phylink: fix suspend/resume with WoL enabled and link down

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0Uc_O_5wMQOG66PS2Dc2Bn3WZ_vtw2tZV8He=EU9m5LsjQ@mail.gmail.com>
Date: Thu, 17 Apr 2025 12:49:07 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Andrew Lunn <andrew@...n.ch>, Heiner Kallweit <hkallweit1@...il.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Joakim Zhang <qiangqing.zhang@....com>, netdev@...r.kernel.org, 
	Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net] net: phylink: fix suspend/resume with WoL enabled and
 link down

On Thu, Apr 17, 2025 at 10:27 AM Russell King (Oracle)
<linux@...linux.org.uk> wrote:
>
> On Thu, Apr 17, 2025 at 10:06:45AM -0700, Alexander Duyck wrote:
> > On Thu, Apr 17, 2025 at 8:23 AM Russell King (Oracle)
> > <linux@...linux.org.uk> wrote:
> > >
> > > On Thu, Apr 17, 2025 at 03:35:56PM +0100, Russell King (Oracle) wrote:

[...]

> > > So, we have the case where we want to avoid notifying the MAC that the
> > > link has gone down, but we also want to call netif_carrier_off() to stop
> > > the network stack queueing packets.
> > >
> > > To achieve this, several things work in unison:
> > >
> > > - we take the state mutex to prevent the resolver from running while we
> > >   fiddle with various state.
> > > - we disable the resolver (which, if that's the only thing that happens,
> > >   will provoke the resolver to take the link down.)
> > > - we lie to the resolver about the link state, by calling
> > >   netif_carrier_off() and/or setting old_link_state to false. This
> > >   means the resolver believes the link is _already_ down, and thus
> > >   because of the strict ordering I've previously mentioned, will *not*
> > >   call mac_link_down().
> > > - we release the lock.
> > >
> > > There is now no way that the resolver will call either mac_link_up() or
> > > mac_link_down() - which is what we want here.
> >
> > The part that I think I missed here was that if we set
> > phylink_disable_state we then set link_state.link to false in
> > phylink_resolve. I wonder if we couldn't just have a flag that sets
> > cur_link_state to false in the "if(pl->phylink_disable_state)" case
> > and simplify this to indicate we won't force the link down"
>
> Then how does phylink_stop() end up calling .mac_link_down() ?

What I was saying is that we could add an additional state flag that
we set before you write to the phylink_disable_state. You would
essentially set the state to true if you want to preserve the current
state, and if it is true you would set cur_link_statte to false in
phylink_resolve ignoring the actual current link state.

So in phylink_stop you would set it to false, and in phylink_suspend
you would set it to true. With that change phylink_stop could force
the link down, whereas phylink_suspend would keep it up,
phylink_suspend could deal with netif_carrier_off, and phylink_resume
could deal with old_link_state.

Anyway it wasn't as if I was saying we need to make that change. I
have a bad habit of thinking out loud and it tends to carry over into
emails..

> > So fbnic_mac_link_up_asic doesn't trigger any issues. The issues are:
> >
> > 1. In fbnic_mac_link_down_asic we assume that if we are being told
> > that the link is down by phylink that it really means the link is down
> > and we need to shut off the MAC and flush any packets that are in the
> > Tx FIFO. The issue isn't so much the call itself, it is the fact that
> > we are being called in phylink_resume to rebalance the link that will
> > be going from UP->UP. The rebalancing is introducing an extra down
> > step. This could be resolved by adding an extra check to the line in
> > phylink_resume that is calling the mac_link_down so that it doesn't
> > get triggered and stall the link. In my test code I am now calling it
> > "pl->rolling_start".
>
> You're not addessing the issue I've already stated here.
>
> If the link was up, and we *don't* call mac_link_down(), we then *can*
> *not* call phylink_mac_initial_config(). It's as simple as that. The
> MAC must see link down before being configured.
>
> So, if the link was up, and we don't call mac_link_down() then we must
> also *not* call phylink_mac_initial_config(). I've no idea what will
> break with that change.

Sorry, mentioning it didn't occur to me as I have been dealing with
the "rolling start" since the beginning. In mac_prepare I deal with
this. Specifically the code in mac_prepare will check to see if the
link state is currently up with the desired configuration already or
not. If it is, it sets a flag that will keep us from doing any changes
that would be destructive to the link. If the link is down it
basically clears the way for a full reinitialization.

> > 2. In phylink_start/phylink_resume since we are coming at this from a
> > "rolling start" due to the BMC we have the MAC up and the
> > netif_carrier in the "off" state. As a result if we end up losing the
> > link between mac_prepare and pcs_get_state the MAC gets into a bad
> > state where netif_carrier is off, the MAC is stuck in the "up" state,
> > and we have stale packets clogging up the Tx FIFO.
> >
> > As far as the BMC it isn't so much wanting to hit the big red button
> > as our platform team. Basically losing of packets is very problematic
> > for them, think about ssh sessions that suddenly lock up during boot,
> > and they can demonstrate that none of the other vendors that have the
> > MAC/PCS/PHY buried in their firmware have this issue. I was really
> > hoping to avoid going that route as the whole point of this was to
> > keep the code open source and visible.
>
> The problem I have is not the idea, but the implementation. You want
> to change the phylink behaviour in a way that affects *all* existing
> users. I don't want that because of the guarantees of that contract
> I've previously stated that I've given to existing users.

That is why I mentioned "renegotiating the terms" in my other email. I
would be okay with adding a phylink_config variable for the
"rolling_start".

Essentially with "rolling_start" we would have:
1. phylink_resume would not call phylink_link_down to rebalance the state
2. phylink_resolve would have to call mac_link_up or mac_link_down
following the first call after phylink_start/resume.
3. mac_prepare would have to take on the responsibility of doing
mac_link_down prior to any destructive configuration change.
4. Calls to mac_link_up/mac_link_down would be idempotent essentially
meaning that calling up on a link that is already up has no effect
(other than maybe updating autoneg settings), and calling down on a
down link likewise would have no effect.

> As things currently stand, you have a currently unique new case, and
> you're trying to force your solution on all users potentially breaking
> them. I have no real issue with supporting the new case, but *not* at
> the expense of regressing existing phylink users.
>
> Yet, when I point out this, you seem to be dead against *not* affecting
> other users. This is where the problem is, and where we fundamentally
> disagree.

I am not sure where that impression came from. I realized after you
mentioned it that the change was too broad and I had mentioned
limiting it with a config option on the other thread.