Message-ID: <20240820123456.qbt4emjdjg5pouym@skbuf>
Date: Tue, 20 Aug 2024 15:34:56 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Furong Xu <0x1207@...il.com>
Cc: Serge Semin <fancer.lancer@...il.com>, Andrew Lunn <andrew@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Alexandre Torgue <alexandre.torgue@...s.st.com>,
Jose Abreu <joabreu@...opsys.com>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
Joao Pinto <jpinto@...opsys.com>, netdev@...r.kernel.org,
linux-stm32@...md-mailman.stormreply.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
xfr@...look.com
Subject: Re: [PATCH net-next v4 3/7] net: stmmac: refactor FPE verification process
On Tue, Aug 20, 2024 at 05:38:31PM +0800, Furong Xu wrote:
> Drop driver defined stmmac_fpe_state, and switch to common
> ethtool_mm_verify_status for local TX verification status.
>
> Local side and remote side verification processes are completely
> independent. There is no reason at all to keep a local state and
> a remote state.
>
> Add a spinlock to avoid races among ISR, workqueue, link update
> and register configuration.
>
> Signed-off-by: Furong Xu <0x1207@...il.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac.h | 21 +--
> .../net/ethernet/stmicro/stmmac/stmmac_main.c | 172 ++++++++++--------
> .../net/ethernet/stmicro/stmmac/stmmac_tc.c | 6 -
> 3 files changed, 102 insertions(+), 97 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> index 458d6b16ce21..407b59f2783f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> @@ -146,14 +146,6 @@ struct stmmac_channel {
> u32 index;
> };
>
> -/* FPE link state */
> -enum stmmac_fpe_state {
> - FPE_STATE_OFF = 0,
> - FPE_STATE_CAPABLE = 1,
> - FPE_STATE_ENTERING_ON = 2,
> - FPE_STATE_ON = 3,
> -};
> -
> /* FPE link-partner hand-shaking mPacket type */
> enum stmmac_mpacket_type {
> MPACKET_VERIFY = 0,
> @@ -166,11 +158,16 @@ enum stmmac_fpe_task_state_t {
> };
>
> struct stmmac_fpe_cfg {
> - bool enable; /* FPE enable */
> - bool hs_enable; /* FPE handshake enable */
> - enum stmmac_fpe_state lp_fpe_state; /* Link Partner FPE state */
> - enum stmmac_fpe_state lo_fpe_state; /* Local station FPE state */
> + /* Serialize access to MAC Merge state between ethtool requests
> + * and link state updates.
> + */
> + spinlock_t lock;
> +
> u32 fpe_csr; /* MAC_FPE_CTRL_STS reg cache */
> + u32 verify_time; /* see ethtool_mm_state */
> + bool pmac_enabled; /* see ethtool_mm_state */
> + bool verify_enabled; /* see ethtool_mm_state */
> + enum ethtool_mm_verify_status status;
> };
>
> struct stmmac_tc_entry {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 3072ad33b105..6ae95f20b24f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -969,17 +969,21 @@ static void stmmac_mac_config(struct phylink_config *config, unsigned int mode,
> static void stmmac_fpe_link_state_handle(struct stmmac_priv *priv, bool is_up)
> {
> struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
> - enum stmmac_fpe_state *lo_state = &fpe_cfg->lo_fpe_state;
> - enum stmmac_fpe_state *lp_state = &fpe_cfg->lp_fpe_state;
> - bool *hs_enable = &fpe_cfg->hs_enable;
> + unsigned long flags;
> +
> + spin_lock_irqsave(&priv->fpe_cfg.lock, flags);
> +
> + if (!fpe_cfg->pmac_enabled)
> + goto __unlock_out;
>
> - if (is_up && *hs_enable) {
> + if (is_up && fpe_cfg->verify_enabled)
> stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
> MPACKET_VERIFY);
> - } else {
> - *lo_state = FPE_STATE_OFF;
> - *lp_state = FPE_STATE_OFF;
> - }
> + else
> + fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
The fpe_task may already be scheduled at this point. Once you unlock, it
may run and overwrite the fpe_cfg->status you've just set.
That said, I don't recommend setting ETHTOOL_MM_VERIFY_STATUS_DISABLED
at all unless fpe_cfg->verify_enabled is false.
> +
> +__unlock_out:
> + spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
> }
>
> static void stmmac_mac_link_down(struct phylink_config *config,
> @@ -4091,11 +4095,25 @@ static int stmmac_release(struct net_device *dev)
>
> stmmac_release_ptp(priv);
>
> - pm_runtime_put(priv->device);
> -
> - if (priv->dma_cap.fpesel)
> + if (priv->dma_cap.fpesel) {
> stmmac_fpe_stop_wq(priv);
>
> + /* stmmac_ethtool_ops.begin() guarantees that all ethtool
> + * requests to fail with EBUSY when !netif_running()
> + *
> + * Prepare some params here, then fpe_cfg can keep consistent
> + * with the register states after a SW reset by __stmmac_open().
> + */
> + priv->fpe_cfg.pmac_enabled = false;
> + priv->fpe_cfg.verify_enabled = false;
> + priv->fpe_cfg.status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
> +
> + /* Reset MAC_FPE_CTRL_STS reg cache */
> + priv->fpe_cfg.fpe_csr = 0;
> + }
With this block of code, you're saying that you're deliberately okay with
the ethtool-mm state being lost after a stmmac_release() call. Mind you,
some of the call sites of stmmac_release() are:
- stmmac_change_mtu()
- stmmac_reinit_queues()
- stmmac_reinit_ringparam()
I disagree that it's okay to lose the state configured by user space.
Instead, you should keep the saved state and reprogram it into the
hardware once it has been lost.
Note that because stmmac_release() calls phylink_stop(), I think that
restoring the state in stmmac_fpe_link_state_handle() is enough, because
there will always be a link drop.
> +
> + pm_runtime_put(priv->device);
> +
> return 0;
> }
>
> @@ -5979,44 +5997,34 @@ static int stmmac_set_features(struct net_device *netdev,
> static void stmmac_fpe_event_status(struct stmmac_priv *priv, int status)
> {
> struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
> - enum stmmac_fpe_state *lo_state = &fpe_cfg->lo_fpe_state;
> - enum stmmac_fpe_state *lp_state = &fpe_cfg->lp_fpe_state;
> - bool *hs_enable = &fpe_cfg->hs_enable;
>
> - if (status == FPE_EVENT_UNKNOWN || !*hs_enable)
> - return;
> + spin_lock(&priv->fpe_cfg.lock);
>
> - /* If LP has sent verify mPacket, LP is FPE capable */
> - if ((status & FPE_EVENT_RVER) == FPE_EVENT_RVER) {
> - if (*lp_state < FPE_STATE_CAPABLE)
> - *lp_state = FPE_STATE_CAPABLE;
> + if (!fpe_cfg->pmac_enabled || status == FPE_EVENT_UNKNOWN)
> + goto __unlock_out;
>
> - /* If user has requested FPE enable, quickly response */
> - if (*hs_enable)
> - stmmac_fpe_send_mpacket(priv, priv->ioaddr,
> - fpe_cfg,
> - MPACKET_RESPONSE);
> - }
> + /* LP has sent verify mPacket */
> + if ((status & FPE_EVENT_RVER) == FPE_EVENT_RVER)
> + stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
> + MPACKET_RESPONSE);
>
> - /* If Local has sent verify mPacket, Local is FPE capable */
> - if ((status & FPE_EVENT_TVER) == FPE_EVENT_TVER) {
> - if (*lo_state < FPE_STATE_CAPABLE)
> - *lo_state = FPE_STATE_CAPABLE;
> - }
> + /* Local has sent verify mPacket */
> + if ((status & FPE_EVENT_TVER) == FPE_EVENT_TVER &&
> + fpe_cfg->status != ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED)
> + fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_VERIFYING;
>
> - /* If LP has sent response mPacket, LP is entering FPE ON */
> + /* LP has sent response mPacket */
> if ((status & FPE_EVENT_RRSP) == FPE_EVENT_RRSP)
> - *lp_state = FPE_STATE_ENTERING_ON;
> -
> - /* If Local has sent response mPacket, Local is entering FPE ON */
> - if ((status & FPE_EVENT_TRSP) == FPE_EVENT_TRSP)
> - *lo_state = FPE_STATE_ENTERING_ON;
> + fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED;
>
> if (!test_bit(__FPE_REMOVING, &priv->fpe_task_state) &&
> !test_and_set_bit(__FPE_TASK_SCHED, &priv->fpe_task_state) &&
> priv->fpe_wq) {
> queue_work(priv->fpe_wq, &priv->fpe_task);
> }
> +
> +__unlock_out:
> + spin_unlock(&priv->fpe_cfg.lock);
> }
>
> static void stmmac_common_interrupt(struct stmmac_priv *priv)
> @@ -7372,50 +7380,57 @@ int stmmac_reinit_ringparam(struct net_device *dev, u32 rx_size, u32 tx_size)
> return ret;
> }
>
> -#define SEND_VERIFY_MPAKCET_FMT "Send Verify mPacket lo_state=%d lp_state=%d\n"
> -static void stmmac_fpe_lp_task(struct work_struct *work)
> +static void stmmac_fpe_verify_task(struct work_struct *work)
> {
> struct stmmac_priv *priv = container_of(work, struct stmmac_priv,
> fpe_task);
> struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
> - enum stmmac_fpe_state *lo_state = &fpe_cfg->lo_fpe_state;
> - enum stmmac_fpe_state *lp_state = &fpe_cfg->lp_fpe_state;
> - bool *hs_enable = &fpe_cfg->hs_enable;
> - bool *enable = &fpe_cfg->enable;
> - int retries = 20;
> -
> - while (retries-- > 0) {
> - /* Bail out immediately if FPE handshake is OFF */
> - if (*lo_state == FPE_STATE_OFF || !*hs_enable)
> + int verify_limit = 3; /* defined by 802.3 */
> + unsigned long flags;
> + u32 sleep_ms;
> +
> + spin_lock(&priv->fpe_cfg.lock);
> + sleep_ms = fpe_cfg->verify_time;
> + spin_unlock(&priv->fpe_cfg.lock);
> +
> + while (1) {
> + /* The initial VERIFY was triggered by linkup event or
> + * stmmac_set_mm(), sleep then check MM_VERIFY_STATUS.
> + */
> + msleep(sleep_ms);
Thanks for the added comment. But why don't you just use queue_delayed_work()
instead of queue_work() followed by sleeping at the very beginning?
With that, you wouldn't need to drop the lock and read fpe_cfg->verify_time
twice.
Still, I think what is needed here is better suited to a timer, especially
because of the required coordination with the IRQ. See the end of this
email and the attachment for more details.
> +
> + if (!netif_running(priv->dev))
> break;
>
> - if (*lo_state == FPE_STATE_ENTERING_ON &&
> - *lp_state == FPE_STATE_ENTERING_ON) {
> - stmmac_fpe_configure(priv, priv->ioaddr,
> - fpe_cfg,
> - priv->plat->tx_queues_to_use,
> - priv->plat->rx_queues_to_use,
> - *enable);
> + spin_lock_irqsave(&priv->fpe_cfg.lock, flags);
>
> - netdev_info(priv->dev, "configured FPE\n");
> + if (fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_DISABLED ||
> + fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED ||
> + !fpe_cfg->pmac_enabled || !fpe_cfg->verify_enabled) {
> + spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
> + break;
> + }
>
> - *lo_state = FPE_STATE_ON;
> - *lp_state = FPE_STATE_ON;
> - netdev_info(priv->dev, "!!! BOTH FPE stations ON\n");
> + if (verify_limit == 0) {
> + fpe_cfg->verify_enabled = false;
I don't understand why you turn off verify_enabled after a failure. Only
the user should be able to modify this.
> + fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_FAILED;
> + stmmac_fpe_configure(priv, priv->ioaddr, fpe_cfg,
> + priv->plat->tx_queues_to_use,
> + priv->plat->rx_queues_to_use,
> + false);
I don't understand why you turn off tx_enabled after a failure, rather
than not turning it on at all until verification succeeds.
This really has me thinking. This hardware does not have the explicit
notion of the verification state - it is purely a driver construct.
So I wonder if the EFPE bit in MAC_FPE_CTRL_STS isn't actually what
corresponds to "tx_active" rather than "tx_enabled"?
(definitions at https://docs.kernel.org/networking/ethtool-netlink.html)
And "tx_enabled" would just correspond to a state variable in the driver,
which does nothing until verification is actually complete.
There is a test in manual_failed_verification() which checks the
correctness of the tx_enabled/tx_active behavior. If tx_enabled=true but
verification fails (and also _up until_ that point), the MM layer is
supposed to send packets through the eMAC (because tx_active=false).
But for your driver, that test is inconclusive, because you don't report
ethtool stats broken down by eMAC/pMAC, just aggregate. So we don't know
unless we take a closer look manually at the driver in that state.
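As a small illustration of the tx_enabled/tx_active distinction, here is a user-space sketch. It assumes my reading of the ethtool MAC Merge documentation, namely that tx_active should be true only when tx_enabled is set and verification has either succeeded or is disabled. The enum and both function names are hypothetical and do not exist in the driver.

```c
#include <stdbool.h>

/* Hypothetical mirror of the ethtool verification states, for illustration. */
enum mm_verify_status {
	MM_VERIFY_INITIAL,
	MM_VERIFY_VERIFYING,
	MM_VERIFY_SUCCEEDED,
	MM_VERIFY_FAILED,
	MM_VERIFY_DISABLED,
};

/* tx_enabled is what the user asked for; tx_active is what actually happens. */
static bool mm_tx_active(bool tx_enabled, enum mm_verify_status status)
{
	return tx_enabled &&
	       (status == MM_VERIFY_SUCCEEDED || status == MM_VERIFY_DISABLED);
}

/* A preemptible frame uses the pMAC only while TX is active; else the eMAC. */
static bool mm_frame_uses_pmac(bool preemptible, bool tx_enabled,
			       enum mm_verify_status status)
{
	return preemptible && mm_tx_active(tx_enabled, status);
}
```

Under this model, a driver-side tx_enabled flag alone changes nothing on the wire; only the derived tx_active value would be a candidate for driving a hardware enable bit.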
> + spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
> break;
> }
>
> - if ((*lo_state == FPE_STATE_CAPABLE ||
> - *lo_state == FPE_STATE_ENTERING_ON) &&
> - *lp_state != FPE_STATE_ON) {
> - netdev_info(priv->dev, SEND_VERIFY_MPAKCET_FMT,
> - *lo_state, *lp_state);
> - stmmac_fpe_send_mpacket(priv, priv->ioaddr,
> - fpe_cfg,
> + if (fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_VERIFYING)
> + stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
> MPACKET_VERIFY);
> - }
> - /* Sleep then retry */
> - msleep(500);
> +
> + sleep_ms = fpe_cfg->verify_time;
> +
> + spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
> +
> + verify_limit--;
> }
I took the liberty of rewriting the fpe_task as a timer and deleting the
workqueue. Here is a completely untested patch, which is at least less
complex, has less code and is easier to understand. What do you think?
View attachment "0001-net-stmmac-replace-FPE-workqueue-with-timer.patch" of type "text/x-diff" (14772 bytes)
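For readers without the attachment at hand, the general shape of a timer-driven verification can be sketched in user space as below. This is not the content of the attached patch. It is a hypothetical model (verify_model and its helpers are invented names) that assumes one timer expiry per verify_time interval and the limit of 3 attempts used in the quoted code. It also leaves verify_enabled untouched on failure, in line with the comment above that only the user should modify it.

```c
#include <stdbool.h>

/* Hypothetical user-space model of timer-driven verification. */
enum verify_status {
	VERIFY_INITIAL,
	VERIFY_VERIFYING,
	VERIFY_SUCCEEDED,
	VERIFY_FAILED,
	VERIFY_DISABLED,
};

#define VERIFY_LIMIT 3	/* attempts, as in the quoted patch */

struct verify_model {
	bool pmac_enabled;
	bool verify_enabled;	/* only ever written by the "user" */
	enum verify_status status;
	int retries_left;
	int verify_sent;	/* counts modeled verify mPackets */
};

/* Called on link up or on a configuration change. */
static void verify_model_start(struct verify_model *m)
{
	m->retries_left = VERIFY_LIMIT;
	m->verify_sent = 0;
	m->status = m->verify_enabled ? VERIFY_INITIAL : VERIFY_DISABLED;
}

/* One timer expiry. Returns true if the timer should be re-armed. */
static bool verify_model_timer_tick(struct verify_model *m)
{
	if (!m->pmac_enabled || !m->verify_enabled)
		return false;

	switch (m->status) {
	case VERIFY_INITIAL:
	case VERIFY_VERIFYING:
		if (m->retries_left == 0) {
			/* Out of attempts; verify_enabled is left alone. */
			m->status = VERIFY_FAILED;
			return false;
		}
		m->retries_left--;
		m->verify_sent++;
		m->status = VERIFY_VERIFYING;
		return true;
	default:
		return false;
	}
}

/* Models the IRQ path seeing a response mPacket from the link partner. */
static void verify_model_response_irq(struct verify_model *m)
{
	if (m->status == VERIFY_VERIFYING)
		m->status = VERIFY_SUCCEEDED;
}
```

Each expiry does a bounded amount of work and then either re-arms or stops, so there is no sleeping loop, and the modeled IRQ path only has to update the status that the next expiry inspects.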