Message-ID: <20240820123456.qbt4emjdjg5pouym@skbuf>
Date: Tue, 20 Aug 2024 15:34:56 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Furong Xu <0x1207@...il.com>
Cc: Serge Semin <fancer.lancer@...il.com>, Andrew Lunn <andrew@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Alexandre Torgue <alexandre.torgue@...s.st.com>,
	Jose Abreu <joabreu@...opsys.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Maxime Coquelin <mcoquelin.stm32@...il.com>,
	Joao Pinto <jpinto@...opsys.com>, netdev@...r.kernel.org,
	linux-stm32@...md-mailman.stormreply.com,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	xfr@...look.com
Subject: Re: [PATCH net-next v4 3/7] net: stmmac: refactor FPE verification
 process

On Tue, Aug 20, 2024 at 05:38:31PM +0800, Furong Xu wrote:
> Drop driver defined stmmac_fpe_state, and switch to common
> ethtool_mm_verify_status for local TX verification status.
> 
> Local side and remote side verification processes are completely
> independent. There is no reason at all to keep a local state and
> a remote state.
> 
> Add a spinlock to avoid races among ISR, workqueue, link update
> and register configuration.
> 
> Signed-off-by: Furong Xu <0x1207@...il.com>
> ---
>  drivers/net/ethernet/stmicro/stmmac/stmmac.h  |  21 +--
>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 172 ++++++++++--------
>  .../net/ethernet/stmicro/stmmac/stmmac_tc.c   |   6 -
>  3 files changed, 102 insertions(+), 97 deletions(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> index 458d6b16ce21..407b59f2783f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
> @@ -146,14 +146,6 @@ struct stmmac_channel {
>  	u32 index;
>  };
>  
> -/* FPE link state */
> -enum stmmac_fpe_state {
> -	FPE_STATE_OFF = 0,
> -	FPE_STATE_CAPABLE = 1,
> -	FPE_STATE_ENTERING_ON = 2,
> -	FPE_STATE_ON = 3,
> -};
> -
>  /* FPE link-partner hand-shaking mPacket type */
>  enum stmmac_mpacket_type {
>  	MPACKET_VERIFY = 0,
> @@ -166,11 +158,16 @@ enum stmmac_fpe_task_state_t {
>  };
>  
>  struct stmmac_fpe_cfg {
> -	bool enable;				/* FPE enable */
> -	bool hs_enable;				/* FPE handshake enable */
> -	enum stmmac_fpe_state lp_fpe_state;	/* Link Partner FPE state */
> -	enum stmmac_fpe_state lo_fpe_state;	/* Local station FPE state */
> +	/* Serialize access to MAC Merge state between ethtool requests
> +	 * and link state updates.
> +	 */
> +	spinlock_t lock;
> +
>  	u32 fpe_csr;				/* MAC_FPE_CTRL_STS reg cache */
> +	u32 verify_time;			/* see ethtool_mm_state */
> +	bool pmac_enabled;			/* see ethtool_mm_state */
> +	bool verify_enabled;			/* see ethtool_mm_state */
> +	enum ethtool_mm_verify_status status;
>  };
>  
>  struct stmmac_tc_entry {
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 3072ad33b105..6ae95f20b24f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -969,17 +969,21 @@ static void stmmac_mac_config(struct phylink_config *config, unsigned int mode,
>  static void stmmac_fpe_link_state_handle(struct stmmac_priv *priv, bool is_up)
>  {
>  	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
> -	enum stmmac_fpe_state *lo_state = &fpe_cfg->lo_fpe_state;
> -	enum stmmac_fpe_state *lp_state = &fpe_cfg->lp_fpe_state;
> -	bool *hs_enable = &fpe_cfg->hs_enable;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&priv->fpe_cfg.lock, flags);
> +
> +	if (!fpe_cfg->pmac_enabled)
> +		goto __unlock_out;
>  
> -	if (is_up && *hs_enable) {
> +	if (is_up && fpe_cfg->verify_enabled)
>  		stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
>  					MPACKET_VERIFY);
> -	} else {
> -		*lo_state = FPE_STATE_OFF;
> -		*lp_state = FPE_STATE_OFF;
> -	}
> +	else
> +		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;

The fpe_task may be scheduled here. When you unlock, it may run and
overwrite the fpe_cfg->status you've just set.

That said, I don't actually recommend setting ETHTOOL_MM_VERIFY_STATUS_DISABLED
unless cfg->verify_enabled=false.
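
To illustrate the second point, here is a minimal user-space sketch, not
driver code. The enum is a local stand-in for ethtool_mm_verify_status,
and struct fpe_model and fpe_status_on_link_change() are hypothetical
names. Rewinding to INITIAL on a link change while verification is still
enabled is an assumption about what the handler should do.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's enum ethtool_mm_verify_status. */
enum mm_verify_status {
	MM_VERIFY_INITIAL,
	MM_VERIFY_VERIFYING,
	MM_VERIFY_SUCCEEDED,
	MM_VERIFY_FAILED,
	MM_VERIFY_DISABLED,
};

/* Hypothetical, reduced copy of the fields in struct stmmac_fpe_cfg. */
struct fpe_model {
	bool pmac_enabled;
	bool verify_enabled;
	enum mm_verify_status status;
};

/* Status to store after a link change. DISABLED is reserved for the
 * case where user space turned verification off. With verification
 * still enabled, a link change only rewinds the state machine
 * (assumption: to INITIAL), whatever the previous status was.
 */
enum mm_verify_status fpe_status_on_link_change(const struct fpe_model *cfg)
{
	if (!cfg->verify_enabled)
		return MM_VERIFY_DISABLED;

	return MM_VERIFY_INITIAL;
}
```

With this rule a link drop never makes the reported status contradict the
verify_enabled setting that user space configured.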

> +
> +__unlock_out:
> +	spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
>  }
>  
>  static void stmmac_mac_link_down(struct phylink_config *config,
> @@ -4091,11 +4095,25 @@ static int stmmac_release(struct net_device *dev)
>  
>  	stmmac_release_ptp(priv);
>  
> -	pm_runtime_put(priv->device);
> -
> -	if (priv->dma_cap.fpesel)
> +	if (priv->dma_cap.fpesel) {
>  		stmmac_fpe_stop_wq(priv);
>  
> +		/* stmmac_ethtool_ops.begin() guarantees that all ethtool
> +		 * requests to fail with EBUSY when !netif_running()
> +		 *
> +		 * Prepare some params here, then fpe_cfg can keep consistent
> +		 * with the register states after a SW reset by __stmmac_open().
> +		 */
> +		priv->fpe_cfg.pmac_enabled = false;
> +		priv->fpe_cfg.verify_enabled = false;
> +		priv->fpe_cfg.status = ETHTOOL_MM_VERIFY_STATUS_DISABLED;
> +
> +		/* Reset MAC_FPE_CTRL_STS reg cache */
> +		priv->fpe_cfg.fpe_csr = 0;
> +	}

With this block of code, you're saying that you're deliberately okay with
the ethtool-mm state being lost after a stmmac_release() call. Mind you,
some of the call sites of stmmac_release() are:
- stmmac_change_mtu()
- stmmac_reinit_queues()
- stmmac_reinit_ringparam()

I disagree that it's okay to lose the state configured by user space.
Instead, you should reprogram the hardware from the saved state once it
is lost.

Note that because stmmac_release() calls phylink_stop(), I think that
restoring the state in stmmac_fpe_link_state_handle() is enough, because
there will always be a link drop.
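
A minimal user-space sketch of that idea, not driver code: the saved
configuration is never cleared on release, and the link state handler
reprograms the hardware from it. struct mac_model, MODEL_FPE_EN and the
model_*() helpers are hypothetical names, and the single register is only
a stand-in for the real hardware state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enable bit in a modeled control register. */
#define MODEL_FPE_EN	0x1u

/* Configuration from user space; must survive release/open cycles. */
struct fpe_saved_cfg {
	bool pmac_enabled;
	bool verify_enabled;
	uint32_t verify_time;
};

struct mac_model {
	uint32_t fpe_ctrl;		/* stand-in for a hardware register */
	struct fpe_saved_cfg saved;
};

/* Models the SW reset: hardware state is lost, saved config is kept. */
void model_sw_reset(struct mac_model *mac)
{
	mac->fpe_ctrl = 0;
}

/* Models the link state handler: reprogram from the saved config. */
void model_link_state_handle(struct mac_model *mac, bool is_up)
{
	if (is_up && mac->saved.pmac_enabled)
		mac->fpe_ctrl |= MODEL_FPE_EN;
	else
		mac->fpe_ctrl &= ~MODEL_FPE_EN;
}
```

Since every release/open cycle ends with a link-up event, the register
is rewritten from the saved configuration without any extra restore
path.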

> +
> +	pm_runtime_put(priv->device);
> +
>  	return 0;
>  }
>  
> @@ -5979,44 +5997,34 @@ static int stmmac_set_features(struct net_device *netdev,
>  static void stmmac_fpe_event_status(struct stmmac_priv *priv, int status)
>  {
>  	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
> -	enum stmmac_fpe_state *lo_state = &fpe_cfg->lo_fpe_state;
> -	enum stmmac_fpe_state *lp_state = &fpe_cfg->lp_fpe_state;
> -	bool *hs_enable = &fpe_cfg->hs_enable;
>  
> -	if (status == FPE_EVENT_UNKNOWN || !*hs_enable)
> -		return;
> +	spin_lock(&priv->fpe_cfg.lock);
>  
> -	/* If LP has sent verify mPacket, LP is FPE capable */
> -	if ((status & FPE_EVENT_RVER) == FPE_EVENT_RVER) {
> -		if (*lp_state < FPE_STATE_CAPABLE)
> -			*lp_state = FPE_STATE_CAPABLE;
> +	if (!fpe_cfg->pmac_enabled || status == FPE_EVENT_UNKNOWN)
> +		goto __unlock_out;
>  
> -		/* If user has requested FPE enable, quickly response */
> -		if (*hs_enable)
> -			stmmac_fpe_send_mpacket(priv, priv->ioaddr,
> -						fpe_cfg,
> -						MPACKET_RESPONSE);
> -	}
> +	/* LP has sent verify mPacket */
> +	if ((status & FPE_EVENT_RVER) == FPE_EVENT_RVER)
> +		stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
> +					MPACKET_RESPONSE);
>  
> -	/* If Local has sent verify mPacket, Local is FPE capable */
> -	if ((status & FPE_EVENT_TVER) == FPE_EVENT_TVER) {
> -		if (*lo_state < FPE_STATE_CAPABLE)
> -			*lo_state = FPE_STATE_CAPABLE;
> -	}
> +	/* Local has sent verify mPacket */
> +	if ((status & FPE_EVENT_TVER) == FPE_EVENT_TVER &&
> +	    fpe_cfg->status != ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED)
> +		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_VERIFYING;
>  
> -	/* If LP has sent response mPacket, LP is entering FPE ON */
> +	/* LP has sent response mPacket */
>  	if ((status & FPE_EVENT_RRSP) == FPE_EVENT_RRSP)
> -		*lp_state = FPE_STATE_ENTERING_ON;
> -
> -	/* If Local has sent response mPacket, Local is entering FPE ON */
> -	if ((status & FPE_EVENT_TRSP) == FPE_EVENT_TRSP)
> -		*lo_state = FPE_STATE_ENTERING_ON;
> +		fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED;
>  
>  	if (!test_bit(__FPE_REMOVING, &priv->fpe_task_state) &&
>  	    !test_and_set_bit(__FPE_TASK_SCHED, &priv->fpe_task_state) &&
>  	    priv->fpe_wq) {
>  		queue_work(priv->fpe_wq, &priv->fpe_task);
>  	}
> +
> +__unlock_out:
> +	spin_unlock(&priv->fpe_cfg.lock);
>  }
>  
>  static void stmmac_common_interrupt(struct stmmac_priv *priv)
> @@ -7372,50 +7380,57 @@ int stmmac_reinit_ringparam(struct net_device *dev, u32 rx_size, u32 tx_size)
>  	return ret;
>  }
>  
> -#define SEND_VERIFY_MPAKCET_FMT "Send Verify mPacket lo_state=%d lp_state=%d\n"
> -static void stmmac_fpe_lp_task(struct work_struct *work)
> +static void stmmac_fpe_verify_task(struct work_struct *work)
>  {
>  	struct stmmac_priv *priv = container_of(work, struct stmmac_priv,
>  						fpe_task);
>  	struct stmmac_fpe_cfg *fpe_cfg = &priv->fpe_cfg;
> -	enum stmmac_fpe_state *lo_state = &fpe_cfg->lo_fpe_state;
> -	enum stmmac_fpe_state *lp_state = &fpe_cfg->lp_fpe_state;
> -	bool *hs_enable = &fpe_cfg->hs_enable;
> -	bool *enable = &fpe_cfg->enable;
> -	int retries = 20;
> -
> -	while (retries-- > 0) {
> -		/* Bail out immediately if FPE handshake is OFF */
> -		if (*lo_state == FPE_STATE_OFF || !*hs_enable)
> +	int verify_limit = 3; /* defined by 802.3 */
> +	unsigned long flags;
> +	u32 sleep_ms;
> +
> +	spin_lock(&priv->fpe_cfg.lock);
> +	sleep_ms = fpe_cfg->verify_time;
> +	spin_unlock(&priv->fpe_cfg.lock);
> +
> +	while (1) {
> +		/* The initial VERIFY was triggered by linkup event or
> +		 * stmmac_set_mm(), sleep then check MM_VERIFY_STATUS.
> +		 */
> +		msleep(sleep_ms);

Thanks for the added comment. But why don't you just use queue_delayed_work()
instead of queue_work() and sleeping at the very beginning?

With this, you really don't need to drop the lock and read fpe_cfg->verify_time
twice.

But I think what is needed here is better suited for a timer, especially
because of the required coordination with the IRQ. See the end and the
attachment for more details.
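
To make the shape of a timer-based approach concrete, here is a small
user-space model. It is not the attached patch; struct verify_model and
verify_timer_step() are hypothetical names, and the enum is a local
stand-in for ethtool_mm_verify_status. Each call models one timer expiry
and does a single step, so there is no sleeping and no loop.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's enum ethtool_mm_verify_status. */
enum mm_verify_status {
	MM_VERIFY_INITIAL,
	MM_VERIFY_VERIFYING,
	MM_VERIFY_SUCCEEDED,
	MM_VERIFY_FAILED,
	MM_VERIFY_DISABLED,
};

struct verify_model {
	bool verify_enabled;		/* only user space changes this */
	enum mm_verify_status status;
	int verify_retries;		/* VERIFY transmissions left */
	int verify_sent;		/* counts modeled VERIFY mPackets */
};

/* One timer expiry. Returns true if the timer should be re-armed to
 * fire again after verify_time. In a driver this would run under the
 * same lock as the IRQ handler, which is what moves the status to
 * SUCCEEDED when a RESPONSE mPacket arrives.
 */
bool verify_timer_step(struct verify_model *m)
{
	switch (m->status) {
	case MM_VERIFY_INITIAL:
	case MM_VERIFY_VERIFYING:
		if (m->verify_retries == 0) {
			m->status = MM_VERIFY_FAILED;
			return false;
		}
		m->verify_retries--;
		m->verify_sent++;	/* a driver would send VERIFY here */
		m->status = MM_VERIFY_VERIFYING;
		return true;
	default:
		/* SUCCEEDED, FAILED or DISABLED: nothing left to do. */
		return false;
	}
}
```

Starting with verify_retries = 3, the model sends three VERIFY mPackets
and reports FAILED on the fourth expiry, without ever touching
verify_enabled.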

> +
> +		if (!netif_running(priv->dev))
>  			break;
>  
> -		if (*lo_state == FPE_STATE_ENTERING_ON &&
> -		    *lp_state == FPE_STATE_ENTERING_ON) {
> -			stmmac_fpe_configure(priv, priv->ioaddr,
> -					     fpe_cfg,
> -					     priv->plat->tx_queues_to_use,
> -					     priv->plat->rx_queues_to_use,
> -					     *enable);
> +		spin_lock_irqsave(&priv->fpe_cfg.lock, flags);
>  
> -			netdev_info(priv->dev, "configured FPE\n");
> +		if (fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_DISABLED ||
> +		    fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED ||
> +		    !fpe_cfg->pmac_enabled || !fpe_cfg->verify_enabled) {
> +			spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
> +			break;
> +		}
>  
> -			*lo_state = FPE_STATE_ON;
> -			*lp_state = FPE_STATE_ON;
> -			netdev_info(priv->dev, "!!! BOTH FPE stations ON\n");
> +		if (verify_limit == 0) {
> +			fpe_cfg->verify_enabled = false;

I don't understand why you turn off verify_enabled after failure. Only the
user should be able to modify this.

> +			fpe_cfg->status = ETHTOOL_MM_VERIFY_STATUS_FAILED;
> +			stmmac_fpe_configure(priv, priv->ioaddr, fpe_cfg,
> +					     priv->plat->tx_queues_to_use,
> +					     priv->plat->rx_queues_to_use,
> +					     false);

I don't understand why you turn off tx_enabled after failure, rather than
not turning it on at all until success.

This really has me thinking. This hardware does not have an explicit
notion of the verification state - it is purely a driver construct.
So I wonder if the EFPE bit in MAC_FPE_CTRL_STS isn't actually what
corresponds to "tx_active" rather than "tx_enabled"?
(definitions at https://docs.kernel.org/networking/ethtool-netlink.html)

And "tx_enabled" would just correspond to a state variable in the driver,
which does nothing until verification is actually complete.

There is a test in manual_failed_verification() which checks the
correctness of the tx_enabled/tx_active behavior. If tx_enabled=true but
verification fails (and also _up until_ that point), the MM layer is
supposed to send packets through the eMAC (because tx_active=false).
But for your driver, that test is inconclusive, because you don't report
ethtool stats broken down by eMAC/pMAC, just aggregate. So we don't know
unless we take a closer look manually at the driver in that state.
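
To spell out the tx_enabled/tx_active relationship, here is a minimal
user-space sketch. It is not driver code, and model_tx_active() is a
hypothetical helper. The rule it encodes is my reading of the ethtool
documentation: tx_active is expected to be true only when tx_enabled is
set and verification has either succeeded or is disabled. The enum is a
local stand-in for ethtool_mm_verify_status.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's enum ethtool_mm_verify_status. */
enum mm_verify_status {
	MM_VERIFY_INITIAL,
	MM_VERIFY_VERIFYING,
	MM_VERIFY_SUCCEEDED,
	MM_VERIFY_FAILED,
	MM_VERIFY_DISABLED,
};

/* tx_enabled is pure software state set by user space. tx_active is
 * derived from it and from the verification status, and would be the
 * value that decides whether the hardware may preempt on TX.
 */
bool model_tx_active(bool tx_enabled, enum mm_verify_status status)
{
	return tx_enabled &&
	       (status == MM_VERIFY_SUCCEEDED ||
		status == MM_VERIFY_DISABLED);
}
```

With tx_enabled=true and the status at VERIFYING or FAILED, this returns
false, which is the window in which traffic is supposed to go through
the eMAC.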

> +			spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
>  			break;
>  		}
>  
> -		if ((*lo_state == FPE_STATE_CAPABLE ||
> -		     *lo_state == FPE_STATE_ENTERING_ON) &&
> -		     *lp_state != FPE_STATE_ON) {
> -			netdev_info(priv->dev, SEND_VERIFY_MPAKCET_FMT,
> -				    *lo_state, *lp_state);
> -			stmmac_fpe_send_mpacket(priv, priv->ioaddr,
> -						fpe_cfg,
> +		if (fpe_cfg->status == ETHTOOL_MM_VERIFY_STATUS_VERIFYING)
> +			stmmac_fpe_send_mpacket(priv, priv->ioaddr, fpe_cfg,
>  						MPACKET_VERIFY);
> -		}
> -		/* Sleep then retry */
> -		msleep(500);
> +
> +		sleep_ms = fpe_cfg->verify_time;
> +
> +		spin_unlock_irqrestore(&priv->fpe_cfg.lock, flags);
> +
> +		verify_limit--;
>  	}

I took the liberty of rewriting the fpe_task as a timer and deleting the
workqueue. Here is a completely untested patch, which at least is less
complex, has less code and is easier to understand. What do you think?

View attachment "0001-net-stmmac-replace-FPE-workqueue-with-timer.patch" of type "text/x-diff" (14772 bytes)
