lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20220323080820.137579-1-tomas.melin@vaisala.com>
Date:   Wed, 23 Mar 2022 10:08:20 +0200
From:   Tomas Melin <tomas.melin@...sala.com>
To:     claudiu.beznea@...rochip.com
Cc:     Nicolas.Ferre@...rochip.com, davem@...emloft.net,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v3] net: macb: restart tx after tx used bit read

Hi,

> From: <Claudiu.Beznea@...rochip.com>
> To: <Nicolas.Ferre@...rochip.com>, <davem@...emloft.net>
> Cc: <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
> 	<Claudiu.Beznea@...rochip.com>
> Subject: [PATCH v3] net: macb: restart tx after tx used bit read
> Date: Mon, 17 Dec 2018 10:02:42 +0000	[thread overview]
> Message-ID: <1545040937-6583-1-git-send-email-claudiu.beznea@...rochip.com> (raw)
> 
> From: Claudiu Beznea <claudiu.beznea@...rochip.com>
> 
> On some platforms (currently detected only on SAMA5D4) TX might stuck
> even the pachets are still present in DMA memories and TX start was
> issued for them. This happens due to race condition between MACB driver
> updating next TX buffer descriptor to be used and IP reading the same
> descriptor. In such a case, the "TX USED BIT READ" interrupt is asserted.
> GEM/MACB user guide specifies that if a "TX USED BIT READ" interrupt
> is asserted TX must be restarted. Restart TX if used bit is read and
> packets are present in software TX queue. Packets are removed from software
> TX queue if TX was successful for them (see macb_tx_interrupt()).
> 
> Signed-off-by: Claudiu Beznea <claudiu.beznea@...rochip.com>

On Xilinx Zynq the above change can cause infinite interrupt loop leading 
to CPU stall. Seems timing/load needs to be appropriate for this to happen, and currently
with 1G ethernet this can be triggered normally within minutes when running stress tests
on the network interface.

The events leading up to the interrupt looping are similar as the issue described in the
commit message. However in our case, restarting TX does not help at all. Instead
the controller is stuck on the queue end descriptor generating endless TX_USED           
interrupts, never breaking out of interrupt routine.

Any chance you remember more details about in which situation restarting TX helped for
your use case? was tx_qbar at the end of frame or stopped in middle of frame?

thanks,
Tomas Melin


> ---
> 
> Changes in v3:
> - remove "inline" keyword
> 
> Changes in v2:
> - use "static inline" instead of "inline static" for macb_tx_restart()
> 
>  drivers/net/ethernet/cadence/macb_main.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index 1d86b4d5645a..f920230386ee 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -61,7 +61,8 @@
>  #define MACB_TX_ERR_FLAGS	(MACB_BIT(ISR_TUND)			\
>  					| MACB_BIT(ISR_RLE)		\
>  					| MACB_BIT(TXERR))
> -#define MACB_TX_INT_FLAGS	(MACB_TX_ERR_FLAGS | MACB_BIT(TCOMP))
> +#define MACB_TX_INT_FLAGS	(MACB_TX_ERR_FLAGS | MACB_BIT(TCOMP)	\
> +					| MACB_BIT(TXUBR))
>  
>  /* Max length of transmit frame must be a multiple of 8 bytes */
>  #define MACB_TX_LEN_ALIGN	8
> @@ -1312,6 +1313,21 @@ static void macb_hresp_error_task(unsigned long data)
>  	netif_tx_start_all_queues(dev);
>  }
>  
> +static void macb_tx_restart(struct macb_queue *queue)
> +{
> +	unsigned int head = queue->tx_head;
> +	unsigned int tail = queue->tx_tail;
> +	struct macb *bp = queue->bp;
> +
> +	if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> +		queue_writel(queue, ISR, MACB_BIT(TXUBR));
> +
> +	if (head == tail)
> +		return;
> +
> +	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
> +}
> +
>  static irqreturn_t macb_interrupt(int irq, void *dev_id)
>  {
>  	struct macb_queue *queue = dev_id;
> @@ -1369,6 +1385,9 @@ static irqreturn_t macb_interrupt(int irq, void *dev_id)
>  		if (status & MACB_BIT(TCOMP))
>  			macb_tx_interrupt(queue);
>  
> +		if (status & MACB_BIT(TXUBR))
> +			macb_tx_restart(queue);
> +
>  		/* Link change detection isn't possible with RMII, so we'll
>  		 * add that if/when we get our hands on a full-blown MII PHY.
>  		 */
> -- 
> 2.7.4
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ