netdev - Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer


Message-ID: <20160528064348.GA19649@debian-dorm>
Date:	Sat, 28 May 2016 14:43:48 +0800
From:	Shuyu Wei <wsy2220@...il.com>
To:	Lino Sanfilippo <LinoSanfilippo@....de>
Cc:	Francois Romieu <romieu@...zoreil.com>,
	David Miller <davem@...emloft.net>, wxt@...k-chips.com,
	heiko@...ech.de, linux-rockchip@...ts.infradead.org,
	netdev@...r.kernel.org, al.kochet@...il.com
Subject: Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

On Wed, May 25, 2016 at 01:56:20AM +0200, Lino Sanfilippo wrote:
> Francois, Shuyu,
> 
> this is the patch with the discussed changes.
> 
> Shuyu it would be great if you could test this one. If it passes
> and there are no further objections I will resend it as a regular patch
> (including commit message, etc.) to the mailing list.
> 
> 
> diff --git a/drivers/net/ethernet/arc/emac_main.c b/drivers/net/ethernet/arc/emac_main.c
> index a3a9392..ec656b3 100644
> --- a/drivers/net/ethernet/arc/emac_main.c
> +++ b/drivers/net/ethernet/arc/emac_main.c
> @@ -153,18 +153,29 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>  {
>  	struct arc_emac_priv *priv = netdev_priv(ndev);
>  	struct net_device_stats *stats = &ndev->stats;
> +	unsigned int curr = priv->txbd_curr;
>  	unsigned int i;
>  
> +	/* Make sure buffers and txbd_curr are consistent */
> +	smp_rmb();
> +
>  	for (i = 0; i < TX_BD_NUM; i++) {
>  		unsigned int *txbd_dirty = &priv->txbd_dirty;
>  		struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
>  		struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
> -		struct sk_buff *skb = tx_buff->skb;
> -		unsigned int info = le32_to_cpu(txbd->info);
> +		unsigned int info;
> +		struct sk_buff *skb;
>  
> -		if ((info & FOR_EMAC) || !txbd->data || !skb)
> +		if (*txbd_dirty == curr)
>  			break;
>  
> +		info = le32_to_cpu(txbd->info);
> +
> +		if (info & FOR_EMAC)
> +			break;
> +
> +		skb = tx_buff->skb;
> +
>  		if (unlikely(info & (DROP | DEFR | LTCL | UFLO))) {
>  			stats->tx_errors++;
>  			stats->tx_dropped++;
> @@ -195,8 +206,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>  		*txbd_dirty = (*txbd_dirty + 1) % TX_BD_NUM;
>  	}
>  
> -	/* Ensure that txbd_dirty is visible to tx() before checking
> -	 * for queue stopped.
> +	/* Ensure that txbd_dirty is visible to tx() and we see the most recent
> +	 * value for txbd_curr.
>  	 */
>  	smp_mb();
>  
> @@ -680,27 +691,24 @@ static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev)
>  	dma_unmap_len_set(&priv->tx_buff[*txbd_curr], len, len);
>  
>  	priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
> -
> -	/* Make sure pointer to data buffer is set */
> -	wmb();
> +	priv->tx_buff[*txbd_curr].skb = skb;
>  
>  	skb_tx_timestamp(skb);
>  
>  	*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
>  
> -	/* Make sure info word is set */
> +	/* 1. Make sure that with respect to tx_clean everything is set up
> +	 * properly before we advance txbd_curr.
> +	 * 2. Make sure writes to DMA descriptors are completed before we inform
> +	 * the hardware.
> +	 */
>  	wmb();
>  
> -	priv->tx_buff[*txbd_curr].skb = skb;
> -
>  	/* Increment index to point to the next BD */
>  	*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
>  
> -	/* Ensure that tx_clean() sees the new txbd_curr before
> -	 * checking the queue status. This prevents an unneeded wake
> -	 * of the queue in tx_clean().
> -	 */
> -	smp_mb();
> +	/* Ensure tx_clean() sees the updated value of txbd_curr */
> +	smp_wmb();
>  
>  	if (!arc_emac_tx_avail(priv)) {
>  		netif_stop_queue(ndev);

After some stress testing, it worked well most of the time.
But there is a chance that it may get stuck when I use 2 nc processes
to send TCP packets at full speed. Once stuck, it only starts running
again when a new rx packet arrives. This happens only about once every
several hours. There is no problem in UDP mode. I'm not sure whether
it's related to the tx code in the driver.
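
A minimal user-space model of the stop/wake handshake between tx() and
tx_clean() may help when reasoning about a stall like this. It is only a
sketch under assumptions, not the driver code: the names ring, ring_init,
ring_avail, ring_tx and ring_clean are hypothetical, the ring size of 8 is
arbitrary, C11 seq_cst fences stand in for smp_mb(), and an atomic flag
stands in for the netif queue state. It does not show that this handshake
is the cause of the hang.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TX_BD_NUM 8 /* hypothetical ring size for this model */

struct ring {
	atomic_uint curr;    /* next slot the producer fills (like txbd_curr) */
	atomic_uint dirty;   /* next slot the cleaner reclaims (like txbd_dirty) */
	atomic_bool stopped; /* stands in for the stopped state of the tx queue */
};

static void ring_init(struct ring *r)
{
	atomic_init(&r->curr, 0);
	atomic_init(&r->dirty, 0);
	atomic_init(&r->stopped, false);
}

/* Free slots; one slot is always kept empty to tell "full" from "empty". */
static unsigned int ring_avail(struct ring *r)
{
	unsigned int c = atomic_load(&r->curr);
	unsigned int d = atomic_load(&r->dirty);

	return (d + TX_BD_NUM - c - 1) % TX_BD_NUM;
}

/* Producer side: returns false if no slot is free. */
static bool ring_tx(struct ring *r)
{
	unsigned int c;

	if (ring_avail(r) == 0)
		return false;

	c = atomic_load_explicit(&r->curr, memory_order_relaxed);
	/* A real driver would fill the descriptor for slot c here. */
	atomic_store_explicit(&r->curr, (c + 1) % TX_BD_NUM,
			      memory_order_release);

	/* Full fence: publish curr before reading dirty for the check. */
	atomic_thread_fence(memory_order_seq_cst);

	if (ring_avail(r) == 0) {
		atomic_store(&r->stopped, true);
		/* Re-check after stopping, in case the cleaner ran meanwhile. */
		atomic_thread_fence(memory_order_seq_cst);
		if (ring_avail(r) > 0)
			atomic_store(&r->stopped, false);
	}
	return true;
}

/* Consumer side: reclaims every slot up to curr, returns how many. */
static unsigned int ring_clean(struct ring *r)
{
	unsigned int curr = atomic_load_explicit(&r->curr, memory_order_acquire);
	unsigned int d = atomic_load_explicit(&r->dirty, memory_order_relaxed);
	unsigned int n = 0;

	assert(d < TX_BD_NUM);
	while (d != curr) {
		d = (d + 1) % TX_BD_NUM;
		n++;
	}
	atomic_store_explicit(&r->dirty, d, memory_order_release);

	/* Full fence: publish dirty before reading the stopped flag. */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load(&r->stopped) && ring_avail(r) > 0)
		atomic_store(&r->stopped, false);
	return n;
}
```

In this model both sides use a full fence between their own index update
and the read of the other side's state, and the producer re-checks the
free space after setting the stopped flag, so a wake-up cannot be lost if
the cleaner runs between the check and the stop.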