linux-kernel - Re: [PATCH] bcm63xx_enet: fix poll callback.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <54F5DCAF.9070102@redhat.com>
Date:	Tue, 03 Mar 2015 08:09:19 -0800
From:	Alexander Duyck <alexander.h.duyck@...hat.com>
To:	Eric Dumazet <eric.dumazet@...il.com>,
	Nicolas Schichan <nschichan@...ebox.fr>
CC:	"David S. Miller" <davem@...emloft.net>,
	Tobias Klauser <tklauser@...tanz.ch>,
	Felipe Balbi <balbi@...com>,
	Wilfried Klaebe <w-lkml@...enslange-mailadresse.de>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bcm63xx_enet: fix poll callback.


On 03/03/2015 05:42 AM, Eric Dumazet wrote:
> On Tue, 2015-03-03 at 05:29 -0800, Eric Dumazet wrote:
>> On Tue, 2015-03-03 at 12:45 +0100, Nicolas Schichan wrote:
>>> In case there was some tx buffer reclaimed and not enough rx packets
>>> to consume the whole budget, napi_complete would not be called and
>>> interrupts would be kept disabled, effectively resulting in the
>>> network core never to call the poll callback again and no rx/tx
>>> interrupts to be fired either.
>>>
>>> Fix that by only accounting the rx work done in the poll callback.
>>>
>>> Signed-off-by: Nicolas Schichan <nschichan@...ebox.fr>
>>> ---
>> This looks better, thanks.
>>
>> Acked-by: Eric Dumazet <edumazet@...gle.com>
>>
>> Note that the way bcm_enet_tx_reclaim() is written, it can livelock on
>> SMP hosts :
>>
>> CPU 1,2,3,...  keep queuing packets via ndo_start_xmit()
>>
>> CPU 0 is looping forever in bcm_enet_tx_reclaim() draining queue and
>> freeing skbs.
>>
>> To avoid that, I would take priv->tx_lock only once, or add a limit on
>> the number of skbs that can be drained per round.
> Something like this (untested) patch
>
> diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> index 21206d33b638..9e8e83865e52 100644
> --- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> +++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> @@ -429,29 +429,23 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
>    */
>   static int bcm_enet_tx_reclaim(struct net_device *dev, int force)
>   {
> -	struct bcm_enet_priv *priv;
> -	int released;
> +	struct bcm_enet_priv *priv = netdev_priv(dev);
> +	int released = 0;
>   
> -	priv = netdev_priv(dev);
> -	released = 0;
> +	spin_lock(&priv->tx_lock);
>   
>   	while (priv->tx_desc_count < priv->tx_ring_size) {
>   		struct bcm_enet_desc *desc;
>   		struct sk_buff *skb;
>   
> -		/* We run in a bh and fight against start_xmit, which
> -		 * is called with bh disabled  */
> -		spin_lock(&priv->tx_lock);
> -
>   		desc = &priv->tx_desc_cpu[priv->tx_dirty_desc];
>   
> -		if (!force && (desc->len_stat & DMADESC_OWNER_MASK)) {
> -			spin_unlock(&priv->tx_lock);
> +		if (!force && (desc->len_stat & DMADESC_OWNER_MASK))
>   			break;
> -		}
>   
>   		/* ensure other field of the descriptor were not read
> -		 * before we checked ownership */
> +		 * before we checked ownership
> +		 */
>   		rmb();
>   

This rmb() can probably be replaced with a dma_rmb() since it is just a 
coherent/coherent ordering you are concerned with based on the comment.

- Alex
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/