lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 8 Jan 2014 21:18:59 +0000
From:	Zoltan Kiss <zoltan.kiss@...rix.com>
To:	Ma JieYue <majieyue@...il.com>, <netdev@...r.kernel.org>,
	<xen-devel@...ts.xen.org>
CC:	Ma JieYue <jieyue.majy@...baba-inc.com>,
	Wang Yingbin <yingbin.wangyb@...baba-inc.com>,
	Fu Tienan <tienan.ftn@...baba-inc.com>,
	"Wei Liu" <wei.liu2@...rix.com>,
	Ian Campbell <ian.campbell@...rix.com>,
	"David Vrabel" <david.vrabel@...rix.com>
Subject: Re: [PATCH net] xen-netback: fix vif tx queue race in xenvif_rx_interrupt

Hi,

With Paul's recent flow control improvement I think this became invalid:

http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=ca2f09f2b2c6c25047cfc545d057c4edfcfe561c

Zoli

On 08/01/14 19:24, Ma JieYue wrote:
> From: Ma JieYue <jieyue.majy@...baba-inc.com>
>
> There is a race when waking up or stopping xenvif tx queue, and it leads to
> unnecessary packet drop. The problem is that the rx ring still full when entering
> into xenvif_start_xmit. In fact, in xenvif_rx_interrupt, the netif_wake_queue
> may be called not just after the ring is not full any more, so the operation
> is not atomic. Here is part of the debug log when the race scenario happened:
>
> wake_queue: req_cons_peek 2679757 req_cons 2679586 req_prod 2679841
> stop_queue: req_cons_peek 2679837 req_cons 2679757 req_prod 2679841
> [tx_queue_stopped true]
> wake_queue: req_cons_peek 2679837 req_cons 2679757 req_prod 2679841
> [tx_queue_stopped false]
> drop packet: req_cons_peek 2679837 req_cons 2679757 req_prod 2679841
>
> The debug log was written, every time right after netif_wake_queue been called
> in xenvif_rx_interrupt, every time after netif_stop_queue been called in
> xenvif_start_xmit and every time packet drop happened in xenvif_start_xmit.
> As we can see, the second wake_queue appeared in the place it should not be, and
> we believed the ring had been checked before the stop_queue, but the actual
> wake_queue action didn't follow, and took place after the stop_queue, so that when
> entering into xenvif_start_xmit the ring was full but the queue was not stopped.
>
> The patch fixes the race by checking if tx queue stopped, before trying to
> wake it up in xenvif_rx_interrupt. It only wakes the queue when it is stopped,
> as well as it is not full and schedulable.
>
> Signed-off-by: Ma JieYue <jieyue.majy@...baba-inc.com>
> Signed-off-by: Wang Yingbin <yingbin.wangyb@...baba-inc.com>
> Signed-off-by: Fu Tienan <tienan.ftn@...baba-inc.com>
> Cc: Wei Liu <wei.liu2@...rix.com>
> Cc: Ian Campbell <ian.campbell@...rix.com>
> Cc: David Vrabel <david.vrabel@...rix.com>
> ---
>   drivers/net/xen-netback/interface.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
> index fff8cdd..e099f62 100644
> --- a/drivers/net/xen-netback/interface.c
> +++ b/drivers/net/xen-netback/interface.c
> @@ -105,7 +105,7 @@ static irqreturn_t xenvif_rx_interrupt(int irq, void *dev_id)
>   {
>   	struct xenvif *vif = dev_id;
>
> -	if (xenvif_rx_schedulable(vif))
> +	if (netif_queue_stopped(vif->dev) && xenvif_rx_schedulable(vif))
>   		netif_wake_queue(vif->dev);
>
>   	return IRQ_HANDLED;
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ