netdev - Re: [PATCH 2/3] xen-netback: fix unlimited guest Rx internal queue and carrier flapping

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20141023114017.GG9188@zion.uk.xensource.com>
Date:	Thu, 23 Oct 2014 12:40:17 +0100
From:	Wei Liu <wei.liu2@...rix.com>
To:	David Vrabel <david.vrabel@...rix.com>
CC:	<netdev@...r.kernel.org>, <xen-devel@...ts.xenproject.org>,
	Ian Campbell <ian.campbell@...rix.com>,
	Wei Liu <wei.liu2@...rix.com>
Subject: Re: [PATCH 2/3] xen-netback: fix unlimited guest Rx internal queue
 and carrier flapping

On Wed, Oct 22, 2014 at 02:08:54PM +0100, David Vrabel wrote:
> Netback needs to discard old to-guest skb's (guest Rx queue drain) and
> it needs detect guest Rx stalls (to disable the carrier so packets are
> discarded earlier), but the current implementation is very broken.
> 
> 1. The check in hard_start_xmit of the slot availability did not
>    consider the number of packets that were already in the guest Rx
>    queue.  This could allow the queue to grow without bound.
> 
>    The guest stops consuming packets and the ring was allowed to fill
>    leaving S slot free.  Netback queues a packet requiring more than S
>    slots (ensuring that the ring stays with S slots free).  Netback
>    queue indefinately packets provided that then require S or fewer
>    slots.
> 
> 2. The Rx stall detection is not triggered in this case since the
>    (host) Tx queue is not stopped.
> 
> 3. If the Tx queue is stopped and a guest Rx interrupt occurs, netback
>    will consider this an Rx purge event which may result in it taking
>    the carrier down unnecessarily.  It also considers a queue with
>    only 1 slot free as unstalled (even though the next packet might
>    not fit in this).
> 
> The internal guest Rx queue is limited by a byte length (to 512 Kib,
> enough for half the ring).  The (host) Tx queue is stopped and started
> based on this limit.  This sets an upper bound on the amount of memory
> used by packets on the internal queue.
> 
> This allows the estimatation of the number of slots for an skb to be
> removed (it wasn't a very good estimate anyway).  Instead, the guest

+1 for this.

> Rx thread just waits for enough free slots for a maximum sized packet.
> 
> skbs queued on the internal queue have an 'expires' time (set to the
> current time plus the drain timeout).  The guest Rx thread will detect
> when the skb at the head of the queue has expired and discard expired
> skbs.  This sets a clear upper bound on the length of time an skb can
> be queued for.  For a guest being destroyed the maximum time needed to
> wait for all the packets it sent to be dropped is still the drain
> timeout (10 s) since it will not be sending new packets.
> 
> Rx stall detection is reintroduced in a later commit.
> 
> Signed-off-by: David Vrabel <david.vrabel@...rix.com>

Reviewed-by: Wei Liu <wei.liu2@...rix.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html