netdev - Re: [net-next v2] xen-netback: improve guest-receive-side flow control

Message-ID: <20131209111227.GH32155@zion.uk.xensource.com>
Date:	Mon, 9 Dec 2013 11:12:27 +0000
From:	Wei Liu <wei.liu2@...rix.com>
To:	Paul Durrant <paul.durrant@...rix.com>
CC:	<xen-devel@...ts.xen.org>, <netdev@...r.kernel.org>,
	Wei Liu <wei.liu2@...rix.com>,
	Ian Campbell <ian.campbell@...rix.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Annie Li <annie.li@...cle.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Subject: Re: [net-next v2] xen-netback: improve guest-receive-side flow control

On Fri, Dec 06, 2013 at 04:36:07PM +0000, Paul Durrant wrote:
> The way that flow control works without this patch is that, in start_xmit()
> the code uses xenvif_count_skb_slots() to predict how many slots
> xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek'
> counter which it then uses to determine if the shared ring has that amount
> of space available by checking whether 'req_prod' has passed that value.
> If the ring doesn't have space the tx queue is stopped.
> xenvif_gop_skb() will then consume slots and update 'req_cons' and issue
> responses, updating 'rsp_prod' as it goes. The frontend will consume those
> responses and post new requests, by updating req_prod. So, req_prod chases
> req_cons which chases rsp_prod, and can never exceed that value. Thus if
> xenvif_count_skb_slots() ever returns a number of slots greater than
> xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot
> possibly achieve (since it's limited by the 'real' req_cons) and, if this
> happens enough times, req_cons_peek gets more than a ring size ahead of
> req_cons and the tx queue then remains stopped forever waiting for an
> unachievable amount of space to become available in the ring.
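
For illustration, the stall described above can be reproduced with a toy
model. This is only a sketch, not the netback code: RING_SIZE, the function
name packets_until_stall() and the assumption that the frontend reposts
requests immediately after every response are all hypothetical. 'used' stands
for what xenvif_gop_skb() really consumes per packet and 'predicted' for what
xenvif_count_skb_slots() returned.

```c
#include <assert.h>

#define RING_SIZE 256u /* assumed ring size, for illustration only */

/*
 * Toy model of the pre-patch accounting (not the netback code itself).
 * Returns how many packets get through before the tx queue stalls, or
 * max_packets if it never stalls within that many packets.
 */
unsigned int packets_until_stall(unsigned int used,
                                 unsigned int predicted,
                                 unsigned int max_packets)
{
    unsigned int req_cons = 0;         /* slots really consumed */
    unsigned int req_cons_peek = 0;    /* slots the estimate has reserved */
    unsigned int req_prod = RING_SIZE; /* frontend starts with a full ring */
    unsigned int n;

    assert(used > 0 && used <= predicted && predicted <= RING_SIZE);

    for (n = 0; n < max_packets; n++) {
        /* Signed difference: the peek counter may overtake req_prod. */
        int available = (int)(req_prod - req_cons_peek);

        if (available < (int)predicted)
            return n; /* queue stopped; req_prod is already at its limit */

        req_cons_peek += predicted; /* start_xmit() side: reserve estimate */
        req_cons += used;           /* gop side: consume the real amount */

        /* Responses are issued and the frontend reposts immediately,
         * so req_prod sits exactly one ring size ahead of req_cons. */
        req_prod = req_cons + RING_SIZE;
    }
    return max_packets;
}
```

In this model an exact estimate (used == predicted) never stalls, while an
overestimate of a single slot per packet lets req_cons_peek creep ahead by
one slot each time until the remaining gap is smaller than one packet's
estimate. At that point req_prod cannot advance any further, so the queue is
never woken.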
> 
> Having two routines trying to calculate the same value is always going to be
> fragile, so this patch does away with that. All we essentially need to do is
> make sure that we have 'enough stuff' on our internal queue without letting
> it build up uncontrollably. So start_xmit() makes a cheap optimistic check
> of how much space is needed for an skb and only turns the queue off if that
> is unachievable. net_rx_action() is the place where we could do with an
> accurate prediction but, since that has proven tricky to calculate, a cheap
> worst-case (but not too bad) estimate is all we really need since the only
> thing we *must* prevent is xenvif_gop_skb() consuming more slots than are
> available.
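
As a rough sketch of the two kinds of estimate mentioned above, and not the
calculation the patch itself uses: the helper names min_slots_needed() and
max_slots_needed(), the 4096-byte PAGE_SIZE, the one extra slot for GSO
metadata, and the assumption that every fragment starts page-aligned are all
hypothetical. The optimistic figure assumes the data packs perfectly into
pages; the pessimistic one rounds the head and each fragment up separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SIZE 4096u /* assumed page size */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Optimistic estimate: total length packed tightly into pages. */
unsigned int min_slots_needed(unsigned int len, bool is_gso)
{
    /* Assumption: GSO metadata costs one extra slot. */
    return DIV_ROUND_UP(len, PAGE_SIZE) + (is_gso ? 1u : 0u);
}

/*
 * Pessimistic estimate: the linear area (including its offset into the
 * first page) and every fragment are rounded up to whole pages on their
 * own. Fragments are assumed to start page-aligned.
 */
unsigned int max_slots_needed(unsigned int head_offset, unsigned int head_len,
                              const unsigned int *frag_sizes, size_t nr_frags,
                              bool is_gso)
{
    unsigned int slots = DIV_ROUND_UP(head_offset + head_len, PAGE_SIZE);
    size_t i;

    for (i = 0; i < nr_frags; i++)
        slots += DIV_ROUND_UP(frag_sizes[i], PAGE_SIZE);

    if (is_gso)
        slots++; /* same assumed extra slot as above */

    return slots;
}
```

The cheap lower bound is enough to decide whether stopping the queue is worth
it, and the upper bound only has to guarantee that the consumer never starts
an skb it cannot finish with the slots available.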
> 
> Without this patch I can trivially stall netback permanently by just doing
> a large guest to guest file copy between two Windows Server 2008R2 VMs on a
> single host.
> 
> Patch tested with frontends in:
> - Windows Server 2008R2
> - CentOS 6.0
> - Debian Squeeze
> - Debian Wheezy
> - SLES11
> 

Looks good to me. And given that it has been tested with several
frontends:

Acked-by: Wei Liu <wei.liu2@...rix.com>

Thanks
Wei.