Message-ID: <20131209111227.GH32155@zion.uk.xensource.com>
Date: Mon, 9 Dec 2013 11:12:27 +0000
From: Wei Liu <wei.liu2@...rix.com>
To: Paul Durrant <paul.durrant@...rix.com>
CC: <xen-devel@...ts.xen.org>, <netdev@...r.kernel.org>,
Wei Liu <wei.liu2@...rix.com>,
Ian Campbell <ian.campbell@...rix.com>,
David Vrabel <david.vrabel@...rix.com>,
Annie Li <annie.li@...cle.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Subject: Re: [net-next v2] xen-netback: improve guest-receive-side flow control
On Fri, Dec 06, 2013 at 04:36:07PM +0000, Paul Durrant wrote:
> The way that flow control works without this patch is that, in start_xmit()
> the code uses xenvif_count_skb_slots() to predict how many slots
> xenvif_gop_skb() will consume and then adds this to a 'req_cons_peek'
> counter which it then uses to determine if the shared ring has that amount
> of space available by checking whether 'req_prod' has passed that value.
> If the ring doesn't have space the tx queue is stopped.
> xenvif_gop_skb() will then consume slots and update 'req_cons' and issue
> responses, updating 'rsp_prod' as it goes. The frontend will consume those
> responses and post new requests, by updating req_prod. So, req_prod chases
> req_cons which chases rsp_prod, and can never exceed that value. Thus if
> xenvif_count_skb_slots() ever returns a number of slots greater than
> xenvif_gop_skb() uses, req_cons_peek will get to a value that req_prod cannot
> possibly achieve (since it's limited by the 'real' req_cons) and, if this
> happens enough times, req_cons_peek gets more than a ring size ahead of
> req_cons and the tx queue then remains stopped forever waiting for an
> unachievable amount of space to become available in the ring.
>
> Having two routines trying to calculate the same value is always going to be
> fragile, so this patch does away with that. All we essentially need to do is
> make sure that we have 'enough stuff' on our internal queue without letting
> it build up uncontrollably. So start_xmit() makes a cheap optimistic check
> of how much space is needed for an skb and only turns the queue off if that
> is unachievable. net_rx_action() is the place where we could do with an
> accurate prediction but, since that has proven tricky to calculate, a cheap
> worst-case (but not too bad) estimate is all we really need since the only
> thing we *must* prevent is xenvif_gop_skb() consuming more slots than are
> available.
>
> Without this patch I can trivially stall netback permanently by just doing
> a large guest to guest file copy between two Windows Server 2008R2 VMs on a
> single host.
>
> Patch tested with frontends in:
> - Windows Server 2008R2
> - CentOS 6.0
> - Debian Squeeze
> - Debian Wheezy
> - SLES11
>
Looks good to me. And given that it has been tested with several
frontends:
Acked-by: Wei Liu <wei.liu2@...rix.com>
Thanks
Wei.
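
For readers following the thread, the stall described in the first quoted
paragraph can be illustrated with a toy model in C. This is only a sketch
and not netback code: struct ring_model, try_queue_skb() and
packets_until_stall() are hypothetical names, the counters are plain longs
that never wrap, the ring size of 256 is an assumption, and the frontend is
assumed to repost one request for every response immediately.

```c
#include <stdbool.h>

#define RING_SIZE 256L /* assumed ring size for this model */

/* Hypothetical, simplified view of the shared ring counters. */
struct ring_model {
    long req_prod;      /* requests posted by the frontend */
    long req_cons;      /* requests really consumed by the backend */
    long rsp_prod;      /* responses issued by the backend */
    long req_cons_peek; /* old scheme: predicted future req_cons */
};

/* Frontend consumes every response and reposts, so req_prod can be
 * at most RING_SIZE ahead of rsp_prod. */
static void frontend_refill(struct ring_model *r)
{
    r->req_prod = r->rsp_prod + RING_SIZE;
}

/* Try to queue one skb. 'estimate' is the predicted slot count and
 * 'used' is what is really consumed (the model assumes used <= estimate).
 * Returns false when the tx queue would be stopped. */
static bool try_queue_skb(struct ring_model *r, bool use_peek,
                          long estimate, long used)
{
    if (use_peek) {
        /* Old scheme: compare req_prod against the running prediction. */
        if (r->req_prod < r->req_cons_peek + estimate)
            return false;
        r->req_cons_peek += estimate;
    } else {
        /* Patched idea: compare against the real req_cons only. */
        if (r->req_prod - r->req_cons < estimate)
            return false;
    }
    r->req_cons += used;        /* slots actually consumed */
    r->rsp_prod = r->req_cons;  /* one response per consumed slot */
    frontend_refill(r);
    return true;
}

/* Count how many skbs go through before the queue stops, up to 'limit'.
 * In this model a stop is permanent, because req_prod only advances
 * after the backend issues responses. */
static long packets_until_stall(bool use_peek, long estimate, long used,
                                long limit)
{
    struct ring_model r = {0, 0, 0, 0};
    long sent = 0;

    frontend_refill(&r);
    while (sent < limit && try_queue_skb(&r, use_peek, estimate, used))
        sent++;
    return sent;
}
```

In this model, over-estimating by a single slot per skb (estimate 3, used 2)
lets req_cons_peek drift one slot per packet, so
packets_until_stall(true, 3, 2, 100000) stops after 254 packets, while the
same over-estimate checked against the real req_cons,
packets_until_stall(false, 3, 2, 100000), runs to the limit.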