Message-ID: <9AAE0902D5BC7E449B7C8E4E778ABCD029DDEA@AMSPEX01CL01.citrite.net>
Date: Fri, 28 Mar 2014 09:30:27 +0000
From: Paul Durrant <Paul.Durrant@...rix.com>
To: Sander Eikelenboom <linux@...elenboom.it>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Wei Liu <wei.liu2@...rix.com>,
Ian Campbell <Ian.Campbell@...rix.com>,
"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
clause from if statement
> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@...elenboom.it]
> Sent: 27 March 2014 19:23
> To: Paul Durrant
> Cc: netdev@...r.kernel.org; Wei Liu; Ian Campbell; xen-devel@...ts.xen.org
> Subject: Re: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
>
>
> Thursday, March 27, 2014, 7:34:54 PM, you wrote:
>
> > <big snip>
>
> >>> >>
> >>> >> > So, it may be that the worst-case estimate is now too bad. In the case
> >>> >> > where it's failing for you it would be nice to know what the estimate was.
> >>>
> >>>
> >>> > Ok, so we cannot be too pessimistic. In that case I don't see there's a lot
> >>> > of choice but to stick with the existing DIV_ROUND_UP (i.e. don't assume
> >>> > start_new_rx_buffer() returns true every time) and just add the extra 1.
> >>>
> >>> Hrmm I don't like a "magic" 1 bonus slot, there must be some theoretical
> >>> backing.
>
> > I don't like it either, but theory suggested each frag should take no more
> > space than the original DIV_ROUND_UP and that proved to be wrong, but I
> > cannot figure out why.
>
> >>> And since the original problem always seemed to occur on a packet with a
> >>> single large frag, I'm wondering if this 1 would actually be correct in
> >>> other cases.
>
> > That's why I went for an extra 1 per frag... pessimal slot packing, i.e. a
> > 2-byte frag may span 2 slots, PAGE_SIZE + 2 bytes may span 3, etc. etc.
>
> > In what situation may a 2-byte frag span 2 slots ?
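
Illustrating the straddle case -- a minimal sketch, not the actual
xen-netback code (PAGE_SIZE and DIV_ROUND_UP as in the kernel headers):

	/*
	 * A frag is a page offset plus a size; one that starts near the
	 * end of a page touches one page more than its size alone
	 * suggests, e.g. offset = PAGE_SIZE - 1, size = 2 covers two
	 * pages and hence two slots.
	 */
	static unsigned int frag_slots(unsigned int offset, unsigned int size)
	{
		return DIV_ROUND_UP(offset % PAGE_SIZE + size, PAGE_SIZE);
	}

The worst case over all offsets is therefore DIV_ROUND_UP(size,
PAGE_SIZE) + 1, which is where the "extra 1 per frag" above comes from.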
>
> > At least there must be a theoretical cap to the number of slots needed ..
> > - assuming an SKB can contain only 65535 bytes
> > - assuming a slot can take max PAGE_SIZE and frags are split into
> >   PAGE_SIZE pieces ..
>
> > - it could only max contain 15 PAGE_SIZE slots if nicely aligned ..
> > - double it .. and at 30 we still wouldn't be near that 52 estimate, and I
> >   don't know the ring size, but wasn't that 32 ? So if the ring gets fully
> >   drained we shouldn't stall there.
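
(Checking that arithmetic, assuming 4 KiB pages: a 65535-byte SKB needs
DIV_ROUND_UP(65535, 4096) = 16 page-sized slots when perfectly packed,
and at most one straddle slot per frag doubles that to 32, so whether the
worst case fits depends on the actual ring size.)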
>
>
> >>> Well this is what I said earlier on .. it's hard to estimate upfront if
> >>> "start_new_rx_buffer()" will return true, and how many times that is
> >>> possible per frag .. and if that is possible for only 1 frag or for all
> >>> frags.
> >>>
> >>> The problem is now replaced from packets with 1 large frag (for which it
> >>> didn't account properly leading to a too small estimate) .. to packets
> >>> with a large number of (smaller) frags .. leading to a too large over
> >>> estimation.
> >>>
> >>> So would there be a theoretical maximum for how often that path could be
> >>> hit, based on a combination of sizes (total size of all frags, nr_frags,
> >>> size per frag) ?
> >>> - if you hit "start_new_rx_buffer()" == true in the first frag .. could
> >>>   you hit it in a next frag ?
> >>> - could it be limited due to something like the packet_size / nr_frags /
> >>>   page_size ?
> >>>
> >>> And what was wrong with the previous calculation ?
> >>> 	int max = DIV_ROUND_UP(vif->dev->mtu, PAGE_SIZE);
> >>> 	if (vif->can_sg || vif->gso_mask || vif->gso_prefix_mask)
> >>> 		max += MAX_SKB_FRAGS + 1; /* extra_info + frags */
> >>>
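
(For scale, assuming 4 KiB pages and a 1500-byte MTU: MAX_SKB_FRAGS =
65536/4096 + 2 = 18, so the old formula works out to DIV_ROUND_UP(1500,
4096) + 18 + 1 = 20 slots per packet.)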
>
> >> This is not safe if frag size can be > PAGE_SIZE.
>
> > #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)
>
> > So if one of the frags is > PAGE_SIZE .. wouldn't that imply that we have
> > nr_frags < MAX_SKB_FRAGS because we are limited by the total packet size ?
> > (so we would spare a slot since we have a frag less .. but spend one more
> > because we have a frag that needs 2 slots ?)
>
> > (and that this should even be pessimistic since we didn't subtract the
> > header etc from the max total packet size ?)
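
(Put concretely, assuming 4 KiB pages: an 8 KiB frag costs 2 slots
instead of 1, but it also consumes 8 KiB of the 65535-byte total that
would otherwise have been carried by two separate frags costing 2 slots
anyway, so the slot total should not grow.)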
>
>
> > So from what I said earlier, you could probably do the pessimistic estimate
> > (that would help when packets have a small skb->data_len (space occupied by
> > frags)), so the estimate would be less than the old one based on
> > MAX_SKB_FRAGS, causing the packet to be processed earlier. And CAP it using
> > the old way, since a packet should never be able to use more slots than
> > that theoretical max_slots (which hopefully is less than the ring size, so
> > a packet can always be processed if the ring is finally emptied).
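
A sketch of that "pessimistic estimate, capped" idea -- hypothetical
code, not the actual patch; the variable name and the cap value are
illustrative:

	unsigned int i;
	unsigned int max_slots_needed =
		DIV_ROUND_UP(skb_headlen(skb), PAGE_SIZE);

	/* Pessimistic: every frag may straddle one extra slot boundary. */
	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		unsigned int size = skb_frag_size(&skb_shinfo(skb)->frags[i]);

		max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE) + 1;
	}

	if (skb_is_gso(skb))
		max_slots_needed++;	/* GSO extra_info slot */

	/* Cap at the old theoretical bound so a fully drained ring can
	 * always take the packet. */
	if (max_slots_needed > MAX_SKB_FRAGS + 1)
		max_slots_needed = MAX_SKB_FRAGS + 1;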
>
>
> >>> That perhaps also misses some theoretical backing: what if it would have
> >>> (MAX_SKB_FRAGS - 1) nr_frags, but larger ones that have to be split to
> >>> fit in a slot. Or is the total size of frags an skb can carry limited to
> >>> MAX_SKB_FRAGS * PAGE_SIZE ? .. then you would expect that MAX_SKB_FRAGS
> >>> is an upper limit.
> >>> (and you could do the new check maxed by MAX_SKB_FRAGS so it doesn't
> >>> get to a too large, non-reachable estimate).
> >>>
> >>> But as a side question .. the whole "get_next_rx_buffer()" path is needed
> >>> for when a frag could not fit in a slot as a whole ?
>
>
> >> Perhaps it would be best to take the hit on copy_ops and just tightly
> >> pack, so we only start a new slot when the current one is completely full;
> >> then actual slots would simply be DIV_ROUND_UP(skb->len, PAGE_SIZE)
> >> (+ 1 for the extra if it's a GSO).
>
> > Don't know if, and how much of, a performance penalty that would be.
>
> >> Paul
>
> Hmm, since I now started to dig around a bit more ..
>
> The ring size seems to be determined by netfront and not by netback ?
> Couldn't this lead to problems when PAGE_SIZE dom0 != PAGE_SIZE domU
> (and potentially lead to an overrun and therefore problems on the HOST) ?
>
> And about the commit message from
> ca2f09f2b2c6c25047cfc545d057c4edfcfe561c ...
> Do I understand it correctly that you saw the original problem (stall on
> large file copy) only on a "Windows Server 2008R2", probably with PV
> drivers ?
>
Yes, with PV drivers as you say.
> I don't see why the original calculation wouldn't work, so what kind of
> packets (nr_frags, frag size and PAGE_SIZE) caused it ?
>
> And could you retest whether that "Windows Server 2008R2" works with a
> netback with your latest patch series (pessimistic estimate) plus a cap on
> max_slots_needed like:
>
> 	if (max_slots_needed > MAX_SKB_FRAGS + 1)
> 		max_slots_needed = MAX_SKB_FRAGS + 1;
The behaviour of the Windows frontend is different to netfront; it tries to
keep the shared ring as full as possible, so the estimate could be as
pessimistic as you like (as long as it doesn't exceed ring size ;-)) and
you'd never see the lock-up. For some reason (best known to the originator
of the code, I suspect) the Linux netfront driver limits the number of
requests it posts into the shared ring, leading to the possibility of
lock-up in the case where the backend needs more slots than the frontend
'thinks' it should.
Paul