Date:	Fri, 28 Mar 2014 10:20:25 +0000
From:	Paul Durrant <Paul.Durrant@...rix.com>
To:	David Laight <David.Laight@...LAB.COM>,
	Sander Eikelenboom <linux@...elenboom.it>
CC:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Wei Liu <wei.liu2@...rix.com>,
	Ian Campbell <Ian.Campbell@...rix.com>,
	"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
 clause from if statement

> -----Original Message-----
> From: netdev-owner@...r.kernel.org [mailto:netdev-
> owner@...r.kernel.org] On Behalf Of David Laight
> Sent: 28 March 2014 10:01
> To: Paul Durrant; Sander Eikelenboom
> Cc: netdev@...r.kernel.org; Wei Liu; Ian Campbell; xen-devel@...ts.xen.org
> Subject: RE: [Xen-devel] [PATCH net v2 1/3] xen-netback: remove pointless
> clause from if statement
> 
> From: Paul Durrant
> ...
> > > The behaviour of the Windows frontend is different to netfront; it tries to
> > > keep the shared ring as full as possible so the estimate could be as
> > > pessimistic as you like (as long as it doesn't exceed ring size ;-)) and you'd
> > > never see the lock-up. For some reason (best known to the originator of the
> > > code I suspect) the Linux netfront driver limits the number of requests it
> > > posts into the shared ring, leading to the possibility of lock-up in the case
> > > where the backend needs more slots than the frontend 'thinks' it should.
> > > But from what I read the ring size is determined by the frontend .. so that PV
> > > driver should be able to guarantee that itself ..
> > >
> >
> > The ring size is 256 - that's baked in. The number of pending requests
> > available to backend *is* determined by the frontend.
> >
> > > Which begs the question .. was that change of the max_slots_needed
> > > calculation *needed* to prevent the problem you saw on "Windows Server
> > > 2008R2", or was that just changed for correctness ?
> > >
> >
> > It was changed for correctness. As I understand it, use of MAX_SKB_FRAGS
> > is incorrect if compound pages are in use as the page size is no longer
> > the slot size. It's also wasteful to always wait for space for a maximal
> > packet if the packet you have is smaller, so the intention of the max
> > estimate was that it should be at least the number of slots required but
> > not excessive. I think you've proved that making such an estimate is just
> > too hard and, since we don't want to fall back to the old dry-run style
> > of slot counting (which meant you had two codepaths that *must* arrive at
> > the same number - and they didn't, which is why I was getting the lock-up
> > with Windows guests), I think we should just go with full-packing so that
> > we don't need to estimate.
> 
> A reasonable high estimate for the number of slots required for a specific
> message is 'frag_count + total_size/4096'.
> So if there are that many slots free it is definitely ok to add the message.
> 

Hmm, that may work. By total_size, I assume you mean skb->len, so that calculation is based on an overhead of 1 non-optimally packed slot per frag. There'd still need to be a +1 for the GSO 'extra' though.
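
To make the arithmetic concrete, here is a minimal sketch of that estimate (the helper name is made up and 4096 stands in for the slot size; this is not the actual xen-netback code, and it rounds the length up so it stays a safe upper bound for short packets):

/*
 * Sketch only: a pessimistic upper bound on the slots needed for one skb,
 * following the 'frag_count + total_size/4096' suggestion above plus one
 * slot for the GSO 'extra'.
 */
#include <linux/kernel.h>
#include <linux/skbuff.h>

static unsigned int sketch_slots_needed(const struct sk_buff *skb)
{
	/* one slot of packing overhead per frag, plus the data itself */
	unsigned int slots = skb_shinfo(skb)->nr_frags +
			     DIV_ROUND_UP(skb->len, 4096);

	/* extra slot for the GSO prefix if this is a GSO skb */
	if (skb_is_gso(skb))
		slots++;

	return slots;
}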

> I can see a more general problem for transmits.
> I believe a NAPI driver is supposed to indicate that it can't accept
> a tx packet in advance of being given a specific packet to transmit.
> This means it has to keep enough tx ring space for a worst case packet
> (which in some cases can be larger than 1+MAX_SKB_FRAGS) even though
> such a packet is unlikely.
> I would be tempted to save the skb that 'doesn't fit' within the driver
> rather than try to second guess the number of fragments the next packet
> will need.
> 

Well, we avoid that by having an internal queue and then only stopping the tx queue if the skb we were just handed will definitely not fit. TBH though, I think this internal queue is problematic as we require a context switch to get the skbs into the shared ring and I think the extra latency caused by this is hitting performance. If we do get rid of it then we do need to worry about the size of a maximal skb again.
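
For illustration only, a rough sketch of that 'accept into an internal queue, stop only when this skb definitely won't fit' pattern; the structure and helper names below are invented for the example and are not the real xen-netback internals:

#include <linux/kernel.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Illustrative state only - not the real struct xenvif. */
struct sketch_vif {
	struct sk_buff_head internal_queue;	/* skbs waiting for the ring-drain context */
	unsigned int ring_slots_free;		/* maintained by the completion/dealloc path */
};

static netdev_tx_t sketch_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct sketch_vif *vif = netdev_priv(dev);
	/* same style of estimate as sketched earlier in the thread */
	unsigned int needed = skb_shinfo(skb)->nr_frags +
			      DIV_ROUND_UP(skb->len, 4096) +
			      (skb_is_gso(skb) ? 1 : 0);

	/* always accept the skb into the internal queue ... */
	skb_queue_tail(&vif->internal_queue, skb);

	/*
	 * ... and only stop the netdev queue if the packet we were just
	 * handed will definitely not fit in the shared ring right now;
	 * the completion path would netif_wake_queue() once slots free up.
	 */
	if (vif->ring_slots_free < needed)
		netif_stop_queue(dev);

	return NETDEV_TX_OK;
}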

  Paul

> FWIW the USB3 'bulk' driver has the same problem, fragments can't cross
> 64k boundaries.
> 
> 	David
> 
