Message-ID: <1351085403.6537.102.camel@edumazet-glaptop>
Date: Wed, 24 Oct 2012 15:30:03 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ian Campbell <Ian.Campbell@...rix.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
    Eric Dumazet <edumazet@...gle.com>,
    Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
    "xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: Re: [PATCH] net: allow configuration of the size of page in __netdev_alloc_frag

On Wed, 2012-10-24 at 14:16 +0100, Ian Campbell wrote:
> On Wed, 2012-10-24 at 13:28 +0100, Eric Dumazet wrote:
> > On Wed, 2012-10-24 at 12:42 +0100, Ian Campbell wrote:
> > > The commit 69b08f62e174 "net: use bigger pages in __netdev_alloc_frag"
> > > lead to 70%+ packet loss under Xen when transmitting from physical (as
> > > opposed to virtual) network devices.
> > >
> > > This is because under Xen pages which are contiguous in the physical
> > > address space may not be contiguous in the DMA space, in fact it is
> > > very likely that they are not. I think there are other architectures
> > > where this is true, although perhaps non quite so aggressive as to
> > > have this property at a per-order-0-page granularity.
> > >
> > > The real underlying bug here most likely lies in the swiotlb not
> > > correctly handling compound pages, and Konrad is investigating this.
> > > However even with the swiotlb issue fixed the current arrangement
> > > seems likely to result in a lot of bounce buffering which seems likely
> > > to more than offset any benefit from the use of larger pages.
> > >
> > > Therefore make NETDEV_FRAG_PAGE_MAX_ORDER configurable at runtime and
> > > use this to request order-0 frags under Xen. Also expose this setting
> > > via sysctl.
> > >
> > > Signed-off-by: Ian Campbell <ian.campbell@...rix.com>
> > > Cc: Eric Dumazet <edumazet@...gle.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
> > > Cc: netdev@...r.kernel.org
> > > Cc: xen-devel@...ts.xen.org
> > > ---
> >
> > I understand your concern, but this seems a quick/dirty hack at this
> > moment. After setting the sysctl to 0, some tasks may still have some
> > order-3 pages in their cache.
>
> Right, the sysctl thing might be overkill, I just figured it was useful
> for debugging. When booting in a Xen VM the patch sets it to zero very
> early on, during setup_arch(), which is before any tasks even exist.
>
> > Your driver must already cope with skb->head being split on several
> > pages.
> >
> > So what fundamental difference exists with frags ?
>
> The issue here is with drivers for physical network devices when running
> under Xen not with the Xen paravirtualised network drivers (AKA
> netback/netfront).
>
> The problem is that pages which are contiguous in the physical address
> space may not be contiguous in the DMA address space. With order>0 pages
> this becomes a problem when you poke down the DMA address and length of
> a compound page into the hardware registers. The DMA address will be
> right for the head of the page but once the hardware steps off the end
> of that it'll get the wrong page.
>
> I don't think this non-contiguousness between physical and DMA addresses
> is specific to Xen, although it is more frequent under Xen than any real
> hardware platform. (Xen has often been a good canary for these sorts of
> issues which turn out later on to impact other arches too.)
>
> In theory this could be fixed in all the drivers for physical network
> devices, but that would be a lot of effort (and probably a fair bit of
> ugliness in the drivers) for a gain which was only relevant to Xen.

I still have concerns about skb->head that you didn't really answer.

Why can skb->head be on order-1 or order-2 pages and still work?

It seems to me it's a driver issue; for example,
drivers/net/xen-netfront.c has assumptions that can be easily fixed.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
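
The contiguity problem described in the quoted text can be sketched in plain userspace C. This is only an illustration under stated assumptions, not kernel or Xen code: the table p2m, its frame numbers, and the helpers pfn_to_mfn(), dma_contiguous() and dma_segments() are all hypothetical names made up for the example. The table maps each guest-physical page frame to the machine (DMA) frame that backs it; a block of 2^order pages can be handed to hardware as one (address, length) pair only if the machine frames behind it are consecutive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PAGES 16

/*
 * Hypothetical guest-physical -> machine (DMA) frame table.
 * Guest frames 0..7 happen to be backed by consecutive machine frames;
 * guest frames 8..15 are scattered, as is likely under Xen.
 */
static const unsigned long p2m[NR_PAGES] = {
    100, 101, 102, 103, 104, 105, 106, 107,
    300,  42, 301,   7, 512, 513,  90,  91,
};

/* Machine (DMA) frame backing a given guest-physical frame. */
static unsigned long pfn_to_mfn(size_t pfn)
{
    assert(pfn < NR_PAGES);
    return p2m[pfn];
}

/*
 * True if the 2^order pages starting at pfn are consecutive in DMA
 * space, i.e. a single (DMA address, length) descriptor covers them.
 */
static bool dma_contiguous(size_t pfn, unsigned int order)
{
    size_t n = (size_t)1 << order;

    assert(pfn + n <= NR_PAGES);
    for (size_t i = 1; i < n; i++)
        if (pfn_to_mfn(pfn + i) != pfn_to_mfn(pfn) + i)
            return false;
    return true;
}

/*
 * Number of DMA-contiguous runs in the block, i.e. how many separate
 * descriptors (or bounce-buffer copies) it would take to cover it.
 */
static size_t dma_segments(size_t pfn, unsigned int order)
{
    size_t n = (size_t)1 << order;
    size_t segs = 1;

    assert(pfn + n <= NR_PAGES);
    for (size_t i = 1; i < n; i++)
        if (pfn_to_mfn(pfn + i) != pfn_to_mfn(pfn + i - 1) + 1)
            segs++;
    return segs;
}
```

With this made-up table, dma_contiguous(0, 3) holds, while dma_contiguous(8, 3) does not and dma_segments(8, 3) reports 6 runs. Any order-0 block is trivially contiguous, which is why requesting order-0 frags sidesteps the problem at the cost of the larger-page benefit.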