Message-ID: <1351085403.6537.102.camel@edumazet-glaptop>
Date: Wed, 24 Oct 2012 15:30:03 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ian Campbell <Ian.Campbell@...rix.com>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: Re: [PATCH] net: allow configuration of the size of page in __netdev_alloc_frag

On Wed, 2012-10-24 at 14:16 +0100, Ian Campbell wrote:
> On Wed, 2012-10-24 at 13:28 +0100, Eric Dumazet wrote:
> > On Wed, 2012-10-24 at 12:42 +0100, Ian Campbell wrote:
> > > The commit 69b08f62e174 "net: use bigger pages in __netdev_alloc_frag"
> > > led to 70%+ packet loss under Xen when transmitting from physical (as
> > > opposed to virtual) network devices.
> > >
> > > This is because under Xen pages which are contiguous in the physical
> > > address space may not be contiguous in the DMA space, in fact it is
> > > very likely that they are not. I think there are other architectures
> > > where this is true, although perhaps not quite so aggressive as to
> > > have this property at a per-order-0-page granularity.
> > >
> > > The real underlying bug here most likely lies in the swiotlb not
> > > correctly handling compound pages, and Konrad is investigating this.
> > > However even with the swiotlb issue fixed the current arrangement
> > > seems likely to result in a lot of bounce buffering which seems likely
> > > to more than offset any benefit from the use of larger pages.
> > >
> > > Therefore make NETDEV_FRAG_PAGE_MAX_ORDER configurable at runtime and
> > > use this to request order-0 frags under Xen. Also expose this setting
> > > via sysctl.
> > >
> > > Signed-off-by: Ian Campbell <ian.campbell@...rix.com>
> > > Cc: Eric Dumazet <edumazet@...gle.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
> > > Cc: netdev@...r.kernel.org
> > > Cc: xen-devel@...ts.xen.org
> > > ---
> >
> > I understand your concern, but this seems a quick/dirty hack at this
> > moment. After setting the sysctl to 0, some tasks may still have some
> > order-3 pages in their cache.
>
> Right, the sysctl thing might be overkill, I just figured it was useful
> for debugging. When booting in a Xen VM the patch sets it to zero very
> early on, during setup_arch(), which is before any tasks even exist.
>
> > Your driver must already cope with skb->head being split on several
> > pages.
> >
> > So what fundamental difference exists with frags ?
>
> The issue here is with drivers for physical network devices when running
> under Xen not with the Xen paravirtualised network drivers (AKA
> netback/netfront).
>
> The problem is that pages which are contiguous in the physical address
> space may not be contiguous in the DMA address space. With order>0 pages
> this becomes a problem when you poke down the DMA address and length of
> a compound page into the hardware registers. The DMA address will be
> right for the head of the page but once the hardware steps off the end
> of that it'll get the wrong page.
>
> I don't think this non-contiguousness between physical and DMA addresses
> is specific to Xen, although it is more frequent under Xen than any real
> hardware platform. (Xen has often been a good canary for these sorts of
> issues which turn out later on to impact other arches too.)
>
> In theory this could be fixed in all the drivers for physical network
> devices, but that would be a lot of effort (and probably a fair bit of
> ugliness in the drivers) for a gain which was only relevant to Xen.

I still have concerns about skb->head that you didn't really answer.

Why can skb->head be on order-1 or order-2 pages and still work?

It seems to me it's a driver issue; for example,
drivers/net/xen-netfront.c makes assumptions that can be easily fixed.
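
To make the contiguity point from the quoted text concrete, here is a minimal user-space sketch in C, not kernel code. The table p2m, the helper names (phys_to_dma, dma_contiguous, dma_segments) and the 4 KiB page size are all assumptions made up for illustration; none of them come from the patch or from the swiotlb code.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint64_t)1 << PAGE_SHIFT)

/*
 * Hypothetical map from pseudo-physical frame number to DMA (machine)
 * frame number. Frames that are adjacent in the physical address space
 * are not necessarily adjacent in DMA space.
 */
static const uint64_t p2m[] = { 7, 8, 3, 12, 13, 14, 2, 9 };

/* Translate a physical address to the DMA address of the same byte. */
static uint64_t phys_to_dma(uint64_t phys)
{
	return (p2m[phys >> PAGE_SHIFT] << PAGE_SHIFT) |
	       (phys & (PAGE_SIZE - 1));
}

/*
 * Return 1 if the physical range [phys, phys + len) is also contiguous
 * in DMA space, 0 otherwise. Assumes len > 0 and that the range lies
 * inside the p2m table.
 */
static int dma_contiguous(uint64_t phys, size_t len)
{
	uint64_t first = phys >> PAGE_SHIFT;
	uint64_t last = (phys + len - 1) >> PAGE_SHIFT;
	uint64_t pfn;

	for (pfn = first; pfn < last; pfn++)
		if (p2m[pfn + 1] != p2m[pfn] + 1)
			return 0;
	return 1;
}

/*
 * Count how many (address, length) pairs a driver would have to program
 * so that each pair covers only DMA-contiguous memory. Same assumptions
 * as dma_contiguous().
 */
static unsigned int dma_segments(uint64_t phys, size_t len)
{
	uint64_t first = phys >> PAGE_SHIFT;
	uint64_t last = (phys + len - 1) >> PAGE_SHIFT;
	unsigned int n = 1;
	uint64_t pfn;

	for (pfn = first; pfn < last; pfn++)
		if (p2m[pfn + 1] != p2m[pfn] + 1)
			n++;
	return n;
}
```

With this made-up table, an order-3 buffer starting at physical address 0 is not DMA-contiguous, so handing the hardware a single (phys_to_dma(0), 8 * PAGE_SIZE) pair would run into the wrong frame after the second page. An order-0 buffer never crosses a frame boundary, so the question does not arise.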