Message-ID: <alpine.DEB.2.02.1312011659500.3198@kaball.uk.xensource.com>
Date: Sun, 1 Dec 2013 17:06:00 +0000
From: Stefano Stabellini <stefano.stabellini@...citrix.com>
To: James Bottomley <James.Bottomley@...senPartnership.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Ian Jackson <Ian.Jackson@...citrix.com>,
<netdev@...r.kernel.org>, Michael Chan <mchan@...adcom.com>,
<dl-mptfusionlinux@....com>, <linux-scsi@...r.kernel.org>,
<support@....com>, Sreekanth Reddy <Sreekanth.Reddy@....com>,
Nagalakshmi Nandigama <Nagalakshmi.Nandigama@....com>,
<xen-devel@...ts.xenproject.org>, <linux-kernel@...r.kernel.org>
Subject: Re: "swiotlb buffer is full" with 3.13-rc1+ but not 3.4.
On Sat, 30 Nov 2013, James Bottomley wrote:
> On Sat, 2013-11-30 at 13:56 -0500, Konrad Rzeszutek Wilk wrote:
> > My theory is that the SWIOTLB is not full - it is just that the request
> > is for a compound page larger than 512kB. Please note that the largest
> > "chunk" of buffer the SWIOTLB can deal with is 512kB.
> >
> > And that of course raises the question - why would it try to bounce
> > buffer it at all. In Xen the answer is simple - the sg chunks cross page
> > boundaries, which means that they are not physically contiguous - so we
> > have to use the bounce buffer. It would be better if the sg list
> > provided a large list of 4KB pages instead of compound pages, as that
> > could help in avoiding the bounce buffer.
> >
> > But I digress - this is a theory - I don't know whether the SCSI layer
> > does any coalescing of the sg list - and if so, whether there is any
> > easy knob to tell it not to do it.
>
> Well, SCSI doesn't, but block does. It's actually an efficiency thing
> since most firmware descriptor formats cope with multiple pages and the
> more descriptors you have for a transaction, the more work the on-board
> processor on the HBA has to do. If you have an emulated HBA, like
> virtio, you could turn off physical coalescing by setting the
> use_clustering flag to DISABLE_CLUSTERING. But you can't do that for a
> real card. I assume the problem here is that the host is passing the
> card directly to the guest and the guest clusters based on its idea of
> guest pages which don't map to contiguous physical pages?
>
> The way you tell how many physically contiguous pages block is willing
> to merge is by looking at /sys/block/<dev>/queue/max_segment_size: if
> that's 4k then it won't merge; if it's greater than 4k, then it will.
>
> I'm not quite sure what to do ... you can't turn off clustering globally
> in the guest because the virtio drivers use it to reduce ring descriptor
> pressure. What you probably want is some way to flag a pass-through
> device.
Given that we don't use virtio on Xen, we could actually turn off
clustering globally (if we are running on Xen).
In fact, for example, BIOVEC_PHYS_MERGEABLE is defined as:
+#define BIOVEC_PHYS_MERGEABLE(vec1, vec2) \
+ (__BIOVEC_PHYS_MERGEABLE(vec1, vec2) && \
+ (!xen_domain() || xen_biovec_phys_mergeable(vec1, vec2)))
so that we can disable merging when the two bv_page are not actually
physically contiguous.