Message-ID: <CALdTtnvUzoPLmgghRHb+gNOkivi3H7rhAaL96gLhkwOyK-ycWA@mail.gmail.com>
Date: Tue, 23 Apr 2019 18:39:50 -0600
From: dann frazier <dann.frazier@...onical.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Christoph Hellwig <hch@....de>,
Marek Szyprowski <m.szyprowski@...sung.com>,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
Xinwei Kong <kong.kongxinwei@...ilicon.com>
Subject: Re: [RFC] arm64: swiotlb: cma_alloc error spew

On Tue, Apr 23, 2019 at 12:03 PM dann frazier
<dann.frazier@...onical.com> wrote:
>
> On Tue, Apr 23, 2019 at 5:32 AM Robin Murphy <robin.murphy@....com> wrote:
> >
> > On 17/04/2019 21:48, dann frazier wrote:
> > > hey,
> > > I'm seeing an issue on a couple of arm64 systems[*] where they spew
> > > ~10K "cma: cma_alloc: alloc failed" messages at boot. The errors are
> > > non-fatal, and bumping up cma to a large enough size (~128M) gets rid
> > > of them - but that seems suboptimal. Bisection shows that this started
> > > after commit fafadcd16595 ("swiotlb: don't dip into swiotlb pool for
> > > coherent allocations"). It looks like __dma_direct_alloc_pages()
> > > is opportunistically using CMA memory but falls back to non-CMA if CMA
> > > is disabled or unavailable. I've demonstrated that this fallback is
> > > indeed returning a valid pointer. So perhaps the issue is really just
> > > the warning emission.
> >
> > The CMA area being full isn't necessarily an ignorable non-problem,
> > since it means you won't be able to allocate the kind of large buffers
> > for which CMA was intended. The question is, is it actually filling up
> > with allocations that deserve to be there, or is this the same as I've
> > seen on a log from a ThunderX2 system where it's getting exhausted by
> > thousands upon thousands of trivial single page allocations? If it's the
> > latter (CONFIG_CMA_DEBUG should help shed some light if necessary),
>
> Appears so. Here's a histogram of count/size w/ a cma= large enough to
> avoid failures:
>
> $ dmesg | grep "cma: cma_alloc(cma" | sed -r 's/.*count ([0-9]+)\,.*/\1/' | sort -n | uniq -c
> 2062 1
> 32 2
> 266 8
> 2 24
> 4 32
> 256 33

And IIUC, these 256 allocations of 33 pages each are also a big culprit.
The debugfs bitmap seems to show that the alignment of each one leaves
31 pages unused, which adds up to 31MB!
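
A rough sketch of that arithmetic, assuming 4K pages and that each
allocation gets aligned up to the next power-of-two page count (so a
33-page request lands on a 64-page boundary). The helper names are made
up for illustration and are not kernel functions:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def alignment_pages(count):
    """Smallest power of two >= count (assumed order-based alignment)."""
    align = 1
    while align < count:
        align *= 2
    return align


def wasted_bytes(n_allocs, count, page_size=PAGE_SIZE):
    """Bytes left unused if every allocation pads out to its alignment."""
    return n_allocs * (alignment_pages(count) - count) * page_size


# 256 allocations of 33 pages each, taken from the histogram
print(wasted_bytes(256, 33) // (1024 * 1024), "MiB")
```

With the numbers from the histogram this prints 31 MiB.
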
-dann

> 7 64
> 2 128
> 2 1024
>
> -dann
>
> > then
> > that does lean towards spending a bit more effort on this idea:
> >
> > https://lore.kernel.org/lkml/20190327080821.GB20336@lst.de/
> >
> > Robin.
> >
> > > The following naive patch solves the problem for me - just silence the
> > > cma errors, since it looks like a soft error. But is there a better
> > > approach?
> > >
> > > [*] APM X-Gene & HiSilicon Hi1620 w/ SMMU disabled
> > >
> > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > > index 6310ad01f915b..0324aa606c173 100644
> > > --- a/kernel/dma/direct.c
> > > +++ b/kernel/dma/direct.c
> > > @@ -112,7 +112,7 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > >  	/* CMA can be used only in the context which permits sleeping */
> > >  	if (gfpflags_allow_blocking(gfp)) {
> > >  		page = dma_alloc_from_contiguous(dev, count, page_order,
> > > -						 gfp & __GFP_NOWARN);
> > > +						 true);
> > >  		if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > >  			dma_release_from_contiguous(dev, page, count);
> > >  			page = NULL;
> > >
> > >
> > >
> > >