lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 06 Jul 2020 16:09:36 +0200
From:   Nicolas Saenz Julienne <nsaenzjulienne@...e.de>
To:     David Rientjes <rientjes@...gle.com>,
        Robin Murphy <robin.murphy@....com>
Cc:     Jeremy Linton <jeremy.linton@....com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>, linux-mm@...ck.org,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        Christoph Hellwig <hch@....de>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        linux-rpi-kernel <linux-rpi-kernel@...ts.infradead.org>
Subject: Re: [BUG] XHCI getting ZONE_DMA32 memory > than its bus_dma_limit

On Sun, 2020-07-05 at 16:41 -0700, David Rientjes wrote:
> On Fri, 3 Jul 2020, Robin Murphy wrote:
> 
> > > Just for the record the offending commit is: c84dc6e68a1d2 ("dma-pool: add
> > > additional coherent pools to map to gfp mask").
> > > 
> > > On Thu, 2020-07-02 at 12:49 -0500, Jeremy Linton wrote:
> > > > Hi,
> > > > 
> > > > Using 5.8rc3:
> > > > 
> > > > The rpi4 has a 3G dev->bus_dma_limit on its XHCI controller. With a usb3
> > > > hub, plus a few devices plugged in, randomly devices will fail
> > > > operations. This appears to because xhci_alloc_container_ctx() is
> > > > getting buffers > 3G via dma_pool_zalloc().
> > > > 
> > > > Tracking that down, it seems to be caused by dma_alloc_from_pool() using
> > > > dev_to_pool()->dma_direct_optimal_gfp_mask() to "optimistically" select
> > > > the atomic_pool_dma32 but then failing to verify that the allocations in
> > > > the pool are less than the dev bus_dma_limit.
> > > 
> > > I can reproduce this too.
> > > 
> > > The way I see it, dev_to_pool() wants a strict
> > > dma_direct_optimal_gfp_mask()
> > > that is never wrong, since it's going to stick to that pool for the
> > > device's
> > > lifetime. I've been looking at how to implement it, and it's not so
> > > trivial
> > > as
> > > I can't see a failproof way to make a distinction between who needs DMA32
> > > and
> > > who is OK with plain KERNEL memory.
> > > 
> > > Otherwise, as Jeremy points out, the patch needs to implement allocations
> > > with
> > > an algorithm similar to __dma_direct_alloc_pages()'s, which TBH I don't
> > > know
> > > if
> > > it's a little overkill for the atomic context.
> > > 
> > > Short of finding a fix in the coming rc's, I suggest we revert this.
> > 
> > Or perhaps just get rid of atomic_pool_dma32 (and allocate atomic_pool_dma
> > from ZONE_DMA32 if !ZONE_DMA). That should make it fall pretty much back in
> > line while still preserving the potential benefit of the kernel pool for
> > non-address-constrained devices.
> > 
> 
> I assume it depends on how often we have devices where 
> __dma_direct_alloc_pages() behavior is required, i.e. what requires the 
> dma_coherent_ok() checks and altering of the gfp flags to get memory that 
> works.
> 
> Is the idea that getting rid of atomic_pool_dma32 would use GFP_KERNEL 
> (and atomic_pool_kernel) as the default policy here?  That doesn't do any 
> dma_coherent_ok() checks so dma_direct_alloc_pages would return from 
> ZONE_NORMAL without a < 3G check?

IIUC this is not what Robin proposes.

The idea is to only have one DMA pool, located in ZONE_DMA, if enabled, and
ZONE_DMA32 otherwise. This way you're always sure the memory is going to be
good enough for any device while maintaining the benefits of
atomic_pool_kernel.

> It *seems* like we want to check if dma_coherent_ok() succeeds for ret in 
> dma_direct_alloc_pages() when allocating from the atomic pool and, based 
> on criteria that allows fallback, just fall into 
> __dma_direct_alloc_pages()?

I suspect I don't have enough perspective here but, isn't that breaking the
point of having an atomic pool? Wouldn't that generate big latency spikes? I
can see how audio transfers over USB could be affected by this specifically,
IIRC those are allocated atomically and have timing constraints.

That said, if Robin solution works for you, I don't mind having a go at it.

Regards,
Nicolas


Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)

Powered by blists - more mailing lists