lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231128133702.GA9917@lst.de>
Date:   Tue, 28 Nov 2023 14:37:02 +0100
From:   Christoph Hellwig <hch@....de>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     Christoph Hellwig <hch@....de>,
        Hamza Mahfooz <someguy@...ective-light.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Andrew <travneff@...il.com>,
        Ferry Toth <ferry.toth@...inga.info>,
        Andy Shevchenko <andy.shevchenko@...il.com>,
        Thorsten Leemhuis <regressions@...mhuis.info>,
        iommu@...ts.linux.dev,
        Kernel development list <linux-kernel@...r.kernel.org>,
        USB mailing list <linux-usb@...r.kernel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Robin Murphy <robin.murphy@....com>,
        Will Deacon <will@...nel.org>
Subject: Re: Bug in add_dma_entry()'s debugging code

On Mon, Nov 27, 2023 at 11:51:21AM -0500, Alan Stern wrote:
> The buffers in the bug report were allocated by kmalloc().  Doesn't 
> kmalloc() guarantee that on architectures with non-cache-coherent DMA, 
> allocations will be aligned on cache-line boundaries (or larger)?  Isn't 
> this what ARCH_DMA_MINALIGN and ARCH_KMALLOC_MINALIGN are supposed to 
> take care of in include/linux/slab.h?

Oh.  Yes, the variable alignment from kmalloc make things complicated.

> 	Architectures must ensure that kmalloc'ed buffer is
> 	DMA-safe. Drivers and subsystems depend on it. If an architecture
> 	isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in
> 	the CPU cache is identical to data in main memory),
> 	ARCH_DMA_MINALIGN must be set so that the memory allocator
> 	makes sure that kmalloc'ed buffer doesn't share a cache line with
> 	the others. See arch/arm/include/asm/cache.h as an example.
> 
> It says nothing about avoiding more than one DMA operation at a time to 
> prevent overlap.  Is the documentation wrong?

The documentation is a bit lacking unfortunately.  Again, assuming
you actually have non-coherent mappings you'd easily break the fragile
cache line ownership if you did.  That doesn't apply to x86 as-is, but
we really try to keep drivers portable.

> >  This might not have an
> > effect on the particular plaform you are currently running on, but it
> > is still wrong.
> 
> Who decides what is right and what is wrong in this area?  Clearly you 
> have a different opinion from David S. Miller, Richard Henderson, and 
> Jakub Jelinek (the authors of that documentation file).

I don't think this about anyone's opinion. It's a fundamental propery of
how to manage caches in the face of non-coherent DMA.

> 
> >  Note that there have been various mumblings about
> > using nosnoop dma on x86, in which case you'll have the issue there
> > as well.
> 
> Unless the people who implement nosnoop DMA also modify kmalloc() or 
> ARCH_DMA_MINALIGN.

Or don't do it on kmalloc buffers.

> I guess the real question boils down to where the responsiblity should 
> lie.  Should kmalloc() guarantee that the memory regions it provides 
> will always be suitable for DMA, with no overlap issues?  Or should all 
> the innumerable users of kmalloc() be responsible for jumping through 
> hoops to request arch-specific minimum alignment for their DMA buffers?

I'd actually go one step back:

 1) for not cache coherent DMA you can't do overlapping operations inside
    a cache line
 2) dma-debug is intended to find DMA API misuses, even if they don't
    have bad effects on your particular system
 3) the fact that the kmalloc implementation returns differently aligned
    memory depending on the platform breaks how dma-debug works currently

The logical confcusion from that would be that IFF dma-debug is enabled on
any platform we need to set ARCH_DMA_MINALIGN to the cache line size.

BUT:  we're actually reduzing our dependency on ARCH_DMA_MINALIGN by
moving to bounce buffering unaligned memory for non-coherent
architectures, which makes this even more complicated.  Right now I
don't have a good idea how to actually deal with having the cachline
sanity checks with that, but I'm Ccing some of the usual suspects if
they have a better idea.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ