lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Jan 2019 09:02:38 +0100
From:   Ard Biesheuvel <ard.biesheuvel@...aro.org>
To:     Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:     "Koenig, Christian" <Christian.Koenig@....com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Will Deacon <will.deacon@....com>,
        Michel Dänzer <michel@...nzer.net>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Carsten Haitzler <Carsten.Haitzler@....com>,
        David Airlie <airlied@...ux.ie>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        "Huang, Ray" <Ray.Huang@....com>,
        "Zhang, Jerry" <Jerry.Zhang@....com>,
        linux-arm-kernel <linux-arm-kernel@...ts.infradead.org>,
        Bernhard Rosenkränzer 
        <Bernhard.Rosenkranzer@...aro.org>
Subject: Re: [RFC PATCH] drm/ttm: force cached mappings for system RAM on ARM

On Thu, 17 Jan 2019 at 07:07, Benjamin Herrenschmidt
<benh@...nel.crashing.org> wrote:
>
> On Wed, 2019-01-16 at 08:47 +0100, Ard Biesheuvel wrote:
> > > As far as I know on x86 it doesn't, so when you have an un-cached page
> > > you can still access it with a snooping DMA read/write operation and
> > > don't cause trouble.
> > >
> >
> > I think it is the other way around. The question is, on an otherwise
> > cache coherent device, whether the NoSnoop attribute set by the GPU
> > propagates all the way to the bus so that it bypasses the caches.
>
> On powerpc it's ignored, all DMA accesses will be snooped. But that's
> fine regardless of whether the memory was mapped cachable or not, the
> snooper will simply not find anything if not. I *think* we only do
> cache inject if the line already exists in one of the caches.
>

Others should correct me if I am wrong, but arm64 SoCs often have L3
system caches, and I would expect inbound transactions with writeback
write-allocate (WBWA) attributes to allocate there.

> > On x86, we can tolerate if this is not the case, since uncached memory
> > accesses by the CPU snoop the caches as well.
> >
> > On other architectures, uncached accesses go straight to main memory,
> > so if the device wrote anything to the caches we won't see it.
>
> Well, on all powerpc implementations that I am aware of at least (dunno
> about ARM), they do, but we don't have a problem because I don't think
> the devices can/will write to the caches directly unless a
> corresponding line already exists (but I might be wrong, we need to
> double check all implementations which is tricky).
>
> I am not aware of any powerpc chip implementing NoSnoop.
>

Do you have any history on why this optimization is disabled for power
unless CONFIG_NOT_CACHE_COHERENT is set?

That also begs the question how any of this is supposed to work with
non-cache coherent DMA. The code appears to always assume cache
coherent, and provide non-cache coherent as an optimization if
dma_arch_can_wc_memory() returns true. So I wonder if that helper
should take a struct device pointer instead, and return true for
non-cache coherent devices.

> > So to use this optimization, you have to either be 100% sure that
> > NoSnoop is implemented correctly, or have a x86 CPU.
> >
> > > > The old hack of using non-cached mapping to avoid snoop cost in AGP and
> > > > others is just that ... an ugly and horrible hacks that should have
> > > > never eventuated, when the search for performance pushes HW people into
> > > > utter insanity :)
> > >
> > > Well I agree that un-cached system memory makes things much more
> > > complicated for a questionable gain.
> > >
> > > But fact is we now have to deal with the mess, so no point in
> > > complaining about it to much :)
> > >
> >
> > Indeed. I wonder if we should just disable it altogether unless CONFIG_X86=y
>
> The question is whether DMA from a device can instanciate cache lines
> in your system. This a system specific rather than architecture
> specific question I suspect...
>

The ARM architecture permits it, afaict, and write-allocate is a hint
so the implementation is free to ignore it, whether it is set or
cleared.

Powered by blists - more mailing lists