Message-ID: <cc3a78b6-b126-226f-b41a-061716aacd15@arm.com>
Date: Fri, 31 Mar 2023 16:12:32 +0100
From: Robin Murphy <robin.murphy@....com>
To: Arnd Bergmann <arnd@...db.de>, Arnd Bergmann <arnd@...nel.org>,
linux-kernel@...r.kernel.org
Cc: Vineet Gupta <vgupta@...nel.org>,
Russell King <linux@...linux.org.uk>,
Neil Armstrong <neil.armstrong@...aro.org>,
Linus Walleij <linus.walleij@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, guoren <guoren@...nel.org>,
Brian Cain <bcain@...cinc.com>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Michal Simek <monstr@...str.eu>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Dinh Nguyen <dinguyen@...nel.org>,
Stafford Horne <shorne@...il.com>,
Helge Deller <deller@....de>,
Michael Ellerman <mpe@...erman.id.au>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Rich Felker <dalias@...c.org>,
John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
"David S . Miller" <davem@...emloft.net>,
Max Filippov <jcmvbkbc@...il.com>,
Christoph Hellwig <hch@....de>,
"Lad, Prabhakar" <prabhakar.mahadev-lad.rj@...renesas.com>,
"Conor.Dooley" <conor.dooley@...rochip.com>,
linux-snps-arc@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org,
"linux-oxnas@...ups.io" <linux-oxnas@...ups.io>,
"linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>,
linux-hexagon@...r.kernel.org, linux-m68k@...ts.linux-m68k.org,
linux-mips@...r.kernel.org,
"linux-openrisc@...r.kernel.org" <linux-openrisc@...r.kernel.org>,
linux-parisc@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-riscv@...ts.infradead.org, linux-sh@...r.kernel.org,
sparclinux@...r.kernel.org, linux-xtensa@...ux-xtensa.org
Subject: Re: [PATCH 20/21] ARM: dma-mapping: split out arch_dma_mark_clean() helper

On 31/03/2023 3:00 pm, Arnd Bergmann wrote:
> On Mon, Mar 27, 2023, at 14:48, Robin Murphy wrote:
>> On 2023-03-27 13:13, Arnd Bergmann wrote:
>>>
>>> [ HELP NEEDED: can anyone confirm that it is a correct assumption
>>> on arm that a cache-coherent device writing to a page always results
>>> in it being in a PG_dcache_clean state like on ia64, or can a device
>>> write directly into the dcache?]
>>
>> In AMBA at least, if a snooping write hits in a cache then the data is
>> most likely going to get routed directly into that cache. If it has
>> write-back write-allocate attributes it could also land in any cache
>> along its normal path to RAM; it wouldn't have to go all the way.
>>
>> Hence all the fun we have where treating a coherent device as
>> non-coherent can still be almost as broken as the other way round :)
>
> Ok, thanks for the information. I'm still not sure whether this can
> result in the situation where PG_dcache_clean is wrong though.
>
> Specifically, the question is whether a DMA to a coherent buffer
> can end up in a dirty L1 dcache of one core and require a dcache
> writeback before invalidating the icache for that page.
>
> On ia64, this is not the case, the optimization here is to
> only flush the icache after a coherent DMA into an executable
> user page, while Arm only does this for noncoherent DMA but not
> coherent DMA.
>
> From your explanation it sounds like this might happen,
> even though that would mean that "coherent" DMA is slightly
> less coherent than it is elsewhere.
>
> To be on the safe side, I'd have to pass a flag into
> arch_dma_mark_clean() about coherency, to let the arm
> implementation still require the extra dcache flush
> for coherent DMA, while ia64 can ignore that flag.

Coherent DMA on Arm is assumed to be inner-shareable, so a coherent DMA
write should be pretty much equivalent to a coherent write by another
CPU (or indeed the local CPU itself) - nothing says that it *couldn't*
dirty a line in a data cache above the level of unification, so in
general the assumption must be that, yes, if coherent DMA is writing
data intended to be executable, then it's going to want a Dcache clean
to PoU and an Icache invalidate to PoU before trying to execute it. By
comparison, a non-coherent DMA transfer will inherently have to
invalidate the Dcache all the way to PoC in its dma_unmap, thus cannot
leave dirty data above the PoU, so only the Icache maintenance is
required in the executable case.
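To make the two cases concrete, here is a minimal sketch in plain C of the rule described above. It is not kernel code: exec_maint_after_dma_write() and the MAINT_* flags are hypothetical names invented purely for illustration, and the function only models which *extra* maintenance is needed before executing from a page that DMA has just written, on top of whatever the DMA API already did.

```c
#include <stdbool.h>

/* Hypothetical flags, not kernel API: extra maintenance needed before
 * the CPU may execute from a page that a device has just written. */
enum maint_op {
	MAINT_NONE             = 0,
	MAINT_DCACHE_CLEAN_POU = 1 << 0, /* clean (write back) Dcache to PoU */
	MAINT_ICACHE_INV_POU   = 1 << 1, /* invalidate Icache to PoU */
};

/* Hypothetical helper modelling the rule from the paragraph above. */
static unsigned int exec_maint_after_dma_write(bool dma_coherent,
					       bool executable)
{
	/* Data that will never be executed needs nothing extra here. */
	if (!executable)
		return MAINT_NONE;

	/*
	 * A coherent DMA write behaves like a write from another CPU, so
	 * it may have left a dirty line in a data cache above the PoU:
	 * clean the Dcache first, then invalidate the Icache.
	 */
	if (dma_coherent)
		return MAINT_DCACHE_CLEAN_POU | MAINT_ICACHE_INV_POU;

	/*
	 * Non-coherent DMA: dma_unmap has already invalidated the Dcache
	 * to PoC, so no dirty data can remain above the PoU and only the
	 * Icache invalidate is left to do.
	 */
	return MAINT_ICACHE_INV_POU;
}
```

So exec_maint_after_dma_write(true, true) asks for both operations, while the non-coherent executable case asks only for the Icache invalidate.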

(FWIW I believe the Armv8 IDC/DIC features can safely be considered
irrelevant to 32-bit kernels)

I don't know a great deal about IA-64, but it appears to be using its
PG_arch_1 flag in a subtly different manner to Arm, namely to optimise
out the *Icache* maintenance. So if anything, it seems IA-64 is the
weirdo here (who'd have guessed?) where DMA manages to be *more*
coherent than the CPUs themselves :)

This is all now making me think we need some careful consideration of
whether the benefits of consolidating code outweigh the confusion of
conflating multiple different meanings of "clean" together...

Thanks,
Robin.