Date:   Sun, 27 Mar 2022 12:23:34 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     David Laight <David.Laight@...lab.com>
Cc:     Halil Pasic <pasic@...ux.ibm.com>,
        Toke Høiland-Jørgensen <toke@...e.dk>,
        Robin Murphy <robin.murphy@....com>,
        Christoph Hellwig <hch@....de>,
        Maxime Bizon <mbizon@...ebox.fr>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Kalle Valo <kvalo@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Olha Cherevyk <olha.cherevyk@...il.com>,
        iommu <iommu@...ts.linux-foundation.org>,
        linux-wireless <linux-wireless@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        stable <stable@...r.kernel.org>
Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break
 ath9k-based AP

On Sun, Mar 27, 2022 at 8:24 AM David Laight <David.Laight@...lab.com> wrote:
>
> Aren't bounce buffers just a more extreme case on non-coherent
> memory accesses?

No.

In fact, this whole change came about exactly because bounce buffers
are different.

The difference is that bounce buffers have that (wait for it) bounce
buffer, which can have stale contents.
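
That hidden second copy is easy to model in plain userspace C (a toy
sketch, not kernel code; the struct and function names here are made
up for illustration): a mapping that went through a bounce buffer
effectively carries *two* buffers whose contents can diverge, unlike
the non-coherent-cache case, where there is one memory location and
only the CPU's view of it can be stale.

```c
#include <string.h>

/* Toy model (not kernel code) of a streaming DMA mapping that went
 * through swiotlb: the device only ever sees the bounce buffer, so
 * there are two copies of the data that can go stale independently. */
struct toy_mapping {
	char orig[16];		/* the driver's real buffer (CPU side) */
	char bounce[16];	/* the hidden swiotlb copy (device side) */
};

/* Returns nonzero when the two views disagree, i.e. when somebody
 * needs a sync before the "other side" can validly look at the data. */
int toy_views_differ(const struct toy_mapping *m)
{
	return memcmp(m->orig, m->bounce, sizeof(m->orig)) != 0;
}
```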

> They just need explicit memory copies rather than just cache
> writeback and invalidate operations.

That's the thing - the memory copies introduce entirely new issues.

I really think that instead of making up abstract rules ("ownership"
or "bounce buffers are just extreme cases of non-coherency") we should
make the rules very much practical and down to earth, and write out
exactly what they *do*.

The whole "sync DMA" is odd and abstract enough as a concept on its
own, we shouldn't then make the rules for it odd and abstract. We
should make them very very practical.

So I will propose that we really make it very much about practical
concerns, and we document things as

 (a) the "sync" operation has by definition a "whose _future_ access
do we sync for" notion.

     So "dma_sync_single_for_cpu()" says that the future accesses to
this dma area are by the CPU.

     Note how it does *NOT* say that the "CPU owns the area". That's
bullsh*t, and we now know it's BS.

 (b) but the sync operation also has a "who wrote the data we're
syncing" notion.

     Note that this is *not* "who accessed or owned it last", because
that's nonsensical: if we're syncing for the CPU, then the only reason
to do so is because we expect that the last access was by the device,
so specifying that separately would just be redundant and stupid.

     But specifying who *wrote* to the area is meaningful and matters.
It matters for the non-coherent cache case (because of writeback
issues), but it also matters for the bounce buffer case (because it
determines which way we should copy).

Note how this makes sense: a "sync" operation is clearly about taking
some past state, and making it palatable for a future use. The past
state is pretty much defined by who wrote the data, and then we can
use that and the "the next thing to access it" to determine what we
need to do about the sync.

It is *NOT* about "ownership".

So let's go through the cases, and I'm going to ignore the "CPU caches
are coherent with device DMA" case because that's always going to be a
no-op wrt data movement (but it will still generally need a memory
barrier, which I will mention here and then ignore going forward).

Syncing for *CPU* accesses (ie dma_sync_single_for_cpu()) has four
choices I can see:

 - nobody wrote the data at all (aka DMA_NONE).

   This is nonsensical and should warn. If nobody wrote to it, why
would the CPU ever validly access it?

   Maybe you should have written "memset(buffer, 0, size)" instead?

 - the CPU wrote the data in the first place (aka DMA_TO_DEVICE)

   This is a no-op (possibly a memory barrier), because even stale CPU
caches are fine, and even if it was in a bounce buffer, the original
CPU-side data is fine.

 - the device wrote the data (aka DMA_FROM_DEVICE)

   This is just the regular case of a device having written something,
and the CPU wants to see it.

   It obviously needs real work, but it's simple and straightforward.

   For non-coherent caches, it needs a cache invalidate. For a bounce
buffer, it needs a copy from the bounce buffer to the "real" buffer.

 - it's not clear who wrote the data (aka DMA_BIDIRECTIONAL)

   This is not really doable for a bounce buffer - we just don't know
which buffer contents are valid.

   I think it's very very questionable for non-coherent caches too,
but "writeback and invalidate" probably can't hurt.

   So probably warn about it, and do whatever we used to do historically.
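
For the bounce-buffer half, the four for-CPU cases above can be
sketched as a toy userspace decision table (again, not kernel code:
`toy_sync_for_cpu` and the struct are made up for illustration,
though the `dma_data_direction` values match the kernel's enum):

```c
#include <string.h>

/* Mirrors the kernel's enum dma_data_direction values. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Toy model of a swiotlb-backed mapping: real buffer + hidden copy. */
struct toy_mapping {
	char orig[16];		/* the driver's real buffer (CPU side) */
	char bounce[16];	/* the hidden swiotlb copy (device side) */
};

/* Toy bounce-buffer version of dma_sync_single_for_cpu(): the CPU is
 * about to access the data, and "dir" says who wrote it.  Returns a
 * string naming the action, so the table above is easy to check. */
const char *toy_sync_for_cpu(struct toy_mapping *m,
			     enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_NONE:
		/* Nobody wrote it: a CPU read makes no sense. */
		return "warn: nobody wrote the data";
	case DMA_TO_DEVICE:
		/* CPU wrote it: the original buffer is already right. */
		return "no-op";
	case DMA_FROM_DEVICE:
		/* Device wrote it: copy bounce -> real buffer. */
		memcpy(m->orig, m->bounce, sizeof(m->orig));
		return "copy bounce to orig";
	case DMA_BIDIRECTIONAL:
	default:
		/* We can't know which copy holds the valid data. */
		return "warn: ambiguous, do historical thing";
	}
}
```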

Syncing for device accesses (ie dma_sync_single_for_device()) also has
the same four choices I can see, but obviously does different things:

 - nobody wrote the data at all (aka DMA_NONE)

   This sounds as nonsensical as the CPU case, but maybe isn't.

   We may not have any previous explicit writes, but we *do* have that
"insecure and possibly stale buffer contents" bounce buffer thing on
the device side.

   So with a bounce buffer, it's actually somewhat sane to say
"initialize the bounce buffer to a known state".

   Because bounce buffers *are* special. Unlike even the "noncoherent
caches" issue, they have that entirely *other* hidden state in the
form of the bounce buffer itself.

   Discuss.

 - the CPU wrote the data in the first place (aka DMA_TO_DEVICE).

   This is the regular and common case of "we have data on the CPU
side that is written to the device".

   Again, needs work, but is simple and straightforward.

   For non-coherent caches, we need a writeback on the CPU. For a
bounce buffer, we need to copy from the regular buffer to the bounce
buffer.

 - the device wrote the data in the first place (aka DMA_FROM_DEVICE)

   This is the case that we hit on ath9k. It's *not* obvious, but when
we write it out this way, I really think the semantics are pretty
clear.

   For non-coherent caches, we may need an "invalidate". For a bounce
buffer, it's a no-op (because the bounce buffer already contains the
data)

 - it's not clear who wrote the data (aka DMA_BIDIRECTIONAL)

   This is again not really doable for a bounce buffer. We don't know
which buffer contains the right data, so we should warn about it and
do whatever we used to do historically.

   Again, it's very questionable for non-coherent caches too, but
"writeback and invalidate" probably at least can't hurt.
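
The for-device side of the bounce-buffer table can be sketched the
same way (a toy userspace model, not kernel code; `toy_sync_for_device`
and the struct are made up, the `dma_data_direction` values match the
kernel's enum).  Note how DMA_FROM_DEVICE is the no-op here, which is
exactly the ath9k case:

```c
#include <string.h>

/* Mirrors the kernel's enum dma_data_direction values. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Toy model of a swiotlb-backed mapping: real buffer + hidden copy. */
struct toy_mapping {
	char orig[16];		/* the driver's real buffer (CPU side) */
	char bounce[16];	/* the hidden swiotlb copy (device side) */
};

/* Toy bounce-buffer version of dma_sync_single_for_device(): the
 * device is about to access the data, and "dir" says who wrote it. */
const char *toy_sync_for_device(struct toy_mapping *m,
				enum dma_data_direction dir)
{
	switch (dir) {
	case DMA_NONE:
		/* Arguably sane here: scrub the hidden device-side state. */
		memset(m->bounce, 0, sizeof(m->bounce));
		return "init bounce to known state";
	case DMA_TO_DEVICE:
		/* CPU wrote it: copy real buffer -> bounce. */
		memcpy(m->bounce, m->orig, sizeof(m->bounce));
		return "copy orig to bounce";
	case DMA_FROM_DEVICE:
		/* Device wrote it: the bounce buffer already has the data. */
		return "no-op";
	case DMA_BIDIRECTIONAL:
	default:
		/* We can't know which copy holds the valid data. */
		return "warn: ambiguous, do historical thing";
	}
}
```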

So hey, that's my thinking. The whole "ownership" model is, I think,
obviously untenable.

But just going through and listing the different cases and making them
explicit I think explains exactly what the different situations are,
and that then makes it fairly clear what the different combinations
should do.

No?

           Linus
