[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220324190216.0efa067f.pasic@linux.ibm.com>
Date: Thu, 24 Mar 2022 19:02:16 +0100
From: Halil Pasic <pasic@...ux.ibm.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Oleksandr Natalenko <oleksandr@...alenko.name>,
Christoph Hellwig <hch@....de>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Toke Høiland-Jørgensen
<toke@...e.dk>, Kalle Valo <kvalo@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Olha Cherevyk <olha.cherevyk@...il.com>,
iommu <iommu@...ts.linux-foundation.org>,
linux-wireless <linux-wireless@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable <stable@...r.kernel.org>,
Halil Pasic <pasic@...ux.ibm.com>
Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break
ath9k-based AP
On Wed, 23 Mar 2022 20:54:08 +0000
Robin Murphy <robin.murphy@....com> wrote:
> On 2022-03-23 19:16, Linus Torvalds wrote:
> > On Wed, Mar 23, 2022 at 12:06 PM Robin Murphy <robin.murphy@....com> wrote:
> >>
> >> On 2022-03-23 17:27, Linus Torvalds wrote:
> >>>
> >>> I'm assuming that the ath9k issue is that it gives DMA mapping a big
> >>> enough area to handle any possible packet size, and just expects -
> >>> quite reasonably - smaller packets to only fill the part they need.
> >>>
> >>> Which that "info leak" patch obviously breaks entirely.
> >>
> >> Except that's the exact case which the new patch is addressing
> >
> > Not "addressing". Breaking.
> >
> > Which is why it will almost certainly get reverted.
> >
> > Not doing DMA to the whole area seems to be quite the sane thing to do
> > for things like network packets, and overwriting the part that didn't
> > get DMA'd with zeroes seems to be exactly the wrong thing here.
> >
> > So the SG_IO - and other random untrusted block command sources - data
> > leak will almost certainly have to be addressed differently. Possibly
> > by simply allocating the area with GFP_ZERO to begin with.
>
> Er, the point of the block layer case is that whole area *is* zeroed to
> begin with, and a latent memory corruption problem in SWIOTLB itself
> replaces those zeros with random other kernel data unexpectedly. Let me
> try illustrating some sequences for clarity...
>
> Expected behaviour/without SWIOTLB:
> Memory
> ---------------------------------------------------
> start 12345678
> dma_map(DMA_FROM_DEVICE) no-op
> device writes partial data 12ABC678 <- ABC
> dma_unmap(DMA_FROM_DEVICE) 12ABC678
>
>
> SWIOTLB previously:
> Memory Bounce buffer
> ---------------------------------------------------
> start 12345678 xxxxxxxx
> dma_map(DMA_FROM_DEVICE) no-op
> device writes partial data 12345678 xxABCxxx <- ABC
> dma_unmap(DMA_FROM_DEVICE) xxABCxxx <- xxABCxxx
>
>
> SWIOTLB Now:
> Memory Bounce buffer
> ---------------------------------------------------
> start 12345678 xxxxxxxx
> dma_map(DMA_FROM_DEVICE) 12345678 -> 12345678
> device writes partial data 12345678 12ABC678 <- ABC
> dma_unmap(DMA_FROM_DEVICE) 12ABC678 <- 12ABC678
>
>
> Now, sure we can prevent any actual information leakage by initialising
> the bounce buffer slot with zeros, but then we're just corrupting the
> not-written-to parts of the mapping with zeros instead of anyone else's
> old data. That's still fundamentally not OK. The only thing SWIOTLB can
> do to be correct is treat DMA_FROM_DEVICE as a read-modify-write of the
> entire mapping, because it has no way to know how much of it is actually
> going to be modified.
>
Very nice explanation! Thanks!
> I'll admit I still never quite grasped the reason for also adding the
> override to swiotlb_sync_single_for_device() in aa6f8dcbab47, but I
> think by that point we were increasingly tired and confused and starting
> to second-guess ourselves (well, I was, at least).
I raised the question, do we need to do the same for
swiotlb_sync_single_for_device(). Did that based on my understanding of the
DMA API documentation. I had the following scenario in mind
SWIOTLB without the snyc_single:
Memory Bounce buffer Owner
--------------------------------------------------------------------------
start 12345678 xxxxxxxx C
dma_map(DMA_FROM_DEVICE) 12345678 -> 12345678 C->D
device writes partial data 12345678 12ABC678 <- ABC D
sync_for_cpu(DMA_FROM_DEVICE) 12ABC678 <- 12ABC678 D->C
cpu modifies buffer 66666666 12ABC678 C
sync_for_device(DMA_FROM_DEVICE) 66666666 12ABC678 C->D
device writes partial data 66666666 1EFGC678 <-EFG D
dma_unmap(DMA_FROM_DEVICE) 1EFGC678 <- 1EFGC678 D->C
Legend: in Owner column C stands for cpu and D for device.
Without swiotlb, I believe we should have arrived at 6EFG6666. To get the
same result, IMHO, we need to do a sync in sync_for_device().
And aa6f8dcbab47 is an imperfect solution to that (because of size).
> I don't think it's
> wrong per se, but as I said I do think it can bite anyone who's been
> doing dma_sync_*() wrong but getting away with it until now.
I fully agree.
> If
> ddbd89deb7d3 alone turns out to work OK then I'd be inclined to try a
> partial revert of just that one hunk.
>
I'm not against being pragmatic and doing the partial revert. But as
explained above, I do believe for correctness of swiotlb we ultimately
do need that change. So if the revert is the short term solution,
what should be our mid-term road-map?
Regards,
Halil
> Thanks,
> Robin.
Powered by blists - more mailing lists