Message-ID: <87a6de80em.fsf@toke.dk>
Date:   Fri, 25 Mar 2022 17:25:21 +0100
From:   Toke Høiland-Jørgensen <toke@...e.dk>
To:     mbizon@...ebox.fr, Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Robin Murphy <robin.murphy@....com>,
        Christoph Hellwig <hch@....de>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        Halil Pasic <pasic@...ux.ibm.com>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Kalle Valo <kvalo@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Olha Cherevyk <olha.cherevyk@...il.com>,
        iommu <iommu@...ts.linux-foundation.org>,
        linux-wireless <linux-wireless@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        stable <stable@...r.kernel.org>
Subject: Re: [REGRESSION] Recent swiotlb DMA_FROM_DEVICE fixes break
 ath9k-based AP

Maxime Bizon <mbizon@...ebox.fr> writes:

> On Thu, 2022-03-24 at 12:26 -0700, Linus Torvalds wrote:
>
>> 
>> It's actually very natural in that situation to flush the caches from
>> the CPU side again. And so dma_sync_single_for_device() is a fairly
>> reasonable thing to do in that situation.
>> 
>
> In the non-cache-coherent scenario, and assuming dma_map() did an
> initial cache invalidation, you can write this:
>
> rx_buffer_complete_1(buf)
> {
> 	invalidate_cache(buf, size)
> 	if (!is_ready(buf))
> 		return;
> 	<proceed with receive>
> }
>
> or 
>
> rx_buffer_complete_2(buf)
> {
> 	if (!is_ready(buf)) {
> 		invalidate_cache(buf, size)
> 		return;
> 	}
> 	<proceed with receive>
> }
>
> The latter is preferred for performance because dma_map() did the
> initial invalidate.
>
> Of course you could write:
>
> rx_buffer_complete_3(buf)
> {
> 	invalidate_cache(buf, size)
> 	if (!is_ready(buf)) {
> 		invalidate_cache(buf, size)
> 		return;
> 	}
> 	<proceed with receive>
> }
>
>
> but the extra invalidate is a waste of CPU cycles.
>
> So I'd be very cautious about assuming that sync_for_cpu() and
> sync_for_device() both do invalidation in existing implementations of
> the arch DMA ops; implementers may have taken some liberties with the
> DMA-API to avoid unnecessary cache operations (not to blame them).

I sense an implicit "and the driver can't (or shouldn't) influence
this" here, right?

> For example, looking at arch/arm/mm/dma-mapping.c for DMA_FROM_DEVICE:
>
> sync_single_for_device()
>   => __dma_page_cpu_to_dev()
>     => dma_cache_maint_page(op=dmac_map_area)
>       => cpu_cache.dma_map_area()
>
> sync_single_for_cpu()
>   => __dma_page_dev_to_cpu()
>     => dma_cache_maint_page(op=dmac_unmap_area)
>       => cpu_cache.dma_unmap_area()
>
> dma_map_area() always does a cache invalidate.
>
> But for a couple of CPU variants, dma_unmap_area() is a no-op, so
> sync_for_cpu() does nothing.
>
> Toke's patch will break ath9k on those platforms (mostly silent
> breakage: rx corruption leading to bad performance).

Okay, so that would obviously be bad. If I'm reading you correctly
(cf. my question above), we can't fix this properly from the driver
side, and we should go with the partial SWIOTLB revert instead?
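
(To spell out my understanding of the failure mode on such a platform,
assuming the driver-side fix ends up relying on sync_for_cpu() alone -
a sketch, not actual driver code:

	dma_sync_single_for_cpu(dev, buf_dma, size, DMA_FROM_DEVICE);
		/* dma_unmap_area() is a no-op here, so no invalidate */
	if (!desc_is_ready(buf))
		return;		/* may have read stale cache lines, and
				 * nothing re-invalidates before the
				 * next check either */
	/* proceed with receive - possibly with stale rx data */

i.e. mostly-working rx with intermittent corruption, matching what you
describe.)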

-Toke
