Message-ID: <f207da41-5222-4d1d-897d-9b288e33a547@intel.com>
Date: Thu, 9 May 2024 16:43:00 +0200
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Robin Murphy <robin.murphy@....com>, Steven Price <steven.price@....com>,
Christoph Hellwig <hch@....de>
CC: Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Marek Szyprowski <m.szyprowski@...sung.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, "Rafael J. Wysocki" <rafael@...nel.org>,
Magnus Karlsson <magnus.karlsson@...el.com>,
<nex.sw.ncis.osdt.itp.upstreaming@...el.com>, <bpf@...r.kernel.org>,
<netdev@...r.kernel.org>, <iommu@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 2/7] dma: avoid redundant calls for sync operations
From: Robin Murphy <robin.murphy@....com>
Date: Thu, 9 May 2024 15:33:13 +0100
> On 09/05/2024 2:43 pm, Steven Price wrote:
>> On 07/05/2024 12:20, Alexander Lobakin wrote:
>>> Quite often, devices do not need dma_sync operations on x86_64 at least.
>>> Indeed, when dev_is_dma_coherent(dev) is true and
>>> dev_use_swiotlb(dev) is false, iommu_dma_sync_single_for_cpu()
>>> and friends do nothing.
>>>
>>> However, indirectly calling them when CONFIG_RETPOLINE=y consumes about
>>> 10% of cycles on a cpu receiving packets from softirq at ~100Gbit rate.
>>> Even if/when CONFIG_RETPOLINE is not set, there is a cost of about 3%.
>>>
>>> Add dev->need_dma_sync boolean and turn it off during the device
>>> initialization (dma_set_mask()) depending on the setup:
>>> dev_is_dma_coherent() for the direct DMA, !(sync_single_for_device ||
>>> sync_single_for_cpu) or the new dma_map_ops flag, %DMA_F_CAN_SKIP_SYNC,
>>> advertised for non-NULL DMA ops.
>>> Then later, if/when swiotlb is used for the first time, the flag
>>> is reset back to on, from swiotlb_tbl_map_single().
>>>
>>> On iavf, the UDP trafficgen with XDP_DROP in skb mode test shows
>>> +3-5% increase for direct DMA.
>>>
>>> Suggested-by: Christoph Hellwig <hch@....de> # direct DMA shortcut
>>> Co-developed-by: Eric Dumazet <edumazet@...gle.com>
>>> Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
>>
>> I've bisected a boot failure (on a Firefly RK3288) to this commit.
>> AFAICT the problem is that I have (at least) two drivers which don't
>> call dma_set_mask() and therefore never initialise the new dma_need_sync
>> variable.
>>
>> The specific drivers are "rockchip-drm" and "rk_gmac-dwmac". Is it a
>> requirement that all drivers engaging in DMA should call dma_set_mask(),
>> meaning this has uncovered a bug in those drivers? Or is the
>> assumption that all drivers call dma_set_mask() faulty?
>
> Historically it's long been documented (at least in DMA-API-HOWTO) that
> a 32-bit DMA mask is assumed by default, so as much as we would prefer
> to shift expectations, there are still going to be a great many drivers
> relying on that :(
>
> Perhaps it's time for dma-debug to start warning about implicit mask
> usage, maybe that might help push the agenda a bit?
I also thought of this, but currently don't know how to detect whether a
driver has called dma_set_mask*().
The fix will arrive in several minutes.
>
> Thanks,
> Robin.
Thanks,
Olek