Message-ID: <8a60361f-b914-93ef-0d80-92ae4ad8b808@nvidia.com>
Date: Mon, 29 Jul 2019 22:33:04 +0100
From: Jon Hunter <jonathanh@...dia.com>
To: Jose Abreu <Jose.Abreu@...opsys.com>,
Robin Murphy <robin.murphy@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-stm32@...md-mailman.stormreply.com"
<linux-stm32@...md-mailman.stormreply.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>
CC: Joao Pinto <Joao.Pinto@...opsys.com>,
Alexandre Torgue <alexandre.torgue@...com>,
Maxime Ripard <maxime.ripard@...tlin.com>,
Chen-Yu Tsai <wens@...e.org>,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
linux-tegra <linux-tegra@...r.kernel.org>,
Giuseppe Cavallaro <peppe.cavallaro@...com>,
"David S . Miller" <davem@...emloft.net>
Subject: Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page Pool

On 29/07/2019 15:08, Jose Abreu wrote:
...
>>> Hi Catalin and Will,
>>>
>>> Sorry to add you in such a long thread but we are seeing a DMA issue
>>> with stmmac driver in an ARM64 platform with IOMMU enabled.
>>>
>>> The issue seems to be solved when buffers allocation for DMA based
>>> transfers are *not* mapped with the DMA_ATTR_SKIP_CPU_SYNC flag *OR*
>>> when IOMMU is disabled.
>>>
>>> Notice that after transfer is done we do use
>>> dma_sync_single_for_{cpu,device} and then we reuse *the same* page for
>>> another transfer.
>>>
>>> Can you please comment on whether DMA_ATTR_SKIP_CPU_SYNC can not be used
>>> in ARM64 platforms with IOMMU ?
>>
>> In terms of what they do, there should be no difference on arm64 between:
>>
>> dma_map_page(..., dir);
>> ...
>> dma_unmap_page(..., dir);
>>
>> and:
>>
>> dma_map_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
>> dma_sync_single_for_device(..., dir);
>> ...
>> dma_sync_single_for_cpu(..., dir);
>> dma_unmap_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
>>
>> provided that the first sync covers the whole buffer and any subsequent
>> ones cover at least the parts of the buffer which may have changed. Plus
>> for coherent hardware it's entirely moot either way.
>
> Thanks for confirming. That's indeed what stmmac is doing when buffer is
> received by syncing the packet size to CPU.
>
>>
>> Given Jon's previous findings, I would lean towards the idea that
>> performing the extra (redundant) cache maintenance plus barrier in
>> dma_unmap is mostly just perturbing timing in the same way as the debug
>> print which also made things seem OK.
>
> Mikko said that Tegra186 is not coherent so we have to explicit flush
> pipeline but I don't understand why sync_single() is not doing it ...
>
> Jon, can you please remove *all* debug prints, hacks, etc ... and test
> this one in attach with plain -net tree ?

So far I have just been testing on the mainline kernel branch. The issue
still persists after applying this on mainline. I can test on the -net
tree, but I am not sure that will make a difference.

Cheers
Jon
--
nvpublic