Message-ID: <fcf648d2-70cc-d734-871a-ca7f745791b7@arm.com>
Date: Mon, 29 Jul 2019 12:52:02 +0100
From: Robin Murphy <robin.murphy@....com>
To: Jose Abreu <Jose.Abreu@...opsys.com>,
Jon Hunter <jonathanh@...dia.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-stm32@...md-mailman.stormreply.com"
<linux-stm32@...md-mailman.stormreply.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>
Cc: Joao Pinto <Joao.Pinto@...opsys.com>,
Alexandre Torgue <alexandre.torgue@...com>,
Maxime Ripard <maxime.ripard@...tlin.com>,
Chen-Yu Tsai <wens@...e.org>,
Maxime Coquelin <mcoquelin.stm32@...il.com>,
linux-tegra <linux-tegra@...r.kernel.org>,
Giuseppe Cavallaro <peppe.cavallaro@...com>,
"David S . Miller" <davem@...emloft.net>
Subject: Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page
Pool
On 29/07/2019 12:29, Jose Abreu wrote:
> ++ Catalin, Will (ARM64 Maintainers)
>
> From: Jon Hunter <jonathanh@...dia.com>
> Date: Jul/29/2019, 11:55:18 (UTC+00:00)
>
>>
>> On 29/07/2019 09:16, Jose Abreu wrote:
>>> From: Jose Abreu <joabreu@...opsys.com>
>>> Date: Jul/27/2019, 16:56:37 (UTC+00:00)
>>>
>>>> From: Jon Hunter <jonathanh@...dia.com>
>>>> Date: Jul/26/2019, 15:11:00 (UTC+00:00)
>>>>
>>>>>
>>>>> On 25/07/2019 16:12, Jose Abreu wrote:
>>>>>> From: Jon Hunter <jonathanh@...dia.com>
>>>>>> Date: Jul/25/2019, 15:25:59 (UTC+00:00)
>>>>>>
>>>>>>>
>>>>>>> On 25/07/2019 14:26, Jose Abreu wrote:
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>> Well, I wasn't expecting that :/
>>>>>>>>
>>>>>>>> Per the documentation of barriers, I think we should set the
>>>>>>>> descriptor fields, then a barrier, and finally the ownership bit for
>>>>>>>> the HW, so that the remaining fields are coherent before the owner is
>>>>>>>> set.
>>>>>>>>
>>>>>>>> Anyway, can you also add a dma_rmb() after the call to
>>>>>>>> stmmac_rx_status() ?
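(For reference, the ordering described above is the standard descriptor
handover pattern from Documentation/memory-barriers.txt; a minimal sketch
with hypothetical field and bit names, not the actual stmmac descriptor
layout:)

#include <linux/bits.h>		/* BIT() */
#include <linux/types.h>
#include <asm/barrier.h>	/* dma_wmb(), dma_rmb() */

/* Hypothetical RX descriptor, for illustration only. */
struct rx_desc {
	u32 buf_addr;
	u32 buf_len;
	u32 status;		/* bit 31: OWN, set while HW owns the desc */
};

#define RDES_OWN	BIT(31)

static void desc_give_to_hw(struct rx_desc *p, u32 addr, u32 len)
{
	p->buf_addr = addr;
	p->buf_len = len;

	/* Publish all descriptor fields to the device... */
	dma_wmb();

	/* ...before handing over ownership, so HW never sees stale fields. */
	p->status = RDES_OWN;
}

static bool desc_done_by_hw(struct rx_desc *p)
{
	if (p->status & RDES_OWN)
		return false;	/* hardware still owns it */

	/* Order the OWN check before reads of the other fields. */
	dma_rmb();
	return true;
}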
>>>>>>>
>>>>>>> Yes. I removed the debug print and added the barrier, but that did
>>>>>>> not help.
>>>>>>
>>>>>> So, I was finally able to set up NFS using your replicated setup and I
>>>>>> can't see the issue :(
>>>>>>
>>>>>> The only difference I have from yours is that I'm using TCP in NFS
>>>>>> whilst you (I believe from the logs) use UDP.
>>>>>
>>>>> So I tried TCP by setting the kernel boot params to 'nfsvers=3' and
>>>>> 'proto=tcp' and this does appear to be more stable, but not 100% stable.
>>>>> It still appears to fail in the same place about 50% of the time.
>>>>>
>>>>>> You do have flow control active, right? And your HW FIFO size is >= 4k?
>>>>>
>>>>> How can I verify if flow control is active?
>>>>
>>>> You can check it by dumping register MTL_RxQ_Operation_Mode (0xd30).
>>
>> Where would be the appropriate place to dump this? After probe? It might
>> be best if you can share a code snippet showing where to dump this.
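(Not the actual stmmac code, but the sort of snippet that would do, e.g.
temporarily dropped in at the end of stmmac_hw_setup() once the MTL
registers have been programmed; priv->ioaddr is the mapped register base,
and per the DWMAC4 databook EHFC is bit 7 of MTL_RxQ_Operation_Mode:)

u32 mtl_rx_op = readl(priv->ioaddr + 0x0d30);	/* MTL_RxQ0_Operation_Mode */

/* EHFC set means hardware flow control is active for the queue. */
dev_info(priv->device, "MTL_RxQ0_Op_Mode = 0x%08x, EHFC = %u\n",
	 mtl_rx_op, !!(mtl_rx_op & BIT(7)));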
>>
>>>> Can you also add IOMMU debug in file "drivers/iommu/iommu.c"?
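(One low-effort way to get that, assuming CONFIG_DYNAMIC_DEBUG is enabled,
is to turn on all the pr_debug()/dev_dbg() calls in that file from the
kernel command line:)

dyndbg="file drivers/iommu/iommu.c +p"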
>>
>> You can find a boot log here:
>>
>> https://paste.ubuntu.com/p/qtRqtYKHGF/
>>
>>> And, please try attached debug patch.
>>
>> With this patch it appears to boot fine. So far no issues seen.
>
> Thank you for testing.
>
> Hi Catalin and Will,
>
> Sorry to add you to such a long thread, but we are seeing a DMA issue
> with the stmmac driver on an ARM64 platform with the IOMMU enabled.
>
> The issue seems to be solved when the buffers allocated for DMA-based
> transfers are *not* mapped with the DMA_ATTR_SKIP_CPU_SYNC flag, *OR*
> when the IOMMU is disabled.
>
> Notice that after a transfer is done we do use
> dma_sync_single_for_{cpu,device}, and then we reuse *the same* page for
> another transfer.
>
> Can you please comment on whether DMA_ATTR_SKIP_CPU_SYNC cannot be used
> on ARM64 platforms with an IOMMU?
In terms of what they do, there should be no difference on arm64 between:
dma_map_page(..., dir);
...
dma_unmap_page(..., dir);
and:
dma_map_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
dma_sync_single_for_device(..., dir);
...
dma_sync_single_for_cpu(..., dir);
dma_unmap_page_attrs(..., dir, DMA_ATTR_SKIP_CPU_SYNC);
provided that the first sync covers the whole buffer and any subsequent
ones cover at least the parts of the buffer which may have changed. Plus
for coherent hardware it's entirely moot either way.
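Concretely, the second pattern would be expected to look something like
this over the buffer's lifetime (a sketch of the general shape, not the
exact stmmac code):

#include <linux/dma-mapping.h>

static int rx_use_page(struct device *dev, struct page *page, size_t len)
{
	dma_addr_t addr;

	/* Map once, deferring all cache maintenance to explicit syncs. */
	addr = dma_map_page_attrs(dev, page, 0, len, DMA_FROM_DEVICE,
				  DMA_ATTR_SKIP_CPU_SYNC);
	if (dma_mapping_error(dev, addr))
		return -ENOMEM;

	/* First sync covers the whole buffer before the device writes it. */
	dma_sync_single_for_device(dev, addr, len, DMA_FROM_DEVICE);

	/* ... device DMAs a packet in; driver waits for completion ... */

	/* Hand the buffer back to the CPU before reading the packet. */
	dma_sync_single_for_cpu(dev, addr, len, DMA_FROM_DEVICE);

	/*
	 * At this point the same page can be recycled: another
	 * sync_for_device and repost, with no unmap/remap in between.
	 */

	dma_unmap_page_attrs(dev, addr, len, DMA_FROM_DEVICE,
			     DMA_ATTR_SKIP_CPU_SYNC);
	return 0;
}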
Given Jon's previous findings, I would lean towards the idea that
performing the extra (redundant) cache maintenance plus the barrier in
dma_unmap is mostly just perturbing the timing, in the same way as the
debug print which also made things seem OK.
Robin.