lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3255edfa-4465-204b-4751-8d40c8fb1382@arm.com>
Date:   Tue, 23 Jul 2019 14:19:55 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Jon Hunter <jonathanh@...dia.com>,
        Jose Abreu <Jose.Abreu@...opsys.com>,
        Lars Persson <lists@...h.nu>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>
Cc:     Joao Pinto <Joao.Pinto@...opsys.com>,
        Alexandre Torgue <alexandre.torgue@...com>,
        Maxime Ripard <maxime.ripard@...tlin.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-stm32@...md-mailman.stormreply.com" 
        <linux-stm32@...md-mailman.stormreply.com>,
        Chen-Yu Tsai <wens@...e.org>,
        Maxime Coquelin <mcoquelin.stm32@...il.com>,
        linux-tegra <linux-tegra@...r.kernel.org>,
        Giuseppe Cavallaro <peppe.cavallaro@...com>,
        "David S . Miller" <davem@...emloft.net>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH net-next 3/3] net: stmmac: Introducing support for Page
 Pool

On 23/07/2019 13:09, Jon Hunter wrote:
> 
> On 23/07/2019 11:29, Robin Murphy wrote:
>> On 23/07/2019 11:07, Jose Abreu wrote:
>>> From: Jon Hunter <jonathanh@...dia.com>
>>> Date: Jul/23/2019, 11:01:24 (UTC+00:00)
>>>
>>>> This appears to be a winner and by disabling the SMMU for the ethernet
>>>> controller and reverting commit 954a03be033c7cef80ddc232e7cbdb17df735663
>>>> this worked! So yes appears to be related to the SMMU being enabled. We
>>>> had to enable the SMMU for ethernet recently due to commit
>>>> 954a03be033c7cef80ddc232e7cbdb17df735663.
>>>
>>> Finally :)
>>>
>>> However, from "git show 954a03be033c7cef80ddc232e7cbdb17df735663":
>>>
>>> +         There are few reasons to allow unmatched stream bypass, and
>>> +         even fewer good ones.  If saying YES here breaks your board
>>> +         you should work on fixing your board.
>>>
>>> So, how can we fix this ? Is your ethernet DT node marked as
>>> "dma-coherent;" ?
>>
>> The first thing to try would be booting the failing setup with
>> "iommu.passthrough=1" (or using CONFIG_IOMMU_DEFAULT_PASSTHROUGH) - if
>> that makes things seem OK, then the problem is likely related to address
>> translation; if not, then it's probably time to start looking at nasties
>> like coherency and ordering, although in principle I wouldn't expect the
>> SMMU to have too much impact there.
> 
> Setting "iommu.passthrough=1" works for me. However, I am not sure where
> to go from here, so any ideas you have would be great.

OK, so that really implies it's something to do with the addresses. From 
a quick skim of the patch, I'm wondering if it's possible for buf->addr 
and buf->page->dma_addr to get out-of-sync at any point. The nature of 
the IOVA allocator makes it quite likely that a stale DMA address will 
have been reused for a new mapping, so putting the wrong address in a 
descriptor may well mean the DMA still ends up hitting a valid 
translation, but which is now pointing to a different page.

>> Do you know if the SMMU interrupts are working correctly? If not, it's
>> possible that an incorrect address or mapping direction could lead to
>> the DMA transaction just being silently terminated without any fault
>> indication, which generally presents as inexplicable weirdness (I've
>> certainly seen that on another platform with the mix of an unsupported
>> interrupt controller and an 'imperfect' ethernet driver).
> 
> If I simply remove the iommu node for the ethernet controller, then I
> see lots of ...
> 
> [    6.296121] arm-smmu 12000000.iommu: Unexpected global fault, this could be serious
> [    6.296125] arm-smmu 12000000.iommu:         GFSR 0x00000002, GFSYNR0 0x00000000, GFSYNR1 0x00000014, GFSYNR2 0x00000000
> 
> So I assume that this is triggering the SMMU interrupt correctly.

According to tegra186.dtsi it appears you're using the MMU-500 combined 
interrupt, so if global faults are being delivered then context faults 
*should* also, but I'd be inclined to try a quick hack of the relevant 
stmmac_desc_ops::set_addr callback to write some bogus unmapped address 
just to make sure arm_smmu_context_fault() then screams as expected, and 
we're not missing anything else.

Robin.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ