lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 5 Oct 2019 00:27:23 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Tim Harvey <tharvey@...eworks.com>
Cc:     Tirumalesh Chalamarla <tchalamarla@...iumnetworks.com>,
        Douglas Anderson <dianders@...omium.org>,
        Joerg Roedel <joro@...tes.org>,
        Will Deacon <will.deacon@....com>,
        linux-arm-msm@...r.kernel.org, evgreen@...omium.org,
        tfiga@...omium.org, Rob Clark <robdclark@...il.com>,
        iommu@...ts.linux-foundation.org,
        linux-arm-kernel@...ts.infradead.org,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] iommu/arm-smmu: Break insecure users by disabling
 bypass by default

On 2019-10-04 9:37 pm, Tim Harvey wrote:
> On Fri, Oct 4, 2019 at 11:34 AM Robin Murphy <robin.murphy@....com> wrote:
>>
>> On 04/10/2019 18:13, Tim Harvey wrote:
>> [...]
>>>>> No difference... still need 'arm-smmu.disable_bypass=n' to boot. Are
>>>>> all four iommu-map props above supposed to be the same? Seems to me
>>>>> they all point to the same thing which looks wrong.
>>>>
>>>> Hmm... :/
>>>>
>>>> Those mappings just set Stream ID == PCI RID (strictly each one should
>>>> only need to cover the bus range assigned to that bridge, but it's not
>>>> crucial) which is the same thing the driver assumes for the mmu-masters
>>>> property, so either that's wrong and never could have worked anyway -
>>>> have you tried VFIO on this platform? - or there are other devices also
>>>> mastering through the SMMU that aren't described at all. Are you able to
>>>> capture a boot log? The SMMU faults do encode information about the
>>>> offending ID, and you can typically correlate their appearance
>>>> reasonably well with endpoint drivers probing.
>>>>
>>>
>>> Robin,
>>>
>>> VFIO is enabled in the kernel but I don't know anything about how to
>>> test/use it:
>>> $ grep VFIO .config
>>> CONFIG_KVM_VFIO=y
>>> CONFIG_VFIO_IOMMU_TYPE1=y
>>> CONFIG_VFIO_VIRQFD=y
>>> CONFIG_VFIO=y
>>> # CONFIG_VFIO_NOIOMMU is not set
>>> CONFIG_VFIO_PCI=y
>>> CONFIG_VFIO_PCI_MMAP=y
>>> CONFIG_VFIO_PCI_INTX=y
>>> # CONFIG_VFIO_PLATFORM is not set
>>> # CONFIG_VFIO_MDEV is not set
>>
>> No worries - since it's a networking-focused SoC I figured there was a
>> chance you might be using DPDK or similar userspace drivers with the NIC
>> VFs, but I was just casting around for a quick and easy baseline of
>> whether the SMMU works at all (another way would be using Qemu to run a
>> VM with one or more PCI devices assigned).
>>
>>> I do have a boot console yet I'm not seeing any smmu faults at all.
>>> Perhaps I've mis-diagnosed the issue completely. To be clear when I
>>> boot with arm-smmu.disable_bypass=y the serial console appears to not
>>> accept input in userspace and with arm-smmu.disable_bypass=n I'm fine.
>>> I'm using a buildroot initramfs rootfs for simplicity. The system
>>> isn't hung as I originally expected as the LED heartbeat trigger
>>> continues blinking... I just can't get console to accept input.
>>
>> Curiouser and curiouser... I'm inclined to suspect that the interrupt
>> configuration might also be messed up, such that the SMMU is blocking
>> traffic and jammed up due to pending faults, but you're not getting the
>> IRQ delivered to find out. Does this patch help reveal anything?
>>
>> http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=29ac3648b580920692c9b417b2fc606995826517
>>
>> (untested, but it's a direct port of the one I've used for SMMUv3 to
>> diagnose something similar)
> 
> This shows:

Yay (ish)!

[ and the tangential challenge would be to find out what the real global 
fault interrupt is, 'cause apparently it's not SPI 68... ]

> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000140, GFSYNR2 0x00000000

If that stream ID were a PCI RID, it would be 01:08.0

> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000

And this guy would be 00:02.0

So it seems that either the stream ID mapping is non-trivial (and 
"mmu-masters" couldn't have worked), or there are secret magic endpoints 
writing to memory during boot. Either way it probably needs some input 
from Cavium/Marvell to get straight. In the meantime, unless just 
disabling and ignoring the SMMU altogether is a viable option, I guess 
we have to resign to this being one of those "non-good" reasons for 
needing global bypass :(

Robin.

(note to self: it would probably be useful if we dump GFAR in these logs 
too. These are all writes, so it's possible they could be MSI attempts 
targeting the ITS rather than DMA as such)

> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> ...
> arm-smmu 830000000000.smmu0: Unexpected global fault, this could be serious
> arm-smmu 830000000000.smmu0:     GFSR 0x80000002, GFSYNR0 0x00000002,
> GFSYNR1 0x00000010, GFSYNR2 0x00000000
> ^^^ these two repeat over and over
> 
>>
>> That said, it's also puzzling that no other drivers are reporting DMA
>> errors or timeouts either - is there any chance that some device is set
>> running by the firmware/bootloader and not taken over by a kernel driver?
>>
> 
> anything is possible - I'm using the Cavium 'BDK' as boot firmware to
> configure the board which sits in from of arm trusted firmare and
> bootloader.
> 
> Tim
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ