Message-ID: <89dba89c-cb49-f917-31e4-3eafd484f4b2@arm.com>
Date: Tue, 30 May 2023 14:44:11 +0100
From: Robin Murphy <robin.murphy@....com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Alistair Popple <apopple@...dia.com>,
Andrew Morton <akpm@...ux-foundation.org>, will@...nel.org,
catalin.marinas@....com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, nicolinc@...dia.com,
linux-arm-kernel@...ts.infradead.org, kvm@...r.kernel.org,
John Hubbard <jhubbard@...dia.com>, zhi.wang.linux@...il.com,
Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH 2/2] arm64: Notify on pte permission upgrades

On 30/05/2023 1:52 pm, Jason Gunthorpe wrote:
> On Tue, May 30, 2023 at 01:14:41PM +0100, Robin Murphy wrote:
>> On 2023-05-30 12:54, Jason Gunthorpe wrote:
>>> On Tue, May 30, 2023 at 06:05:41PM +1000, Alistair Popple wrote:
>>>>
>>>>>> As no notification is sent and the SMMU does not snoop TLB invalidates
>>>>>> it will continue to return read-only entries to a device even though
>>>>>> the CPU page table contains a writable entry. This leads to a
>>>>>> continually faulting device and no way of handling the fault.
>>>>>
>>>>> Doesn't the fault generate a PRI/etc? If we get a PRI maybe we should
>>>>> just have the iommu driver push an iotlb invalidation command before
>>>>> it acks it? PRI is already really slow so I'm not sure a pipelined
>>>>> invalidation is going to be a problem? Does the SMMU architecture
>>>>> permit negative caching which would suggest we need it anyhow?
>>>>
>>>> Yes, the SMMU architecture (which matches the ARM architecture with
>>>> regard to TLB maintenance requirements) permits negative caching of
>>>> some mapping attributes, including the read-only attribute. Hence
>>>> without the flushing we fault continuously.
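
(Context for anyone skimming the thread: the mechanism both of these
points hinge on is the mmu_notifier ->invalidate_range() callback, which
is the only hook a secondary TLB such as the SMMU's gets in order to
shoot down its cached copies when the CPU page table changes. A minimal
sketch of such a subscriber - the my_sva_* names are purely illustrative,
not the actual SMMU driver code:

	#include <linux/mmu_notifier.h>

	/* Hypothetical per-mm SVA binding */
	struct my_sva_bond {
		struct mmu_notifier	mn;
		u16			asid;	/* ASID shared with the CPU */
	};

	static void my_sva_flush_range(struct my_sva_bond *bond,
				       unsigned long start, size_t size)
	{
		/* issue TLBI-by-ASID-and-VA plus ATC invalidation here */
	}

	static void my_sva_invalidate_range(struct mmu_notifier *mn,
					    struct mm_struct *mm,
					    unsigned long start,
					    unsigned long end)
	{
		struct my_sva_bond *bond =
			container_of(mn, struct my_sva_bond, mn);

		my_sva_flush_range(bond, start, end - start);
	}

	static const struct mmu_notifier_ops my_sva_mn_ops = {
		.invalidate_range	= my_sva_invalidate_range,
	};

If the permission-upgrade path never reaches that callback, nothing ever
tells the IOMMU to drop a negatively-cached read-only entry, which is
the stuck-fault scenario described above.)
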
>>>
>>> Sounds like a straight up SMMU bug, invalidate the cache after
>>> resolving the PRI event.
>>
>> No, if the IOPF handler calls back into the mm layer to resolve the fault,
>> and the mm layer issues an invalidation in the process of that which isn't
>> propagated back to the SMMU (as it would be if BTM were in use), logically
>> that's the mm layer's failing. The SMMU driver shouldn't have to issue extra
>> mostly-redundant invalidations just because different CPU architectures have
>> different idiosyncrasies around caching of permissions.
>
> The mm has a definition for invalidate_range that does not include all
> the invalidation points the SMMU needs. This is difficult to sort out
> because this is general-purpose, cross-arch stuff.
>
> You are right that this is worth optimizing, but right now we have an
> -rc bug that needs fixing, and adding an extra SMMU invalidation is a
> straightforward, -rc-friendly way to address it.
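
(To make "does not include all the invalidation points" concrete: the
case at hand is a PTE changing from read-only to writable when the CPU
handles a write fault on a page it can already read. The shape of the
fix under discussion, heavily paraphrased rather than the literal
patch 2/2 diff:

	/*
	 * Illustrative only: somewhere in the arch's permission-upgrade
	 * path, once the new (writable) PTE has been written...
	 */
	set_pte_at(mm, address, ptep, new_pte);

	/*
	 * ...the CPU TLB maintenance issued for this is only visible to
	 * the SMMU when BTM is in use; for everyone else, an explicit
	 * notifier call is the only way a secondary TLB which negatively
	 * cached the old read-only entry ever finds out:
	 */
	mmu_notifier_invalidate_range(mm, address & PAGE_MASK,
				      (address & PAGE_MASK) + PAGE_SIZE);

i.e. that one call is the invalidation point which currently doesn't
exist.)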

Sure; to clarify, I'm not against the overall idea of putting a hack in
the SMMU driver, with a big comment that it is a hack to work around
missing notifications under SVA, but declining to do that would not
constitute an "SMMU bug". The SMMU is just another VMSAv8-compatible
MMU - if, say, KVM or some other arm64 hypervisor driver wanted to do
something funky with notifiers to shadow stage 1 permissions for some
reason, it would presumably be equally affected.
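
To be concrete about what such a hack might look like - roughly the
below in the IOPF completion path, before the response goes back to the
device; the names are invented for illustration, not the actual driver
entry points:

	static void my_smmu_iopf_done(struct my_smmu_domain *domain,
				      unsigned long iova)
	{
		/*
		 * HACK: handle_mm_fault() may have upgraded the PTE
		 * from read-only to writable without any notification
		 * reaching us, and we may have negatively cached the
		 * old permission. Invalidate the faulting page before
		 * acking the fault so the device's retry sees the
		 * up-to-date entry. Remove once the core mm notifies
		 * on permission upgrades.
		 */
		my_smmu_inv_range(domain, iova & PAGE_MASK, PAGE_SIZE);
	}

It would be mostly redundant for every other kind of fault, which is why
I'd still rather the mm layer got this right eventually, but as an
-rc-sized bodge it would do the job.
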
FWIW, the VT-d spec seems to suggest that invalidation on RO->RW is only
optional if the requester supports recoverable page faults, so although
there's no use-case for non-PRI-based SVA at the moment, there is some
potential argument that the notifier issue generalises even to x86.

Thanks,
Robin.