linux-kernel - Re: [PATCH v2 5/7] KVM: arm64: MTE: Use stage-2 NoTagAccess memory attribute if supported

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <bfd6361b-1a6f-43e5-9ad1-80692c146d61@arm.com>
Date: Tue, 14 Jan 2025 09:55:32 +0000
From: Suzuki K Poulose <suzuki.poulose@....com>
To: Peter Collingbourne <pcc@...gle.com>,
 Catalin Marinas <catalin.marinas@....com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
 linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 kvmarm@...ts.linux.dev, Steven Price <steven.price@....com>,
 Will Deacon <will@...nel.org>, Marc Zyngier <maz@...nel.org>,
 Mark Rutland <mark.rutland@....com>, Oliver Upton <oliver.upton@...ux.dev>,
 Joey Gouly <joey.gouly@....com>, Zenghui Yu <yuzenghui@...wei.com>
Subject: Re: [PATCH v2 5/7] KVM: arm64: MTE: Use stage-2 NoTagAccess memory
 attribute if supported

Hi,


On 13/01/2025 20:47, Peter Collingbourne wrote:
> On Mon, Jan 13, 2025 at 11:09 AM Catalin Marinas
> <catalin.marinas@....com> wrote:
>>
>> On Sat, Jan 11, 2025 at 06:49:55PM +0530, Aneesh Kumar K.V wrote:
>>> Catalin Marinas <catalin.marinas@....com> writes:
>>>> On Fri, Jan 10, 2025 at 04:30:21PM +0530, Aneesh Kumar K.V (Arm) wrote:
>>>>> Currently, the kernel won't start a guest if the MTE feature is enabled
>>>
>>> ...
>>>
>>>>> @@ -2152,7 +2162,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
>>>>>             if (!vma)
>>>>>                     break;
>>>>>
>>>>> -          if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) {
>>>>> +          if (kvm_has_mte(kvm) &&
>>>>> +              !kvm_has_mte_perm(kvm) && !kvm_vma_mte_allowed(vma)) {
>>>>>                     ret = -EINVAL;
>>>>>                     break;
>>>>>             }
>>>>
>>>> I don't think we should change this, or at least not how it's done above
>>>> (Suzuki raised a related issue internally relaxing this for VM_PFNMAP).
>>>>
>>>> For standard memory slots, we want to reject them upfront rather than
>>>> deferring to the fault handler. An example here is file mmap() passed as
>>>> standard RAM to the VM. It's an unnecessary change in behaviour IMHO.
>>>> I'd only relax this for VM_PFNMAP mappings further down in this
>>>> function (and move the VM_PFNMAP check above; see Suzuki's internal
>>>> patch, unless he posted it publicly already).

For the record, here is the patch Catalin was referring to.


--->8---

kvm: arm64: Allow device mappings with MTE

When a VM is configured to use MTE, we prevent mapping a "Device" to the VM.

e.g: with kvmtool (with additional debugging to dump error code and 
addresses)

$ lkvm run ... --vfio 0000:01:00.0 ...

  Info: Using IOMMU type 3 for VFIO container
    Error: 0000:01:00.0: failed to register region with KVM 
50020000-50022000: -22
       Warning: [0abc:aced] Error activating emulation for BAR 0
       Error: 0000:01:00.0: failed to configure regions
       Warning: Failed init: vfio__init

Only check for the MTE allowed for a VMA if it is not backed by "device"

Fixes: ea7fc1bb1cd1b ("KVM: arm64: Introduce MTE VM feature")
Cc: Steven Price <steven.price at arm.com>
Cc: Marc Zyngier <maz at kernel.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose at arm.com>
---
  arch/arm64/kvm/mmu.c | 8 +++-----
  1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8b2d803e558a..7e084f22e185 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -2295,17 +2295,15 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
  		if (!vma)
  			break;

-		if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) {
-			ret = -EINVAL;
-			break;
-		}
-
  		if (vma->vm_flags & VM_PFNMAP) {
  			/* IO region dirty page logging not allowed */
  			if (new->flags & KVM_MEM_LOG_DIRTY_PAGES) {
  				ret = -EINVAL;
  				break;
  			}
+		} else if (kvm_has_mte(kvm) && !kvm_vma_mte_allowed(vma)) {
+			ret = -EINVAL;
+			break;
  		}
  		hva = min(reg_end, vma->vm_end);
  	} while (hva < reg_end);
-- 
2.34.1


Suzuki


>>>
>>> But we want to handle memslots backed by pagecache pages for virtio-shm
>>> here (virtiofs dax use case).
>>
>> Ah, I forgot about this use case. So with virtiofs DAX, does a host page
>> cache page (host VMM mmap()) get mapped directly into the guest as a
>> separate memory slot? In this case, the host vma would not have
>> VM_MTE_ALLOWED set.
>>
>>> With MTE_PERM, we can essentially skip the
>>> kvm_vma_mte_allowed(vma) check because we handle all types in the fault
>>> handler.
>>
>> This was pretty much the early behaviour when we added KVM support for
>> MTE, allow !VM_MTE_ALLOWED and trap them later. However, we disallowed
>> VM_SHARED because of some non-trivial race. Commit d89585fbb308 ("KVM:
>> arm64: unify the tests for VMAs in memslots when MTE is enabled")
>> changed this behaviour and the VM_MTE_ALLOWED check happens upfront. A
>> subsequent commit removed the VM_SHARED check.
>>
>> It's a minor ABI change but I'm trying to figure out why we needed this
>> upfront check rather than simply dropping the VM_SHARED check. Adding
>> Peter in case he remembers. I can't see any race if we simply skipped
>> this check altogether, irrespective of FEAT_MTE_PERM.
> 
> I don't see a problem with removing the upfront check. The reason I
> kept the check was IIRC just that there was already a check there and
> its logic needed to be adjusted for my VM_SHARED changes.
> 
> Peter