lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a2d95399-62ad-46d3-9e48-6fa90fd2c2f3@redhat.com>
Date: Fri, 20 Dec 2024 16:42:35 +0100
From: David Hildenbrand <david@...hat.com>
To: ankita@...dia.com, jgg@...dia.com, maz@...nel.org,
 oliver.upton@...ux.dev, joey.gouly@....com, suzuki.poulose@....com,
 yuzenghui@...wei.com, catalin.marinas@....com, will@...nel.org,
 ryan.roberts@....com, shahuang@...hat.com, lpieralisi@...nel.org
Cc: aniketa@...dia.com, cjia@...dia.com, kwankhede@...dia.com,
 targupta@...dia.com, vsethi@...dia.com, acurrid@...dia.com,
 apopple@...dia.com, jhubbard@...dia.com, danw@...dia.com, zhiw@...dia.com,
 mochs@...dia.com, udhoke@...dia.com, dnigam@...dia.com,
 alex.williamson@...hat.com, sebastianene@...gle.com, coltonlewis@...gle.com,
 kevin.tian@...el.com, yi.l.liu@...el.com, ardb@...nel.org,
 akpm@...ux-foundation.org, gshan@...hat.com, linux-mm@...ck.org,
 kvmarm@...ts.linux.dev, kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v2 1/1] KVM: arm64: Allow cacheable stage 2 mapping using
 VMA flags

On 18.11.24 14:19, ankita@...dia.com wrote:
> From: Ankit Agrawal <ankita@...dia.com>
> 
> Currently KVM determines if a VMA is pointing at IO memory by checking
> pfn_is_map_memory(). However, the MM already gives us a way to tell what
> kind of memory it is by inspecting the VMA.

Do you primarily care about VM_PFNMAP/VM_MIXEDMAP VMAs, or also other 
VMA types?

> 
> This patch solves the problems where it is possible for the kernel to
> have VMAs pointing at cachable memory without causing
> pfn_is_map_memory() to be true, eg DAX memremap cases and CXL/pre-CXL
> devices. This memory is now properly marked as cachable in KVM.

Does this only imply in worse performance, or does this also affect 
correctness? I suspect performance is the problem, correct?

> 
> The pfn_is_map_memory() is restrictive and allows only for the memory
> that is added to the kernel to be marked as cacheable. In most cases
> the code needs to know if there is a struct page, or if the memory is
> in the kernel map and pfn_valid() is an appropriate API for this.
 > Extend the umbrella with pfn_valid() to include memory with no 
struct> pages for consideration to be mapped cacheable in stage 2. A 
!pfn_valid()
> implies that the memory is unsafe to be mapped as cacheable.


I do wonder, are there ways we could have a !(VM_PFNMAP/VM_MIXEDMAP) 
where kvm_is_device_pfn() == true? Are these the "DAX memremap cases and 
CXL/pre-CXL" things you describe above, or are they VM_PFNMAP/VM_MIXEDMAP?


It's worth nothing that COW VM_PFNMAP/VM_MIXEDMAP mappings are possible 
right now, where we could have anon pages mixed with PFN mappings. Of 
course, VMA pgrpot only partially apply to the anon pages (esp. caching 
attributes).

Likely you assume to never end up with COW VM_PFNMAP -- I think it's 
possible when doing a MAP_PRIVATE /dev/mem mapping on systems that allow 
for mapping /dev/mem. Maybe one could just reject such cases (if KVM PFN 
lookup code not already rejects them, which might just be that case IIRC).

> 
> Moreover take account of the mapping type in the VMA to make a decision
> on the mapping. The VMA's pgprot is tested to determine the memory type
> with the following mapping:
>   pgprot_noncached    MT_DEVICE_nGnRnE   device (or Normal_NC)
>   pgprot_writecombine MT_NORMAL_NC       device (or Normal_NC)
>   pgprot_device       MT_DEVICE_nGnRE    device (or Normal_NC)
>   pgprot_tagged       MT_NORMAL_TAGGED   RAM / Normal
>   -                   MT_NORMAL          RAM / Normal
> 
> Also take care of the following two cases that prevents the memory to
> be safely mapped as cacheable:
> 1. The VMA pgprot have VM_IO set alongwith MT_NORMAL or
>     MT_NORMAL_TAGGED. Although unexpected and wrong, presence of such
>     configuration cannot be ruled out.
> 2. Configurations where VM_MTE_ALLOWED is not set and KVM_CAP_ARM_MTE
>     is enabled. Otherwise a malicious guest can enable MTE at stage 1
>     without the hypervisor being able to tell. This could cause external
>     aborts.
> 
> Introduce a new variable noncacheable to represent whether the memory
> should not be mapped as cacheable. The noncacheable as false implies
> the memory is safe to be mapped cacheable.

Why not use ... "cacheable" ? This sentence would then read as:

"Introduce a new variable "cachable" to represent whether the memory
should be mapped as cacheable."

and maybe even could be dropped completely. :)

But maybe there is a reason for that in the code.

> Use this to handle the
> aforementioned potentially unsafe cases for cacheable mapping.
> 
> Note when FWB is not enabled, the kernel expects to trivially do
> cache management by flushing the memory by linearly converting a
> kvm_pte to phys_addr to a KVA, see kvm_flush_dcache_to_poc(). This is
> only possibile for struct page backed memory. Do not allow non-struct
> page memory to be cachable without FWB.
> 
> The device memory such as on the Grace Hopper systems is interchangeable
> with DDR memory and retains its properties. Allow executable faults
> on the memory determined as Normal cacheable.
> 
> Signed-off-by: Ankit Agrawal <ankita@...dia.com>
> Suggested-by: Catalin Marinas <catalin.marinas@....com>
> Suggested-by: Jason Gunthorpe <jgg@...dia.com>
> ---
-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ