Message-ID: <20210407151458.GC21451@arm.com>
Date: Wed, 7 Apr 2021 16:14:58 +0100
From: Catalin Marinas <catalin.marinas@....com>
To: Steven Price <steven.price@....com>
Cc: David Hildenbrand <david@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Peter Maydell <peter.maydell@...aro.org>,
"Dr. David Alan Gilbert" <dgilbert@...hat.com>,
Andrew Jones <drjones@...hat.com>, Haibo Xu <Haibo.Xu@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
qemu-devel@...gnu.org, Marc Zyngier <maz@...nel.org>,
Juan Quintela <quintela@...hat.com>,
Richard Henderson <richard.henderson@...aro.org>,
linux-kernel@...r.kernel.org, Dave Martin <Dave.Martin@....com>,
James Morse <james.morse@....com>,
linux-arm-kernel@...ts.infradead.org,
Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will@...nel.org>, kvmarm@...ts.cs.columbia.edu,
Julien Thierry <julien.thierry.kdev@...il.com>
Subject: Re: [PATCH v10 2/6] arm64: kvm: Introduce MTE VM feature

On Wed, Apr 07, 2021 at 11:20:18AM +0100, Steven Price wrote:
> On 31/03/2021 19:43, Catalin Marinas wrote:
> > When a slot is added by the VMM, if it asked for MTE in guest (I guess
> > that's an opt-in by the VMM, haven't checked the other patches), can we
> > reject it if it is going to be mapped as Normal Cacheable but it is a
> > ZONE_DEVICE (i.e. !kvm_is_device_pfn() + one of David's suggestions to
> > check for ZONE_DEVICE)? This way we don't need to do more expensive
> > checks in set_pte_at().
>
> The problem is that KVM allows the VMM to change the memory backing a slot
> while the guest is running. This is obviously useful for the likes of
> migration, but ultimately means that even if you were to do checks at the
> time of slot creation, you would need to repeat the checks at set_pte_at()
> time to ensure a mischievous VMM didn't swap the page for a problematic one.

Does changing the slot require some KVM API call? Can we intercept it
and do the checks there?

Maybe a better alternative for the time being is to add a new
kvm_is_zone_device_pfn() and force KVM_PGTABLE_PROT_DEVICE if it returns
true _and_ the VMM asked for MTE in guest. We can then only set
PG_mte_tagged if !device.

We can later relax this further to Normal Non-cacheable for ZONE_DEVICE
memory (via a new KVM_PGTABLE_PROT_NORMAL_NC) or even Normal Cacheable
if we manage to change the behaviour of the architecture.
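
To make the kvm_is_zone_device_pfn() proposal above concrete, here is a
rough userspace model of the decision, not kernel code. Everything in it
is an assumption made for illustration: the enum, the structs and
decide_mapping() are hypothetical names, the booleans merely stand in for
kvm_is_device_pfn(), the not-yet-existing kvm_is_zone_device_pfn() and
the VMM's MTE opt-in, and the real user_mem_abort() path has far more
state to deal with.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the stage 2 memory types discussed above. */
enum stage2_memtype {
	MEMTYPE_NORMAL_CACHEABLE,
	MEMTYPE_DEVICE,			/* models KVM_PGTABLE_PROT_DEVICE */
};

/* Hypothetical summary of what the fault handler knows about a pfn. */
struct fault_info {
	bool is_device_pfn;		/* models kvm_is_device_pfn() */
	bool is_zone_device_pfn;	/* models the proposed kvm_is_zone_device_pfn() */
	bool vm_has_mte;		/* the VMM asked for MTE in the guest */
};

/* What the handler would do with the page. */
struct fault_decision {
	enum stage2_memtype memtype;
	bool set_mte_tagged;		/* whether PG_mte_tagged would be set */
};

static struct fault_decision decide_mapping(const struct fault_info *fi)
{
	struct fault_decision d;

	/*
	 * Existing device pfns stay Device. ZONE_DEVICE pfns are forced to
	 * Device only when the VMM asked for MTE in the guest.
	 */
	bool device = fi->is_device_pfn ||
		      (fi->vm_has_mte && fi->is_zone_device_pfn);

	d.memtype = device ? MEMTYPE_DEVICE : MEMTYPE_NORMAL_CACHEABLE;

	/* Only pages mapped as Normal memory in an MTE guest get tagged. */
	d.set_mte_tagged = fi->vm_has_mte && !device;

	return d;
}
```

In this model a guest without MTE keeps the current behaviour for
ZONE_DEVICE memory (Normal Cacheable, no PG_mte_tagged), so only VMMs
that opt in to MTE would see a change.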
> > We could add another PROT_TAGGED or something which means PG_mte_tagged
> > set but still mapped as Normal Untagged. It's just that we are short of
> > pte bits for another flag.
>
> That could help here - although it's slightly odd as you're asking the
> kernel to track the tags, but not allowing user space (direct) access to
> them. Like you say, using up the precious bits for this seems like it might
> be short-sighted.

Yeah, let's scrap this idea. We set PG_mte_tagged in user_mem_abort(),
so we already know it's a page potentially containing tags. On
restoring from swap, we need to check for MTE metadata irrespective of
whether the user pte is tagged or not, as you already did in patch 1.
I'll get back to that and look at the potential races.

BTW, after a page is restored from swap, how long do we keep the
metadata around? I think we can delete it as soon as it was restored and
PG_mte_tagged was set. Currently it looks like we only do this when the
actual page was freed or swapoff. I haven't convinced myself that it's
safe to do this for swapoff unless it guarantees that all the ptes
sharing a page have been restored.

--
Catalin