Message-ID: <86y0wrlrxt.wl-maz@kernel.org>
Date: Wed, 26 Mar 2025 15:42:06 +0000
From: Marc Zyngier <maz@...nel.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Ankit Agrawal <ankita@...dia.com>,
	Catalin Marinas <catalin.marinas@....com>,
	Jason Gunthorpe <jgg@...dia.com>,
	Oliver Upton <oliver.upton@...ux.dev>,
	"joey.gouly@....com" <joey.gouly@....com>,
	"suzuki.poulose@....com" <suzuki.poulose@....com>,
	"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
	"will@...nel.org" <will@...nel.org>,
	"ryan.roberts@....com" <ryan.roberts@....com>,
	"shahuang@...hat.com" <shahuang@...hat.com>,
	"lpieralisi@...nel.org" <lpieralisi@...nel.org>,
	"david@...hat.com" <david@...hat.com>,
	Aniket Agashe <aniketa@...dia.com>,
	Neo Jia <cjia@...dia.com>,
	Kirti Wankhede <kwankhede@...dia.com>,
	"Tarun Gupta (SW-GPU)" <targupta@...dia.com>,
	Vikram Sethi <vsethi@...dia.com>,
	Andy Currid <acurrid@...dia.com>,
	Alistair Popple <apopple@...dia.com>,
	John Hubbard <jhubbard@...dia.com>,
	Dan Williams <danw@...dia.com>,
	Zhi Wang <zhiw@...dia.com>,
	Matt Ochs <mochs@...dia.com>,
	Uday Dhoke <udhoke@...dia.com>,
	Dheeraj Nigam <dnigam@...dia.com>,
	Krishnakant Jaju <kjaju@...dia.com>,
	"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
	"sebastianene@...gle.com" <sebastianene@...gle.com>,
	"coltonlewis@...gle.com" <coltonlewis@...gle.com>,
	"kevin.tian@...el.com" <kevin.tian@...el.com>,
	"yi.l.liu@...el.com" <yi.l.liu@...el.com>,
	"ardb@...nel.org" <ardb@...nel.org>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"gshan@...hat.com" <gshan@...hat.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"ddutile@...hat.com" <ddutile@...hat.com>,
	"tabba@...gle.com" <tabba@...gle.com>,
	"qperret@...gle.com" <qperret@...gle.com>,
	"kvmarm@...ts.linux.dev" <kvmarm@...ts.linux.dev>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v3 1/1] KVM: arm64: Allow cacheable stage 2 mapping using VMA flags

On Wed, 26 Mar 2025 14:53:34 +0000,
Sean Christopherson <seanjc@...gle.com> wrote:
> 
> On Wed, Mar 26, 2025, Ankit Agrawal wrote:
> > > On Wed, Mar 19, 2025 at 04:22:46PM -0300, Jason Gunthorpe wrote:
> > > > On Wed, Mar 19, 2025 at 06:11:02PM +0000, Catalin Marinas wrote:
> > > > > On Wed, Mar 19, 2025 at 02:04:29PM -0300, Jason Gunthorpe wrote:
> > > > > > On Wed, Mar 19, 2025 at 12:01:29AM -0700, Oliver Upton wrote:
> > > > > > > You have a very good point that KVM is broken for cacheable PFNMAP'd
> > > > > > > crap since we demote to something non-cacheable, and maybe that
> > > > > > > deserves fixing first. Hopefully nobody notices that we've taken away
> > > > > > > the toys...
> > > > > >
> > > > > > Fixing it is either faulting all access attempts or mapping it
> > > > > > cachable to the S2 (as this series is trying to do)..
> > > > >
> > > > > As I replied earlier, it might be worth doing both - fault on !FWB
> > > > > hardware (or rather reject the memslot creation), cacheable S2
> > > > > otherwise.
> > > >
> > > > I have no objection, Ankit are you able to make a failure patch?
> > >
> > > I'd wait until the KVM maintainers have their say.
> > > 
> > 
> > Maz, Oliver any thoughts on this? Can we conclude to create this failure
> > patch in memslot creation?
> 
> That's not sufficient.  As pointed out multiple times in this thread, any checks
> done at memslot creation are best effort "courtesies" provided to userspace to
> avoid terminating running VMs when the memory is faulted in.
> 
> I.e. checking at memslot creation is optional, checking at fault-in/mapping is
> not.
> 
> With that in place, I don't see any need for a memslot flag.  IIUC, without FWB,
> cacheable pfn-mapped memory is broken and needs to be disallowed.  But with FWB,
> KVM can simply honor the cacheability based on the VMA.  Neither of those requires

Remind me how this works with stuff such as guest_memfd, which, by
definition, doesn't have a userspace mapping?

> a memslot flag.  A KVM capability to enumerate FWB support would be nice though,
> e.g. so userspace can assert and bail early without ever hitting an
> ioctl error.

It's not "nice". It's mandatory. And FWB is definitely *not* something
we want to expose as such.

> 
> If we want to support existing setups that happen to work by dumb luck or careful
> configuration, then that should probably be an admin decision to support the
> "unsafe" behavior, i.e. an off-by-default KVM module param, not a memslot flag.

No. That's not how we handle an ABI issue. VM migration, with and
without FWB, can happen in both directions, and must have clear
semantics. So NAK to a kernel parameter.

If I have a VM with a device mapped as *device* on an FWB host, I must
be able to migrate it to a non-FWB host, and back. A device mapped as
*cacheable* can only be migrated between FWB-capable hosts.

Importantly, it is *userspace* that is in charge of deciding how the
device is mapped at S2. And the memslot flag is the correct
abstraction for that.
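
To make the migration rule above concrete, here is a minimal sketch in
plain C. It is not kernel code and not a proposed ABI: the names
s2_map_type, host_caps, has_fwb, host_supports() and can_migrate() are
all hypothetical, and has_fwb only models a host-side property for the
sake of the example, not something exposed to userspace as such.

```c
#include <stdbool.h>

/* Hypothetical: the S2 mapping type userspace chose for the device. */
enum s2_map_type {
	S2_MAP_DEVICE,
	S2_MAP_CACHEABLE,
};

/* Hypothetical host description; has_fwb is an illustrative field. */
struct host_caps {
	bool has_fwb;
};

/* A device mapping works on any host; a cacheable one needs FWB. */
static bool host_supports(const struct host_caps *h, enum s2_map_type t)
{
	return t == S2_MAP_DEVICE || h->has_fwb;
}

/* Migration is only valid if both ends can honour the mapping type. */
static bool can_migrate(const struct host_caps *src,
			const struct host_caps *dst,
			enum s2_map_type t)
{
	return host_supports(src, t) && host_supports(dst, t);
}
```

With this model, a *device* mapping migrates in either direction
between FWB and non-FWB hosts, while a *cacheable* mapping is rejected
as soon as one end lacks FWB.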

	M.

-- 
Without deviation from the norm, progress is not possible.
