Message-ID: <20210730131103.GD23756@willie-the-truck>
Date: Fri, 30 Jul 2021 14:11:03 +0100
From: Will Deacon <will@...nel.org>
To: Marc Zyngier <maz@...nel.org>
Cc: linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
qperret@...gle.com, dbrazdil@...gle.com,
Srivatsa Vaddagiri <vatsa@...eaurora.org>,
Shanker R Donthineni <sdonthineni@...dia.com>,
James Morse <james.morse@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
kernel-team@...roid.com
Subject: Re: [PATCH 07/16] KVM: arm64: Wire MMIO guard hypercalls
On Wed, Jul 28, 2021 at 11:47:20AM +0100, Marc Zyngier wrote:
> On Tue, 27 Jul 2021 19:11:46 +0100,
> Will Deacon <will@...nel.org> wrote:
> >
> > On Thu, Jul 15, 2021 at 05:31:50PM +0100, Marc Zyngier wrote:
> > > Plumb in the hypercall interface to allow a guest to discover,
> > > enroll, map and unmap MMIO regions.
> > >
> > > Signed-off-by: Marc Zyngier <maz@...nel.org>
> > > ---
> > > arch/arm64/kvm/hypercalls.c | 20 ++++++++++++++++++++
> > > include/linux/arm-smccc.h | 28 ++++++++++++++++++++++++++++
> > > 2 files changed, 48 insertions(+)
> > >
> > > diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> > > index 30da78f72b3b..a3deeb907fdd 100644
> > > --- a/arch/arm64/kvm/hypercalls.c
> > > +++ b/arch/arm64/kvm/hypercalls.c
> > > @@ -5,6 +5,7 @@
> > > #include <linux/kvm_host.h>
> > >
> > > #include <asm/kvm_emulate.h>
> > > +#include <asm/kvm_mmu.h>
> > >
> > > #include <kvm/arm_hypercalls.h>
> > > #include <kvm/arm_psci.h>
> > > @@ -129,10 +130,29 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
> > > case ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID:
> > > val[0] = BIT(ARM_SMCCC_KVM_FUNC_FEATURES);
> > > val[0] |= BIT(ARM_SMCCC_KVM_FUNC_PTP);
> > > + val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_INFO);
> > > + val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_ENROLL);
> > > + val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_MAP);
> > > + val[0] |= BIT(ARM_SMCCC_KVM_FUNC_MMIO_GUARD_UNMAP);
> > > break;
> > > case ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID:
> > > kvm_ptp_get_time(vcpu, val);
> > > break;
> > > + case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_INFO_FUNC_ID:
> > > + val[0] = PAGE_SIZE;
> > > + break;
> >
> > I get the nagging feeling that querying the stage-2 page-size outside of
> > MMIO guard is going to be useful once we start looking at memory sharing,
> > so perhaps rename this to something more generic?
>
> At this stage, why not follow the architecture and simply expose it as
> ID_AA64MMFR0_EL1.TGran{4,64,16}_2? That's exactly what it is for, and
> we already check for this in KVM itself.
Nice, I hadn't thought of that. On reflection, though, I don't agree that
it's "exactly what it is for" -- the ID register talks about the supported
stage-2 page-sizes, whereas we want to advertise the one page size that
we're currently using. In other words, it's important that we only ever
populate one of the fields and I wonder if that could bite us in future
somehow?
Up to you, you've definitely got a better feel for this than me.
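Just to illustrate what I mean by "only ever populate one of the fields",
here's a rough, self-contained sketch (the helper name and macros are made
up for this mail, not taken from the series; the field offsets and values
are the ID_AA64MMFR0_EL1.TGran{4,64,16}_2 encodings from the Arm ARM):

  #include <stdint.h>

  #define TGRAN16_2_SHIFT   32   /* ID_AA64MMFR0_EL1.TGran16_2, bits [35:32] */
  #define TGRAN64_2_SHIFT   36   /* ID_AA64MMFR0_EL1.TGran64_2, bits [39:36] */
  #define TGRAN4_2_SHIFT    40   /* ID_AA64MMFR0_EL1.TGran4_2,  bits [43:40] */
  #define TGRAN_2_NONE      0x1  /* granule not supported at stage-2 */
  #define TGRAN_2_SUPPORTED 0x2  /* granule supported at stage-2 */

  /*
   * Sketch only: build the guest-visible view of ID_AA64MMFR0_EL1 so
   * that exactly one TGranX_2 field reads "supported" (the granule we
   * are actually running stage-2 with) and the other two read "not
   * supported", never 0b0000 ("as per TGranX").
   */
  static uint64_t clamp_tgran_2(uint64_t mmfr0, unsigned long page_size)
  {
      const unsigned int shifts[] = { TGRAN4_2_SHIFT, TGRAN16_2_SHIFT,
                                      TGRAN64_2_SHIFT };
      const unsigned long sizes[] = { 4096, 16384, 65536 };
      int i;

      for (i = 0; i < 3; i++) {
          uint64_t field = (page_size == sizes[i]) ? TGRAN_2_SUPPORTED
                                                   : TGRAN_2_NONE;

          /* Wipe the field, then report only the in-use granule. */
          mmfr0 &= ~(0xfULL << shifts[i]);
          mmfr0 |= field << shifts[i];
      }
      return mmfr0;
  }

The slightly uneasy part is that we'd be deliberately reporting the other
two granules as unsupported at stage-2 even though the hardware may well
support them, which is where I wonder whether it could bite us later.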
> > > + case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_ENROLL_FUNC_ID:
> > > + set_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags);
> > > + val[0] = SMCCC_RET_SUCCESS;
> > > + break;
> > > + case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_MAP_FUNC_ID:
> > > + if (kvm_install_ioguard_page(vcpu, vcpu_get_reg(vcpu, 1)))
> > > + val[0] = SMCCC_RET_SUCCESS;
> > > + break;
> > > + case ARM_SMCCC_VENDOR_HYP_KVM_MMIO_GUARD_UNMAP_FUNC_ID:
> > > + if (kvm_remove_ioguard_page(vcpu, vcpu_get_reg(vcpu, 1)))
> > > + val[0] = SMCCC_RET_SUCCESS;
> > > + break;
> >
> > I think there's a slight discrepancy between MAP and UNMAP here in that
> > calling UNMAP on something that hasn't been mapped will fail, whereas
> > calling MAP on something that's already been mapped will succeed. I think
> > that might mean you can't reason about the final state of the page if two
> > vCPUs race to call these functions in some cases (and both succeed).
>
> I'm not sure that's the expected behaviour for ioremap(), for example
> (you can ioremap two portions of the same page successfully).
Hmm, good point. Does that mean we should be refcounting the stage-2?
Otherwise if we do something like:
  foo = ioremap(page, 0x100);
  bar = ioremap(page+0x100, 0x100);
  iounmap(foo);
then bar will break. Or did I miss something in the series?
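For the sake of argument, here's a throwaway user-space sketch of the
refcounting I have in mind (the function names and the static array are
invented for this mail, nothing to do with the actual
kvm_install/remove_ioguard_page() code; a real implementation would
presumably hang the counts off the stage-2 page-table or an xarray):

  #include <stdbool.h>
  #include <stdio.h>

  #define NR_PAGES 16

  /* Hypothetical per-IPA-page refcounts. */
  static unsigned int ioguard_refcount[NR_PAGES];

  static bool mmio_guard_map(unsigned long pfn)
  {
      if (pfn >= NR_PAGES)
          return false;
      /* Only the first mapper actually touches the stage-2. */
      if (ioguard_refcount[pfn]++ == 0)
          printf("stage-2: map MMIO page %lu\n", pfn);
      return true;
  }

  static bool mmio_guard_unmap(unsigned long pfn)
  {
      if (pfn >= NR_PAGES || !ioguard_refcount[pfn])
          return false;           /* never mapped: fail, as today */
      /* Only the last unmapper tears the page down. */
      if (--ioguard_refcount[pfn] == 0)
          printf("stage-2: unmap MMIO page %lu\n", pfn);
      return true;
  }

  int main(void)
  {
      mmio_guard_map(3);      /* foo = ioremap(page, 0x100)          */
      mmio_guard_map(3);      /* bar = ioremap(page + 0x100, 0x100)  */
      mmio_guard_unmap(3);    /* iounmap(foo): page stays mapped     */
      mmio_guard_unmap(3);    /* last user gone: page unmapped       */
      return 0;
  }

With something like that, the second ioremap() just bumps the count,
iounmap(foo) leaves the page mapped for bar, and only the last unmap
actually touches the stage-2.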
Will