[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87v8vj1pfl.wl-maz@kernel.org>
Date: Fri, 08 Apr 2022 17:59:42 +0100
From: Marc Zyngier <maz@...nel.org>
To: Raghavendra Rao Ananta <rananta@...gle.com>
Cc: Andrew Jones <drjones@...hat.com>,
James Morse <james.morse@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Paolo Bonzini <pbonzini@...hat.com>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Peter Shier <pshier@...gle.com>,
Ricardo Koller <ricarkol@...gle.com>,
Oliver Upton <oupton@...gle.com>,
Reiji Watanabe <reijiw@...gle.com>,
Jing Zhang <jingzhangos@...gle.com>,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH v5 02/10] KVM: arm64: Setup a framework for hypercall bitmap firmware registers
On Thu, 07 Apr 2022 18:24:14 +0100,
Raghavendra Rao Ananta <rananta@...gle.com> wrote:
>
> Hi Marc,
>
> > > +#define KVM_REG_ARM_STD_BIT_TRNG_V1_0 BIT(0)
> >
> > I'm really in two minds about this. Having one bit per service is easy
> > from an implementation perspective, but is also means that this
> > disallow fine grained control over which hypercalls are actually
> > available. If tomorrow TRNG 1.1 adds a new hypercall and that KVM
> > implements both, how does the selection mechanism works? You will
> > need a version selector (a la PSCI), which defeats this API somehow
> > (and renders the name of the #define invalid).
> >
> > I wonder if a more correct way to look at this is to enumerate the
> > hypercalls themselves (all 5 of them), though coming up with an
> > encoding is tricky (RNG32 and RNG64 would clash, for example).
> >
> > Thoughts?
> >
> I was on the fence about this too. The TRNG spec (ARM DEN 0098,
> Table-4) mentions that v1.0 should have VERSION, FEATURES, GET_UUID,
> and RND as mandatory features. Hence, if KVM advertised that it
> supports TRNG v1.0, I thought it would be best to expose all or
> nothing of v1.0 by guarding them with a single bit.
> Broadly, the idea is to have a bit per version. If v1.1 comes along,
> we can have another bit for that. If it's not too ugly to implement,
> we can be a little more aggressive and ensure that userspace doesn't
> enable v1.1 without enabling v1.0.
OK, that'd be assuming that we'll never see a service where version A
is incompatible with version B and that we have to exclude one or the
other. Meh. Let's cross that bridge once it is actually built.
[...]
> > > + mutex_lock(&kvm->lock);
> > > +
> > > + /*
> > > + * If the VM (any vCPU) has already started running, return success
> > > + * if there's no change in the value. Else, return -EBUSY.
> >
> > No, this should *always* fail if a vcpu has started. Otherwise, you
> > start allowing hard to spot races.
> >
> The idea came from the fact that userspace could spawn multiple
> threads to configure the vCPU registers. Since we don't have the
> VM-scoped registers yet, it may be possible that userspace has issued
> a KVM_RUN on one of the vCPU, while the others are lagging behind and
> still configuring the registers. The slower threads may see -EBUSY and
> could panic. But if you feel that it's an overkill and the userspace
> should deal with it, we can return EBUSY for all writes after KVM_RUN.
I'd rather have that. There already is stuff that rely on things not
changing once a vcpu has run, so I'd rather be consistent.
>
> > > + */
> > > + if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags)) {
> > > + ret = *fw_reg_bmap != val ? -EBUSY : 0;
> > > + goto out;
> > > + }
> > > +
> > > + WRITE_ONCE(*fw_reg_bmap, val);
> >
> > I'm not sure what this WRITE_ONCE guards against. Do you expect
> > concurrent reads at this stage?
> >
> Again, the assumption here is that userspace could have multiple
> threads reading and writing to these registers. Without the VM scoped
> registers in place, we may end up with a read/write to the same memory
> location for all the vCPUs.
We only have one vcpu updating this at any given time (that's what the
lock ensures). A simple write should be OK, as far as I can tell.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Powered by blists - more mailing lists