lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <690bd404204106fc17d465e2fdb9be8863767544.camel@redhat.com>
Date:   Tue, 07 Nov 2023 20:12:11 +0200
From:   Maxim Levitsky <mlevitsk@...hat.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Yang Weijiang <weijiang.yang@...el.com>, pbonzini@...hat.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        dave.hansen@...el.com, peterz@...radead.org, chao.gao@...el.com,
        rick.p.edgecombe@...el.com, john.allen@....com
Subject: Re: [PATCH v6 19/25] KVM: VMX: Emulate read and write to CET MSRs

On Thu, 2023-11-02 at 16:58 -0700, Sean Christopherson wrote:
> On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> > On Wed, 2023-11-01 at 09:31 -0700, Sean Christopherson wrote:
> > > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > > > Add emulation interface for CET MSR access. The emulation code is split
> > > > > into common part and vendor specific part. The former does common check
> > > > > for MSRs and reads/writes directly from/to XSAVE-managed MSRs via the
> > > > > helpers while the latter accesses the MSRs linked to VMCS fields.
> > > > > 
> > > > > Signed-off-by: Yang Weijiang <weijiang.yang@...el.com>
> > > > > ---
> > > 
> > > ...
> > > 
> > > > > +	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > > > > +	case MSR_KVM_SSP:
> > > > > +		if (host_msr_reset && kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> > > > > +			break;
> > > > > +		if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
> > > > > +			return 1;
> > > > > +		if (index == MSR_KVM_SSP && !host_initiated)
> > > > > +			return 1;
> > > > > +		if (is_noncanonical_address(data, vcpu))
> > > > > +			return 1;
> > > > > +		if (index != MSR_IA32_INT_SSP_TAB && !IS_ALIGNED(data, 4))
> > > > > +			return 1;
> > > > > +		break;
> > > > Once again I'll prefer to have an ioctl for setting/getting SSP, this will
> > > > make the above code simpler (e.g there will be no need to check that write
> > > > comes from the host/etc).
> > > 
> > > I don't think an ioctl() would be simpler overall, especially when factoring in
> > > userspace.  With a synthetic MSR, we get the following quite cheaply:
> > > 
> > >  1. Enumerating support to userspace.
> > >  2. Save/restore of the value, e.g. for live migration.
> > >  3. Vendor hooks for propagating values to/from the VMCS/VMCB.
> > > 
> > > For an ioctl(), 
> > > #1 would require a capability, #2 (and #1 to some extent) would
> > > require new userspace flows, and #3 would require new kvm_x86_ops hooks.
> > > 
> > > The synthetic MSR adds a small amount of messiness, as does bundling 
> > > MSR_IA32_INT_SSP_TAB with the other shadow stack MSRs.  The bulk of the mess comes
> > > from the need to allow userspace to write '0' when KVM enumerated supported to
> > > userspace.
> > 
> > Let me put it this way - all hacks start like that, and in this case this is API/ABI hack
> > so we will have to live with it forever.
> 
> Eh, I don't view it as a hack, at least the kind of hack that has a negative
> connotation.  KVM effectively has ~240 MSR indices reserved for whatever KVM
> wants. 
This is exactly the problem. These indices are reserved for PV features, not
for fake msrs, and my fear is that once we mix it up, it will be a mess.

If that was not API/ABI, I wouldn't complain, but since this is API/ABI, I'm afraid
to make a mistake and then be sorry.


>  The only weird thing about this one is that it's not accessible from the
> guest.  Which I agree is quite weird, but from a code perspective I think it
> works quite well.
> 
> > Once there is a precedent, trust me there will be 10s of new 'fake' msrs added, and the
> > interface will become one big mess.
> 
> That suggests MSRs aren't already one big mess. :-)  I'm somewhat joking, but also
> somewhat serious.  I really don't think that adding one oddball synthetic MSR is
> going to meaningfully move the needle on the messiness of MSRs.
> 
> Hmm, there probably is a valid slippery slope argument though.  As you say, at
> some point, enough state will get shoved into hardware that KVM will need an ever
> growing number of synthetic MSRs to keep pace.

Yes, exactly what I mean - Honestly though I don't expect many new x86 registers/states
that are not msrs, but we don't know what x86 designers will do next,
and APIs are something that can't be fixed later.

> 
> > As I suggested, if you don't want to add new capability/ioctl and vendor
> > callback per new x86 arch register, then let's implement
> > KVM_GET_ONE_REG/KVM_SET_ONE_REG and then it will be really easy to add new
> > regs without confusing users, and without polluting msr namespace with msrs
> > that don't exist.
> 
> I definitely don't hate the idea of KVM_{G,S}ET_ONE_REG, what I don't want is to
> have an entirely separate path in KVM for handling the actual get/set.
> 
> What if we combine the approaches?  Add KVM_{G,S}ET_ONE_REG support so that the
> uAPI can use completely arbitrary register indices without having to worry about
> polluting the MSR space and making MSR_KVM_SSP ABI.
Sounds like a reasonable idea but might be overkill.

> 
> Ooh, if we're clever, I bet we can extend KVM_{G,S}ET_ONE_REG to also work with
> existing MSRs, GPRs, and other stuff,

Not sure if we want to make it work with MSRs. MSRs are a very well defined thing
on x86, and we already have an API to read/write them. Other registers maybe,
don't know.

>  i.e. not force userspace through the funky
> KVM_SET_MSRS just to set one reg, and not force a RMW of all GPRs just to set
> RIP or something.
Setting one GPR like RIP does sound like a valid use case of KVM_SET_ONE_REG.

>   E.g. use bits 39:32 of the id to encode the register class,
> bits 31:0 to hold the index within a class, and reserve bits 63:40 for future
> usage.
> 
> Then for KVM-defined registers, we can route them internally as needed, e.g. we
> can still define MSR_KVM_SSP so that internal it's treated like an MSR, but its
> index isn't ABI and so can be changed at will.  And future KVM-defined registers
> wouldn't _need_ to be treated like MSRs, i.e. we could route registers through
> the MSR APIs if and only if it makes sense to do so.

I am not sure that even internally I'll treat MSR_KVM_SSP as MSR. 
An MSR IMHO is a msr, a register is a register, mixing this up will
just add to the confusion.


Honestly if I were to add support for the SSP register, I'll just add a new
ioctl/capability and vendor callback. All of this code is just harmless
boilerplate code.
Even using KVM_GET_ONE_REG/KVM_SET_ONE_REG is probably overkill, although using
it for new registers is reasonable.

At the end I am not going to argue much about this - I just voiced my option that currently
MSR read/write interface is pure in regard that it only works on either real msrs or
at least PV msrs that the guest can read/write. 

All other guest's state is set via separate ioctls/callbacks/etc, and thus it's more consistent
from API POV to add SSP here.


> 
> Side topic, why on earth is the data value of kvm_one_reg "addr"?

I don't know, probably something ARM related.

> 


Best regards,
	Maxim Levitsky

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ