lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0e41ed2117c5cf5c97e2052aab65e39a5fef06f7.camel@redhat.com>
Date: Thu, 21 Nov 2024 22:28:26 -0500
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Vitaly Kuznetsov
 <vkuznets@...hat.com>,  kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
 Hou Wenlong <houwenlong.hwl@...group.com>, Kechen Lu <kechenl@...dia.com>,
 Oliver Upton <oliver.upton@...ux.dev>, Binbin Wu
 <binbin.wu@...ux.intel.com>, Yang Weijiang <weijiang.yang@...el.com>,
 Robert Hoo <robert.hoo.linux@...il.com>
Subject: Re: [PATCH v2 26/49] KVM: x86: Add a macro to init CPUID features
 that KVM emulates in software

On Wed, 2024-09-11 at 09:03 -0700, Sean Christopherson wrote:
> On Tue, Sep 10, 2024, Maxim Levitsky wrote:
> > On Mon, 2024-08-05 at 12:59 -0700, Sean Christopherson wrote:
> > > > And now we have:
> > > > 
> > > > kvm_cpu_cap_init_begin(CPUID_12_EAX);
> > > >  feature_scattered(SGX1);
> > > >  feature_scattered(SGX2);
> > > >  feature_scattered(SGX_EDECCSSA);
> > > > kvm_cpu_cap_init_end();
> > > 
> > > I don't love the syntax (mainly the need for a begin()+end()), but I'm a-ok
> > > getting rid of the @mask param/input.
> > > 
> > > What about making kvm_cpu_cap_init() a variadic macro, with the relevant features
> > > "unpacked" in the context of the macro?  That would avoid the need for a trailing
> > > macro, and would provide a clear indication of when/where the set of features is
> > > "initialized".
> > > 
> > > The biggest downside I see is that the last entry can't have a trailing comma,
> > > i.e. adding a new feature would require updating the previous feature too.
> > > 
> > > #define kvm_cpu_cap_init(leaf, init_features...)			\
> > > do {									\
> > > 	const struct cpuid_reg cpuid = x86_feature_cpuid(leaf * 32);	\
> > > 	const u32 __maybe_unused kvm_cpu_cap_init_in_progress = leaf;	\
> > > 	u32 kvm_cpu_cap_virtualized= 0;					\
> > > 	u32 kvm_cpu_cap_emulated = 0;					\
> > > 	u32 kvm_cpu_cap_synthesized = 0;				\
> > > 									\
> > > 	init_features;							\
> > > 									\
> > > 	kvm_cpu_caps[leaf] = kvm_cpu_cap_virtualized;			\
> > > 	kvm_cpu_caps[leaf] &= (raw_cpuid_get(cpuid) |			\
> > > 			       kvm_cpu_cap_synthesized);		\
> > > 	kvm_cpu_caps[leaf] |= kvm_cpu_cap_emulated;			\
> > > } while (0)
> > > 
> > > 	kvm_cpu_cap_init(CPUID_1_ECX,
> > > 		VIRTUALIZED_F(XMM3),
> > > 		VIRTUALIZED_F(PCLMULQDQ),
> > > 		VIRTUALIZED_F(SSSE3),
> > > 		VIRTUALIZED_F(FMA),
> > > 		VIRTUALIZED_F(CX16),
> > > 		VIRTUALIZED_F(PDCM),
> > > 		VIRTUALIZED_F(PCID),
> > > 		VIRTUALIZED_F(XMM4_1),
> > > 		VIRTUALIZED_F(XMM4_2),
> > > 		EMULATED_F(X2APIC),
> > > 		VIRTUALIZED_F(MOVBE),
> > > 		VIRTUALIZED_F(POPCNT),
> > > 		EMULATED_F(TSC_DEADLINE_TIMER),
> > > 		VIRTUALIZED_F(AES),
> > > 		VIRTUALIZED_F(XSAVE),
> > > 		// DYNAMIC_F(OSXSAVE),
> > > 		VIRTUALIZED_F(AVX),
> > > 		VIRTUALIZED_F(F16C),
> > > 		VIRTUALIZED_F(RDRAND),
> > > 		EMULATED_F(HYPERVISOR)
> > > 	);
> > 
> > Hi,
> > 
> > This is no doubt better than using '|'.
> > 
> > I still strongly prefer my version, because I don't really like the fact that
> > _F macros have side effects, and yet passed as parameters to the
> > kvm_cpu_cap_init function/macro.
> > 
> > Basically an unwritten rule, which I consider very important and because of which
> > I raised my concerns over this patch series is that if a function has side effects,
> > it should not be used as a parameter to another function, instead, it should be 
> > called explicitly on its own.
> 
> Splitting hairs to some degree, but the above suggestion is distinctly different
> than passing the _result_ of a function call as a parameter to another function.
> The actual "call" happens within the body of kvm_cpu_cap_init().

You are technically right but you use a wrong point of view: You know the implementation,
and I pretend that I don't know it, and try to look at this from the point of view
of someone who just looks a the code for the first time, e.g. to fix some bugs.

Someone who doesn't know anything about this, won't know if these are macros, cleverly
passed to another variadric macro (which is itself a feature that is not often used)

I just state the fact: a function or what looks like a function, result of which
is evaluated in expression or passed to another function (within a single statement)
should not have side effects. 
Only top level function/procedure calls allowed to have side effects - 
otherwise this is just confusing.

Let me explain this again with code:

When I see for the first time this:

result = foo(x) | bar(x);

I strongly expect both foo and bar to be pure functions with no side effects.

Or if I see this for the first time:

err = somefunc(foo(x), bar(x));

I also expect that foo and bar are pure functions,
but 'somefunc' might not be because it only returns an error code,
and it is a top level statement.

And I don't care if this is implemented with functions or macros, because it
looks the same.

This is just how my common sense works.

I won't argue though more about this, I don't want to bikeshed this and block this patch series.
If you insist, let it be, but please at least use the variadic macro.


> 
> This is effectively the same as passing a function pointer to a helper, and that
> function pointer implementation having side effects, which is quite common in the
> kernel and KVM, e.g. msr_access_t, rmap_handler_t, tdp_handler_t, gfn_handler_t,
> on_lock_fn_t, etc.
> 
> I 100% agree that it's unusual and subtle to essentially have a variable number
> of function pointers, but I don't see it as being an inherently bad pattern,
> especially since it is practically impossible to misuse _because_ the macro
> unpacks the "calls" at compile time.
> 
> IMO, the part that is most gross is the macros operating on local variables, but
> that behavior exists in all ideas we've discussed, probably because I'm pretty
> sure it's unavoidable unless we do something even worse (way, waaaaay worse).
> 
> E.g. we could add 32 versions of kvm_cpu_cap_init() that invoke pairs of parameters
> and pass in the variables
> 
>   fn1(f1, virtualized, emulated, synthesized)
>   fn2(f2, virtualized, emulated, synthesized)
>   fn3(f3, virtualized, emulated, synthesized)
>   ...
>   fnN(fN, virtualized, emulated, synthesized)
> 
> and
> 
>   kvm_cpu_cap_init19(CPUID_1_ECX,
> 	F, XMM3,
> 	F, PCLMULQDQ,
> 	F, SSE3,
> 	...
> 	EMULATED_F, HYPERVISOR
>   );

I don't think that this change is worth it, but this is still better in some sense,
because at least the user won't be able to make any assumptions about the above,
and instead will have to read the code and figure out what was done here.
It won't be easy though.

> 
> But that's beyond horrific :-)
> 
> > If you strongly prefer the variadic macro over my begin/end API, I can live with
> > that though, it is still better than '|'ing a mask with functions that have side
> > effects.



Best regards,
	Maxim Levitsky


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ