lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2870c470-06c8-aa9c-0257-3f9652a4ccd8@amd.com>
Date: Fri, 20 Sep 2024 10:45:15 +0530
From: "Nikunj A. Dadhania" <nikunj@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, thomas.lendacky@....com, bp@...en8.de,
 x86@...nel.org, kvm@...r.kernel.org, mingo@...hat.com, tglx@...utronix.de,
 dave.hansen@...ux.intel.com, pgonda@...gle.com, pbonzini@...hat.com
Subject: Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is
 available

On 9/18/2024 5:37 PM, Sean Christopherson wrote:
> On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
>> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
>>>> Signed-off-by: Nikunj A Dadhania <nikunj@....com>
>>>> Tested-by: Peter Gonda <pgonda@...gle.com>
>>>> ---
>>>>  arch/x86/kernel/kvmclock.c | 2 +-
>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>>>> index 5b2c15214a6b..3d03b4c937b9 100644
>>>> --- a/arch/x86/kernel/kvmclock.c
>>>> +++ b/arch/x86/kernel/kvmclock.c
>>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
>>>>  {
>>>>  	u8 flags;
>>>>  
>>>> -	if (!kvm_para_available() || !kvmclock)
>>>> +	if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
>>>
>>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way.  Unless
>>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
>>> is simply what's forcing the issue, but it's not actually the reason why Linux
>>> should prefer the TSC over kvmclock.  The underlying reason is that platforms that
>>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
>>> TSC is a superior timesource purely from a functionality perspective.  That it's
>>> more secure is icing on the cake.
>>
>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
>> should be disabled assuming that timesource is stable and always running?
> 
> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
> is stable, irrespective of SNP or TDX.  This is effectively already done for the
> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
> kvm_sched_clock_init() code.

The kvm-clock and tsc-early both are having the rating of 299. As they are of
same rating, kvm-clock is being picked up first.

Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
be picked up instead.

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index fafdbf813ae3..1982cee74354 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -289,7 +289,7 @@ void __init kvmclock_init(void)
 {
 	u8 flags;
 
-	if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
+	if (!kvm_para_available() || !kvmclock)
 		return;
 
 	if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE2)) {
@@ -342,7 +342,7 @@ void __init kvmclock_init(void)
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
 	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
 	    !check_tsc_unstable())
-		kvm_clock.rating = 299;
+		kvm_clock.rating = 298;
 
 	clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
 	pv_info.name = "KVM";



[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000001] kvm-clock: using sched offset of 6630881179920185 cycles
[    0.001266] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.005347] tsc: Detected 1996.247 MHz processor
[    0.263100] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398caa9ddcb, max_idle_ns: 881590739785 ns
[    0.980456] clocksource: Switched to clocksource tsc-early
[    1.059332] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398caa9ddcb, max_idle_ns: 881590739785 ns
[    1.062094] clocksource: Switched to clocksource tsc

> The other aspect of this to consider is wallclock.  If I'm reading the code
> correctly, _completely_ disabling kvmclock will case the kernel to keep using the
> RTC for wallclock.  Using the RTC is an amusingly bad decision for SNP and TDX
> (and regular VMs), as the RTC is a slooow emulation path and it's still very much
> controlled by the untrusted host.

Right, this is not expected.
 
> Unless you have a better idea for what to do with wallclock, I think the right
> approach is to come up a cleaner way to prefer TSC over kvmclock for timekeeping
> and the scheduler, but leave wallclock as-is.  And then for SNP and TDX, "assert"
> that the TSC is being used instead of kvmclock.  Presumably, all SNP and TDX
> hosts provide a stable TSC, so there's probably no reason for the guest to even
> care if the TSC is "secure".
> 
> Note, past me missed the wallclock side of things[*], so I don't think hiding
> kvmclock entirely is the best solution.
> 
> [*] https://lore.kernel.org/all/ZOjF2DMBgW%2FzVvL3@google.com

Regards
Nikunj

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ