[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ef194c25-22d8-204e-ffb6-8f9f0a0621fb@amd.com>
Date: Wed, 25 Sep 2024 14:23:15 +0530
From: "Nikunj A. Dadhania" <nikunj@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: linux-kernel@...r.kernel.org, thomas.lendacky@....com, bp@...en8.de,
x86@...nel.org, kvm@...r.kernel.org, mingo@...hat.com, tglx@...utronix.de,
dave.hansen@...ux.intel.com, pgonda@...gle.com, pbonzini@...hat.com,
peterz@...radead.org, gautham.shenoy@....com
Subject: Re: [PATCH v11 19/20] x86/kvmclock: Skip kvmclock when Secure TSC is
available
On 9/20/2024 2:24 PM, Nikunj A. Dadhania wrote:
> On 9/20/2024 12:51 PM, Sean Christopherson wrote:
>> On Fri, Sep 20, 2024, Nikunj A. Dadhania wrote:
>>> On 9/18/2024 5:37 PM, Sean Christopherson wrote:
>>>> On Mon, Sep 16, 2024, Nikunj A. Dadhania wrote:
>>>>> On 9/13/2024 11:00 PM, Sean Christopherson wrote:
>>>>>>> Signed-off-by: Nikunj A Dadhania <nikunj@....com>
>>>>>>> Tested-by: Peter Gonda <pgonda@...gle.com>
>>>>>>> ---
>>>>>>> arch/x86/kernel/kvmclock.c | 2 +-
>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
>>>>>>> index 5b2c15214a6b..3d03b4c937b9 100644
>>>>>>> --- a/arch/x86/kernel/kvmclock.c
>>>>>>> +++ b/arch/x86/kernel/kvmclock.c
>>>>>>> @@ -289,7 +289,7 @@ void __init kvmclock_init(void)
>>>>>>> {
>>>>>>> u8 flags;
>>>>>>>
>>>>>>> - if (!kvm_para_available() || !kvmclock)
>>>>>>> + if (!kvm_para_available() || !kvmclock || cc_platform_has(CC_ATTR_GUEST_SECURE_TSC))
>>>>>>
>>>>>> I would much prefer we solve the kvmclock vs. TSC fight in a generic way. Unless
>>>>>> I've missed something, the fact that the TSC is more trusted in the SNP/TDX world
>>>>>> is simply what's forcing the issue, but it's not actually the reason why Linux
>>>>>> should prefer the TSC over kvmclock. The underlying reason is that platforms that
>>>>>> support SNP/TDX are guaranteed to have a stable, always running TSC, i.e. that the
>>>>>> TSC is a superior timesource purely from a functionality perspective. That it's
>>>>>> more secure is icing on the cake.
>>>>>
>>>>> Are you suggesting that whenever the guest is either SNP or TDX, kvmclock
>>>>> should be disabled assuming that timesource is stable and always running?
>>>>
>>>> No, I'm saying that the guest should prefer the raw TSC over kvmclock if the TSC
>>>> is stable, irrespective of SNP or TDX. This is effectively already done for the
>>>> timekeeping base (see commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
>>>> invariant TSC is exposed")), but the scheduler still uses kvmclock thanks to the
>>>> kvm_sched_clock_init() code.
>>>
>>> The kvm-clock and tsc-early both are having the rating of 299. As they are of
>>> same rating, kvm-clock is being picked up first.
>>>
>>> Is it fine to drop the clock rating of kvmclock to 298 ? With this tsc-early will
>>> be picked up instead.
>>
>> IMO, it's ugly, but that's a problem with the rating system inasmuch as anything.
>>
>> But the kernel will still be using kvmclock for the scheduler clock, which is
>> undesirable.
>
> Agree, kvm_sched_clock_init() is still being called. The above hunk was to use
> tsc-early/tsc as the clocksource and not kvm-clock.
How about the below patch:
From: Nikunj A Dadhania <nikunj@....com>
Date: Tue, 28 Nov 2023 18:29:56 +0530
Subject: [RFC PATCH] x86/kvmclock: Prefer invariant TSC as the clocksource and
scheduler clock
For platforms that support stable and always running TSC, although the
kvm-clock rating is dropped to 299 to prefer TSC, the guest scheduler clock
still keeps on using the kvm-clock which is undesirable. Moreover, as the
kvm-clock and early-tsc clocksource are both registered with 299 rating,
kvm-clock is being picked up momentarily instead of selecting more stable
tsc-early clocksource.
kvm-clock: Using msrs 4b564d01 and 4b564d00
kvm-clock: using sched offset of 1799357702246960 cycles
clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
tsc: Detected 1996.249 MHz processor
clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
clocksource: Switched to clocksource kvm-clock
clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cadd9d93, max_idle_ns: 881590552906 ns
clocksource: Switched to clocksource tsc
Drop the kvm-clock rating to 298, so that tsc-early is picked up before
kvm-clock and use TSC for scheduler clock as well when the TSC is invariant
and stable.
Signed-off-by: Nikunj A Dadhania <nikunj@....com>
---
The issue we see here is that on bare-metal if the TSC is marked unstable,
then the sched-clock will fall back to jiffies. In the virtualization case,
do we want to fall back to kvm-clock when TSC is marked unstable?
---
arch/x86/kernel/kvmclock.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 5b2c15214a6b..c997b2628c4b 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -317,9 +317,6 @@ void __init kvmclock_init(void)
if (kvm_para_has_feature(KVM_FEATURE_CLOCKSOURCE_STABLE_BIT))
pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT);
- flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
- kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
-
x86_platform.calibrate_tsc = kvm_get_tsc_khz;
x86_platform.calibrate_cpu = kvm_get_tsc_khz;
x86_platform.get_wallclock = kvm_get_wallclock;
@@ -341,8 +338,12 @@ void __init kvmclock_init(void)
*/
if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
- !check_tsc_unstable())
- kvm_clock.rating = 299;
+ !check_tsc_unstable()) {
+ kvm_clock.rating = 298;
+ } else {
+ flags = pvclock_read_flags(&hv_clock_boot[0].pvti);
+ kvm_sched_clock_init(flags & PVCLOCK_TSC_STABLE_BIT);
+ }
clocksource_register_hz(&kvm_clock, NSEC_PER_SEC);
pv_info.name = "KVM";
--
2.34.1
Powered by blists - more mailing lists