Date:   Wed, 21 Aug 2019 10:54:28 +0200
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Michael Kelley <mikelley@...rosoft.com>,
        Tianyu Lan <lantianyu1986@...il.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Tianyu Lan <Tianyu.Lan@...rosoft.com>,
        "linux-arch\@vger.kernel.org" <linux-arch@...r.kernel.org>,
        "linux-hyperv\@vger.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-kernel\@vger kernel org" <linux-kernel@...r.kernel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        the arch/x86 maintainers <x86@...nel.org>,
        KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Sasha Levin <sashal@...nel.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Arnd Bergmann <arnd@...db.de>
Subject: RE: [PATCH 0/2] clocksource/Hyper-V: Add Hyper-V specific sched clock function

Vitaly Kuznetsov <vkuznets@...hat.com> writes:

> Michael Kelley <mikelley@...rosoft.com> writes:
>
>> I talked to KY Srinivasan for any history about TSC page on 32-bit.  He said
>> there was no technical reason not to implement it, but our focus was always
>> 64-bit Linux, so the 32-bit was much less important.  Also, on 32-bit Linux,
>> the required 64x64 multiply and shift is more complex and takes
>> more cycles (compare 32-bit implementation of mul_u64_u64_shr vs.
>> the 64-bit implementation), so the win over a MSR read is less.  I
>> don't know of any actual measurements being made to compare vs.
>> MSR read.
>
> VMExit is 1000 CPU cycles or so, I would guess that TSC page
> calculations are better. Let me try to build 32bit kernel and do some
> quick measurements.
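
For reference, the multiply-and-shift being discussed can be sketched in portable C. This is only an illustration, not the kernel's mul_u64_u64_shr(): the names mul_u64_u64_hi() and hv_ref_time() are made up here, and the formula ((tsc * scale) >> 64) + offset is assumed from the Hyper-V reference TSC page definition. It uses only 32x32->64 multiplies, which is roughly the work a 32-bit build has to do.

```c
#include <stdint.h>

/*
 * High 64 bits of a 64x64-bit product, built from four 32x32->64
 * multiplies (hypothetical helper, not the kernel implementation).
 */
static uint64_t mul_u64_u64_hi(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t ll = a_lo * b_lo;
    uint64_t lh = a_lo * b_hi;
    uint64_t hl = a_hi * b_lo;
    uint64_t hh = a_hi * b_hi;

    /* Sum of everything that lands in bits 32..63; cannot overflow. */
    uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

    /* Bits 64..127: high partial product plus carries from below. */
    return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}

/*
 * Assumed TSC page conversion: ((tsc * scale) >> 64) + offset.
 * Parameter names are illustrative only.
 */
static uint64_t hv_ref_time(uint64_t tsc, uint64_t scale, uint64_t offset)
{
    return mul_u64_u64_hi(tsc, scale) + offset;
}
```

On a 64-bit build the compiler can usually do the same thing with a single widening multiply, which is why the 32-bit variant costs more cycles.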

So I tried and the difference is HUGE.

For in-kernel clocksource reads (like sched_clock()), the testing code
was:

        before = rdtsc_ordered();
        for (i = 0; i < 1000; i++)
            (void)read_hv_sched_clock_msr();
        after = rdtsc_ordered();
        printk("MSR based clocksource: %u cycles\n", ((u32)(after - before))/1000);

        before = rdtsc_ordered();
        for (i = 0; i < 1000; i++)
            (void)read_hv_sched_clock_tsc();
        after = rdtsc_ordered();
        printk("TSC page clocksource: %u cycles\n", ((u32)(after - before))/1000);

The result (WS2016) is:
[    1.101910] MSR based clocksource: 3361 cycles
[    1.105224] TSC page clocksource: 49 cycles

For userspace reads the absolute difference is even bigger, as the TSC
page gives us a functional vDSO:

Testing code:
	before = rdtsc();
	for (i = 0; i < COUNT; i++)
		clock_gettime(CLOCK_REALTIME, &tp);
	after = rdtsc();
	printf("%llu\n", (unsigned long long)(after - before)/COUNT);
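
The userspace snippet above can be fleshed out into something self-contained along these lines. This is a sketch, not the actual gettime_cycles source: the helper names read_ticks() and gettime_avg_ticks() are invented, __rdtsc() from <x86intrin.h> stands in for rdtsc(), and the non-x86 fallback (which counts nanoseconds, not cycles) is only there so the code builds elsewhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Raw TSC read; not serialized, which is fine for a rough average. */
static inline uint64_t read_ticks(void)
{
    return __rdtsc();
}
#else
/* Fallback for non-x86 builds: nanoseconds from CLOCK_MONOTONIC. */
static inline uint64_t read_ticks(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}
#endif

/* Average ticks per clock_gettime(CLOCK_REALTIME) call over count calls. */
static uint64_t gettime_avg_ticks(unsigned int count)
{
    struct timespec tp;
    uint64_t before, after;
    unsigned int i;

    assert(count > 0);

    before = read_ticks();
    for (i = 0; i < count; i++)
        clock_gettime(CLOCK_REALTIME, &tp);
    after = read_ticks();

    return (after - before) / count;
}
```

A main() would just print gettime_avg_ticks(COUNT) using PRIu64 from <inttypes.h>. Whether the call goes through the vDSO or falls back to a syscall depends on the active clocksource, which is exactly the difference being measured here.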

Result:

TSC page:
# ./gettime_cycles 
131

MSR:
# ./gettime_cycles 
5664

With all that, I see no reason for us not to enable the TSC page on
32bit. Even if the number of users is negligible, this will allow us to
get rid of the ugly #ifdef CONFIG_HYPERV_TSCPAGE in the code.

I'll send a patch for discussion.

-- 
Vitaly
