Message-Id: <6d2cf767-a5bb-4df4-bf9c-dcbf3bf82722@app.fastmail.com>
Date: Tue, 16 May 2023 14:53:24 -0700
From: "Andy Lutomirski" <luto@...nel.org>
To: "Rong Tao" <rtoax@...mail.com>,
"Thomas Gleixner" <tglx@...utronix.de>
Cc: "Rong Tao" <rongtao@...tc.cn>, "Ingo Molnar" <mingo@...hat.com>,
"Borislav Petkov" <bp@...en8.de>,
"Dave Hansen" <dave.hansen@...ux.intel.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86/vdso: Use non-serializing instruction rdtsc
On Mon, May 15, 2023, at 11:52 PM, Rong Tao wrote:
> From: Rong Tao <rongtao@...tc.cn>
>
> Replacing rdtscp or 'lfence;rdtsc' with the non-serializing instruction
> rdtsc can achieve a 40% performance improvement with only a small loss of
> precision.
>
> The RDTSCP instruction is not a serializing instruction, but it does wait
> until all previous instructions have executed and all previous loads are
> globally visible. The RDTSC instruction is not a serializing instruction.
> It does not necessarily wait until all previous instructions have been
> executed before reading the counter.
>
> Measure the time consumed by vDSO clock_gettime(), pseudo code:
>
>     count = 1000 * 1000 * 100;
>     while (count--)
>         clock_gettime(CLOCK_REALTIME, &ts);
>
> Comparison of time consumed:
>
> Time consumed (ns) | rdtsc_ordered() | rdtsc()   | Improvement
> -------------------+-----------------+-----------+------------
> Physical machine   | 1269147289      | 759067324 | 40%
> Guest OS (KVM)     | 1756615963      | 995823886 | 43%
>
> Signed-off-by: Rong Tao <rongtao@...tc.cn>
Out of curiosity, what happens if you apply that patch and run this thing:
https://git.kernel.org/pub/scm/linux/kernel/git/luto/misc-tests.git/tree/evil-clock-test.cc
Build it with g++ -O2 and run:
./evil-clock-test -c monotonic
--Andy