[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 12 Dec 2011 14:49:51 -0800
From: Arun Sharma <asharma@...com>
To: Andrew Lutomirski <luto@....edu>
CC: <linux-kernel@...r.kernel.org>, Kumar Sundararajan <kumar@...com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
john stultz <johnstul@...ibm.com>
Subject: Re: [PATCH 2/2] Add a thread cpu time implementation to vDSO
On 12/12/11 12:15 PM, Andrew Lutomirski wrote:
> This seems like a neat idea, but I'm a little worried about the
> implementation. Surely someone with 4k+ cpus will complain when the
> percpu vvar data exceeds a page and crashes. I have no personal
> objection to a dynamically-sized vvar area, but it might need more
> careful handling.
I'm for disabling the vsyscall when the data doesn't fit in a couple of
pages.
>
> I also wonder if there a problem with information leaking. If there
> was an entire per-cpu vvar page (perhaps mapped differently on each
> cpu) then nothing interesting would leak. But that could add
> overhead.
>
The timing based attacks depend on the granularity of timestamps. I feel
what's available here is too coarse grained to be useful. Happy to
disable the code at compile time for those cases. Are there
CONFIG_HIGH_SECURITY type of options I could use for this purpose?
>> diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h
>> index 0fd7a4a..e36e1c1 100644
>> --- a/arch/x86/include/asm/vvar.h
>> +++ b/arch/x86/include/asm/vvar.h
>> @@ -47,5 +47,6 @@
>> DECLARE_VVAR(0, volatile unsigned long, jiffies)
>> DECLARE_VVAR(16, int, vgetcpu_mode)
>> DECLARE_VVAR(128, struct vsyscall_gtod_data, vsyscall_gtod_data)
>> +DECLARE_VVAR(2048, struct vcpu_data, vcpu_data)
>
> Any reason this isn't page-aligned? Offset 2048 seems like an odd place.
>
I meant it to be 128 + sizeof(struct vsyscall_gtod_data) so we fit
everything in one page if CONFIG_NR_CPUS is small enough.
I'll fix it up in the next rev.
>> notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
>> {
>> long ret;
>> - asm("syscall" : "=a" (ret) :
>> - "0" (__NR_clock_gettime),"D" (clock), "S" (ts) : "memory");
>> + asm volatile("syscall" : "=a" (ret) :
>> + "0" (__NR_clock_gettime),"D" (clock), "S" (ts) : "memory");
>> return ret;
>> }
>
> Huh? This should probably be a separate patch (and probably not a
> -stable candidate, since it would take amazing compiler stupidity to
> generate anything other than the obvious code). The memory clobber
> may also be enough to make this officially safe.
Yes - this should be a separate patch. gcc-4.4 likes to get rid of the
instruction in __do_thread_cpu_time without the asm volatile (in spite
of the memory clobber).
>> + if (vp->tsc_unstable) {
>> + struct timespec ts;
>> + vdso_fallback_gettime(CLOCK_THREAD_CPUTIME_ID,&ts);
>> + return timespec_to_ns(&ts);
>> + }
>
> Yuck -- another flag indicating whether we're using the tsc.
I renamed it to thread_cputime_disabled to deal with NR_CPUS > 64.
>
>> +
>> + do {
>> + native_read_tscp(&p);
>
> Do all 64-bit cpus have rdtscp? ISTR older Intel machines don't have it.
>
Yeah - I ran into this one when testing with KVM on a rdtscp capable
CPU. Will disable the vsyscall when the CPU doesn't support rdtscp.
>> --- a/arch/x86/vdso/vdso.lds.S
>> +++ b/arch/x86/vdso/vdso.lds.S
>> @@ -25,6 +25,8 @@ VERSION {
>> __vdso_getcpu;
>> time;
>> __vdso_time;
>> + thread_cpu_time;
>> + __vdso_thread_cpu_time;
>> local: *;
>> };
>> }
>
> Why do we have the non-__vdso versions? IMO anything that actually
> uses them is likely to be confused. They have the same names as the
> glibc wrappers, but the glibc wrappers have different return value and
> errno semantics. I'd say just use __vdso.
I'm not familiar with the history of __vdso_foo vs foo. Happy to get rid
of the non __vdso versions.
>
> Also, what's wrong with just adding the functionality to __vdso_clock_gettime?
>
Performance. Most of this is client dependent (whether it expects time
in nanoseconds or timespec).
Baseline: 19.34 secs
vclock_gettime: 4.74 secs
thread_cpu_time: 3.62 secs
Our use case prefers nanoseconds, so the conversion to timespec and back
adds overhead. Also, having a direct entry point vs going through the
switch in clock_gettime() adds some cycles.
I have some people here asking me if they could get CLOCK_REALTIME and
CLOCK_MONOTONIC also in nanoseconds for the same reason.
>> +struct vcpu_data {
>> + struct vpercpu_data vpercpu[NR_CPUS];
>> + unsigned int tsc_khz;
>
> I think this also duplicates some of the gtod_data stuff, at least in
> the case that TSC works for gtod.
>
vgotd.h seems to be clock source independent. This vsyscall is usable
only on systems with tsc_unstable == 0. Also, there is only one instance
of mult and shift there, whereas we save it on a per-cpu basis in the
VVAR page.
Thanks for the review and comments!
-Arun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists