linux-kernel - Re: [RFC] Fast assurate clock readable from user space and NMI handler


Date:	Mon, 26 Feb 2007 22:54:56 -0500
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	Daniel Walker <dwalker@...sta.com>
Cc:	mbligh@...gle.com, linux-kernel@...r.kernel.org,
	johnstul@...ibm.com, mingo@...e.hu
Subject: Re: [RFC] Fast assurate clock readable from user space and NMI handler

* Daniel Walker (dwalker@...sta.com) wrote:
> On Mon, 2007-02-26 at 17:14 -0500, Mathieu Desnoyers wrote:
> 
> 
> > For kernel and user space tracing, those small jumps are very annoying :
> > it can show, in a trace, that a fork() appears on a CPU after the first
> > schedule() of the thread on the other CPU : scheduling causality relationship
> > can become very hard to follow. This is only a sample case. Inaccuracy and
> > periodical modification of the clock time (non monotonic) can cause important
> > inaccuracy in performance tests, even on UP systems. A monotonic clock,
> > accessible from anywhere in kernel space (including NMI handler) and
> > from user space is very useful for performance analysis and, more
> > generally, for timestamping data in per cpu buffers so it can be later
> > reordered correctly.
> 
> What about adding a layer below do_gettimeofday() which just sheds the
> adjustment process? That might be reasonable. The NMI and userspace
> cases aren't very compelling right now; at least I'm not convinced a
> whole new timing interface is needed.
> 
> The latency tracing system in the -rt branch modifies the gettimeofday
> facilities. I'm not sure of its correctness, but it gets called
> from anywhere in the kernel, including NMIs.
> 
> Here's the function,
> 
> cycle_t notrace get_monotonic_cycles(void)
> {
>         cycle_t cycle_now, cycle_delta;
> 
>         /* read clocksource: */
>         cycle_now = clocksource_read(clock);
> 
>         /* calculate the delta since the last update_wall_time: */
>         cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> 
>         return clock->cycle_last + cycle_delta;
> }
> 
> That looks safe. When converting this to nanoseconds you would still get
> the time adjustments but it would be all at once instead of in little
> increments ..
> 
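
[Editor's sketch, not part of the original message.] The masked subtraction in the quoted function is what keeps the delta correct when a clocksource narrower than 64 bits wraps around. A minimal user-space illustration, assuming a 32-bit counter; the helper name masked_delta and the local cycle_t typedef are hypothetical stand-ins, not kernel API:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's cycle_t (assumed to be 64 bits wide). */
typedef uint64_t cycle_t;

/*
 * Cycles elapsed between two readings of a counter whose valid bits are
 * given by mask. Unsigned subtraction followed by the mask yields the
 * right delta even if the counter wrapped once between the readings.
 */
static cycle_t masked_delta(cycle_t now, cycle_t last, cycle_t mask)
{
        return (now - last) & mask;
}
```

With a 32-bit mask, a reading of 0x10 taken after a previous reading of 0xFFFFFFF0 gives a delta of 0x20 rather than a huge or negative value. The result is only meaningful if the counter wrapped at most once between the two readings.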

Ouch... consider what happens if the clocksource in use is the PIT on x86:

static cycle_t pit_read(void)
{
        unsigned long flags;
        int count;
        u32 jifs;
        static int old_count;
        static u32 old_jifs;

        spin_lock_irqsave(&i8253_lock, flags);

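spin_lock_irqsave() masks ordinary interrupts but not NMIs. If an NMI
fires while i8253_lock is held and the NMI handler ends up in pit_read(),
it spins on a lock that the interrupted code can never release: we have
a deadlock.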
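
[Editor's sketch, not part of the original message.] The nesting problem can be modelled in user space with a test-and-set lock. Everything below is a hypothetical stand-in (demo_lock for i8253_lock, the function names for the reader and the NMI handler); the "NMI" uses a trylock so the model can report the deadlock instead of actually spinning forever:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for i8253_lock: a minimal test-and-set lock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool demo_trylock(void)
{
        return !atomic_flag_test_and_set_explicit(&demo_lock,
                                                  memory_order_acquire);
}

static void demo_unlock(void)
{
        atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/*
 * Models an NMI handler that needs the lock. A real spin_lock() would
 * loop here forever if the code it interrupted on this CPU holds the
 * lock; the model reports that case instead of spinning.
 */
static bool nmi_handler_would_deadlock(void)
{
        if (demo_trylock()) {
                demo_unlock();
                return false;   /* lock was free: no deadlock */
        }
        return true;            /* holder is the context we interrupted */
}

/* Models a reader interrupted by an NMI inside its critical section. */
static bool reader_interrupted_by_nmi(void)
{
        bool deadlock;
        bool got = demo_trylock();      /* stands in for spin_lock_irqsave() */

        assert(got);
        /* The NMI fires here: disabling interrupts does not mask it. */
        deadlock = nmi_handler_would_deadlock();
        demo_unlock();
        return deadlock;
}
```

In this model reader_interrupted_by_nmi() returns true, while the handler called on its own (lock free) returns false.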

In addition, clock->cycle_last is a cycle_t, which is 64 bits wide on
x86. It is therefore not updated atomically by change_clocksource,
timekeeping_init, timekeeping_resume and update_wall_time. If an NMI
fires in the middle of such an update, especially around the 32-bit
wraparound, the time it reads will be badly off.
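
[Editor's sketch, not part of the original message.] The torn read can be reproduced deterministically by modelling the 64-bit value as two 32-bit words, as a 32-bit CPU would store it. The struct and function names are hypothetical, and the low-word-first store order is an assumption chosen for illustration; a real compiler may pick either order:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit cycle_last kept as two 32-bit words. */
struct split_u64 {
        uint32_t lo;
        uint32_t hi;
};

/* Reassemble the 64-bit value from its two halves. */
static uint64_t split_read(const struct split_u64 *v)
{
        return ((uint64_t)v->hi << 32) | v->lo;
}

/*
 * Update *v to new_val with two 32-bit stores, low word first, and
 * return what a reader (the "NMI") would observe between the stores.
 */
static uint64_t torn_read_during_update(struct split_u64 *v, uint64_t new_val)
{
        uint64_t seen;

        v->lo = (uint32_t)new_val;              /* first store */
        seen = split_read(v);                   /* NMI fires here */
        v->hi = (uint32_t)(new_val >> 32);      /* second store */
        return seen;
}
```

Starting from 0x00000000FFFFFFFF and updating to 0x0000000100000000, the interrupted reader sees 0: the new low word combined with the old high word, roughly 2^32 cycles in the past.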

Mathieu

> Daniel
> 

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68