linux-kernel - Re: [PATCH 1/2] tracing/function-return-tracer: Make the function return tracer lockless


Message-ID: <20081113185723.GC17851@elte.hu>
Date:	Thu, 13 Nov 2008 19:57:24 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Frédéric Weisbecker <fweisbec@...il.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCH 1/2] tracing/function-return-tracer: Make the function
	return tracer lockless


* Frédéric Weisbecker <fweisbec@...il.com> wrote:

> 2008/11/13 Ingo Molnar <mingo@...e.hu>:
> > "prev_global_time" also acts as a global serializer: it ensures that
> > events are timestamped in a monotonic and ordered way.
> >
> > i.e. something like this (pseudocode, without the cmpxchg):
> >
> >  u64 prev_global_time;
> >
> >  DEFINE_PER_CPU(u64, prev_local_time);
> >
> >  u64 global_time(void)
> >  {
> >        u64 now, delta, now_global, prev_global;
> >
> >        prev_global = prev_global_time;
> >        now = sched_clock();
> >        delta = now - per_cpu(prev_local_time, this_cpu);
> >        per_cpu(prev_local_time, this_cpu) = now;
> >
> >        now_global = prev_global + delta;
> >        prev_global_time = now_global;
> >
> >        return now_global;
> >  }
> >
> > note how we build "global time" out of "local time".
> >
> > The cmpxchg would be used to put the above one into a loop, and
> > instead of updating the global time in a racy way:
> >
> >        prev_global_time = now_global;
> >
> > We'd update it via the cmpxchg:
> >
> >        atomic64_t prev_global_time;
> >
> >        ...
> >
> >        while (atomic64_cmpxchg(&prev_global_time,
> >                                 prev_global, now_global) != prev_global) {
> >                [...]
> >        }
> >
> > To make sure the global time goes monotonic. (this way we also avoid a
> > spinlock - locks are fragile for instrumentation)
> 
> Ok, I understand better.
> But consider the following:
> 
>  u64 global_time()
>  {
>        u64 now, delta, now_global;
>        prev_global = prev_global_time;
> 
>        while (atomic64_cmpxchg(&prev_global_time,
>                                  prev_global, now_global) != prev_global) {
> 
>            now = sched_clock();
>            delta = now - per_cpu(prev_local_time, this_cpu);
>            per_cpu(prev_local_time, this_cpu) = now;
>            now_global = prev_global + delta;
>            prev_global = now_global;
>        }
>        return now_global;
>  }
> 
> Starting with prev_global_time = 0, if we have two CPUs and the above 
> function is executed 5 times on the first CPU, we could have 
> per_cpu(prev_local_time) = 50, for example. And so prev_global_time 
> will be equal to 50.
> 
> Just after that, almost at the same time, cpu2 calls global_time()
> 
> delta will be equal to 50 (sched_clock() - per_cpu(prev_local_time) 
> which is 0) and prev_global_time will be 50 + 50 = 100. This is not 
> consistent. I don't know where but I'm pretty sure I missed 
> something....

you are right - it needs a bit more logic.

I think the simplest would be something like this:

 atomic64_t global_clock = ATOMIC64_INIT(0);

 u64 global_time(void)
 {
	u64 now, prev_global;

	do {
		prev_global = atomic64_read(&global_clock);
		now = cpu_clock(raw_smp_processor_id());

		/* local clock is behind: return the latest global time */
		if ((s64)(now - prev_global) < 0) {
			now = prev_global;
			break;
		}
		/* retry if another CPU advanced global_clock meanwhile */
	} while (atomic64_cmpxchg(&global_clock,
				   prev_global, now) != prev_global);

	return now;
 }

This is the simplest way of implementing monotonic time: we only allow 
global_clock to go forwards. If all cpu_clock()s are perfectly in 
sync, we've got no problem: then "now - prev_global" will never be 
negative and we can return the local clock as the latest global time.

But if one of the CPU clocks is "behind", the function returns the 
latest global time up until the local clock catches up. Time won't be 
allowed to jump around by going back. If the clock is behind for a 
long time, then we get a lot of timestamps with the same value - that 
will be very visible in the trace and we'll then work on improving the 
cpu_clock() implementation.
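
For anyone who wants to play with the clamping behaviour outside the 
kernel, here is a minimal userspace sketch of the same idea. It assumes 
C11 <stdatomic.h> in place of atomic64_t, and global_time_from() is a 
made-up name: it takes the local clock reading as a parameter 
(standing in for cpu_clock()) instead of re-reading it on every retry, 
so it is only an approximation of the function above:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared global clock; it is only ever moved forwards. */
static _Atomic uint64_t global_clock;

/*
 * Fold one local clock reading into the global clock.
 * local_now is a stand-in for cpu_clock(raw_smp_processor_id());
 * this helper and its name are illustrative only.
 */
static uint64_t global_time_from(uint64_t local_now)
{
	uint64_t prev = atomic_load(&global_clock);

	do {
		/* Local clock is behind: reuse the latest global time. */
		if ((int64_t)(local_now - prev) < 0)
			return prev;
		/* On failure, prev is refreshed with the current global value. */
	} while (!atomic_compare_exchange_weak(&global_clock, &prev, local_now));

	return local_now;
}
```

With the global clock at 100, a CPU whose local clock reads 40 gets 100 
back, and keeps getting 100 until its own clock passes that value.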

So I think we could start with this simplest approach, and see how 
often we get the same timestamp for a long time (an indication that 
the clocks are not perfectly in sync).
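
One rough way to check for that in a captured trace is to look for the 
longest run of identical consecutive timestamps. The sketch below 
assumes the timestamps have already been extracted into a plain array; 
longest_repeat_run() is a hypothetical helper, not an existing tracer 
interface:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the length of the longest run of equal consecutive
 * timestamps in ts[0..n-1]; returns 0 for an empty array.
 */
static size_t longest_repeat_run(const uint64_t *ts, size_t n)
{
	size_t best = 0, run = 0;

	for (size_t i = 0; i < n; i++) {
		/* Extend the current run or start a new one. */
		if (i > 0 && ts[i] == ts[i - 1])
			run++;
		else
			run = 1;
		if (run > best)
			best = run;
	}
	return best;
}
```

A long run would suggest that some CPU's clock stayed behind the 
global clock for a while.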

	Ingo