Message-ID: <4AD5F921.8080007@goop.org>
Date: Wed, 14 Oct 2009 09:15:29 -0700
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Ingo Molnar <mingo@...e.hu>
CC: Peter Zijlstra <peterz@...radead.org>, Avi Kivity <avi@...hat.com>,
Ingo Molnar <mingo@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andi Kleen <ak@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH RFC] sched: add notifier for process migration
On 10/14/09 00:05, Ingo Molnar wrote:
> * Jeremy Fitzhardinge <jeremy@...p.org> wrote:
>
>
>> @@ -1981,6 +1989,12 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>> #endif
>> perf_swcounter_event(PERF_COUNT_SW_CPU_MIGRATIONS,
>> 1, 1, NULL, 0);
>> +
>> + tmn.task = p;
>> + tmn.from_cpu = old_cpu;
>> + tmn.to_cpu = new_cpu;
>> +
>> + atomic_notifier_call_chain(&task_migration_notifier, 0, &tmn);
>>
> We already have one event notifier there - look at the
> perf_swcounter_event() callback. Why add a second one for essentially
> the same thing?
>
> We should only put a single callback there - a tracepoint defined via
> TRACE_EVENT() - and any secondary users can register a callback to the
> tracepoint itself.
>
> There's many similar places in the kernel - with notifier chains and
> also with a need to get tracepoints there. The fastest (and most
> consistent) solution is to add just a single event callback facility.
>
My specific use case for this notifier is to provide a "you've been
migrated" counter to usermode via a fixmap page, as part of the work to
extend kernel/pvclock.c to implement vread for vsyscall use. I probably
should have referred to that explicitly in the comment for the patch to
give a concrete motivation and rationale.

This means that on applicable systems (i.e., those running virtualized
under Xen or KVM) the callback will be installed early in boot and called
for the entire uptime of the system. Since we don't want a strong
permanent coupling between that particular piece of arch-independent
scheduler code and an arch-specific piece of functionality, a notifier
seemed like a good fit.
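
As a rough illustration of that decoupling, here is a userspace model (not
kernel code) of a migration notifier feeding a counter. Only the
task/from_cpu/to_cpu payload and the name task_migration_notifier come from
the patch above; the model_* helpers, pvclock_migrate_notify(),
migrate_count and NR_CPUS are hypothetical stand-ins, and the chain is a
plain linked list rather than the kernel's atomic notifier implementation.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical; only used for the sanity check below */

/* Userspace stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
};

/* Payload mirroring the fields the patch fills in before the call. */
struct task_migration_notify {
	int task_id;            /* stand-in for struct task_struct * */
	unsigned int from_cpu;
	unsigned int to_cpu;
};

/* Head of the (model) notifier chain. */
static struct notifier_block *task_migration_notifier;

/* Hypothetical counter that would be exported to usermode. */
static unsigned int migrate_count;

static void model_notifier_register(struct notifier_block *nb)
{
	nb->next = task_migration_notifier;
	task_migration_notifier = nb;
}

static void model_notifier_call_chain(unsigned long action, void *data)
{
	struct notifier_block *nb;

	for (nb = task_migration_notifier; nb != NULL; nb = nb->next)
		nb->notifier_call(nb, action, data);
}

/* "Arch side": knows nothing about the scheduler except the payload. */
static int pvclock_migrate_notify(struct notifier_block *nb,
				  unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	(void)data;
	migrate_count++;
	return 0;
}

static struct notifier_block pvclock_migrate_nb = {
	.notifier_call = pvclock_migrate_notify,
	.next = NULL,
};

/* "Scheduler side": fires the chain only when the cpu really changes. */
static void model_set_task_cpu(int task_id, unsigned int old_cpu,
			       unsigned int new_cpu)
{
	struct task_migration_notify tmn;

	assert(old_cpu < NR_CPUS && new_cpu < NR_CPUS);
	if (old_cpu == new_cpu)
		return;

	tmn.task_id = task_id;
	tmn.from_cpu = old_cpu;
	tmn.to_cpu = new_cpu;
	model_notifier_call_chain(0, &tmn);
}
```

After model_notifier_register(&pvclock_migrate_nb), each call to
model_set_task_cpu() that actually changes the cpu bumps migrate_count,
while the scheduler-side function never references the counter directly.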

(Note that this callback is generally useful on all systems for the
vgetcpu vsyscall; it would allow us to use the "tcache" parameter to
provide results which are both fast and 100% accurate, by deferring the
use of expensive lsl/rdtscp instructions until the caller *knows* the cpu
has changed.)
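
A minimal single-threaded sketch of that caching idea, again as a userspace
model: the cached cpu number is reused until the migration counter moves,
and only then is the expensive lookup repeated. Every name here
(getcpu_cache_model, slow_getcpu(), fast_getcpu(), model_migrate()) is
hypothetical, slow_getcpu() merely stands in for the lsl/rdtscp path, and a
real implementation would also have to re-check the counter after the slow
read to close the race with a concurrent migration.

```c
#include <assert.h>

/* Hypothetical per-thread cache, playing the role of "tcache". */
struct getcpu_cache_model {
	unsigned int cpu;         /* last cpu number returned */
	unsigned int seen_count;  /* migration count when it was sampled */
	int valid;                /* 0 until the first lookup */
};

static unsigned int migrate_count; /* would be bumped by the notifier */
static unsigned int current_cpu;   /* simulated "real" cpu */
static unsigned int slow_lookups;  /* how often the slow path ran */

/* Stand-in for the expensive lsl/rdtscp based lookup. */
static unsigned int slow_getcpu(void)
{
	slow_lookups++;
	return current_cpu;
}

/* Simulate the scheduler moving the task and firing the notifier. */
static void model_migrate(unsigned int new_cpu)
{
	assert(new_cpu != current_cpu);
	current_cpu = new_cpu;
	migrate_count++;
}

/* Fast path: trust the cache while the migration count is unchanged. */
static unsigned int fast_getcpu(struct getcpu_cache_model *c)
{
	unsigned int count = migrate_count;

	if (!c->valid || c->seen_count != count) {
		c->cpu = slow_getcpu();
		c->seen_count = count;
		c->valid = 1;
	}
	return c->cpu;
}
```

In this model, repeated fast_getcpu() calls between migrations cost one
counter comparison each, and slow_lookups only grows on the first call
after model_migrate().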

I tend to view tracepoints as more of a diagnostic tool, inserted and
removed dynamically as a way of instrumenting a running system; the
tracepoints themselves don't have side-effects required for the correct
running of the system.

More handwavingly, I see the semantics of a tracepoint as basically a
flag-fall showing that a particular piece of kernel code has been
called, whereas a notification says that a particular event has occurred
(which may not be associated with any specific piece of code being
executed). This notion of "task X has been migrated from cpu A to B"
seems like a fairly high-level concept; the fact that it can be
implemented by hooking a single piece of code is a side-effect of the
modularity of the scheduler rather than anything relating to the event
itself.

Functionally, tracepoints and notifiers do have broad similarities.
Should they be unified? I don't know, but they do seem to serve
distinct roles.

J