[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20090414102802.C637.A69D9226@jp.fujitsu.com>
Date: Tue, 14 Apr 2009 10:43:21 +0900 (JST)
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: Frederic Weisbecker <fweisbec@...il.com>
Cc: kosaki.motohiro@...fujitsu.com, Zhaolei <zhaolei@...fujitsu.com>,
Steven Rostedt <rostedt@...dmis.org>,
Tom Zanussi <tzanussi@...il.com>, Ingo Molnar <mingo@...e.hu>,
linux-kernel@...r.kernel.org, Oleg Nesterov <oleg@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 3/4] ftrace: add max execution time mesurement to workqueue tracer
Hi Frederic,
very thanks for good reviewing.
> > @@ -85,6 +90,29 @@ found:
> > spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
> > }
> >
> > +static void probe_workqueue_exit(struct task_struct *wq_thread,
> > + struct work_struct *work)
> > +{
> > + int cpu = cpumask_first(&wq_thread->cpus_allowed);
> > + struct cpu_workqueue_stats *node, *next;
> > + unsigned long flags;
> > +
> > + spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
> > + list_for_each_entry_safe(node, next, &workqueue_cpu_stat(cpu)->list,
> > + list) {
>
>
> Do you need the safe version here? You don't seem to remove any entry.
Yes, this is just stupid cut&paste mistake ;)
Will fix.
> Sidenote: only the workqueue destruction handler might need it if I'm
> not wrong.
> I placed some of them in other places in this file because I misunderstood the
> kernel list concepts in the past :)
> (Heh, and probably still today).
>
>
>
> > + if (node->pid == wq_thread->pid) {
> > + u64 start = node->handler_start_time;
> > + u64 executed_time = trace_clock_global() - start;
> > +
> > + if (node->max_executed_time < executed_time)
> > + node->max_executed_time = executed_time;
> > + goto found;
> > + }
> > + }
> > +found:
> > + spin_unlock_irqrestore(&workqueue_cpu_stat(cpu)->lock, flags);
> > +}
> > +
> > /* Creation of a cpu workqueue thread */
> > static void probe_workqueue_creation(struct task_struct *wq_thread, int cpu)
> > {
> > @@ -195,6 +223,9 @@ static int workqueue_stat_show(struct se
> > int cpu = cws->cpu;
> > struct pid *pid;
> > struct task_struct *tsk;
> > + unsigned long long exec_time = ns2usecs(cws->max_executed_time);
> > + unsigned long exec_usec_rem = do_div(exec_time, USEC_PER_SEC);
> > + unsigned long exec_secs = (unsigned long)exec_time;
> >
> > spin_lock_irqsave(&workqueue_cpu_stat(cpu)->lock, flags);
> > if (&cws->list == workqueue_cpu_stat(cpu)->list.next)
> > @@ -205,8 +236,11 @@ static int workqueue_stat_show(struct se
> > if (pid) {
> > tsk = get_pid_task(pid, PIDTYPE_PID);
> > if (tsk) {
> > - seq_printf(s, "%3d %6d %6u %s\n", cws->cpu,
> > + seq_printf(s, "%3d %6d %6u %5lu.%06lu"
> > + " %s\n",
> > + cws->cpu,
> > atomic_read(&cws->inserted), cws->executed,
> > + exec_secs, exec_usec_rem,
>
> You are measuring the latency from a workqueue thread point of view.
> While I find the work latency measurement very interesting,
> I think this patch does it in the wrong place. The _work_ latency point of view
> seems to me much more rich as an information source.
>
> There are several reasons for that.
>
> Indeed this patch is useful for workqueues that receive always the same work
> to perform so that you can find very easily the guilty worklet.
> But the sense of this design is lost once we consider the workqueue threads
> that receive random works. Of course the best example is events/%d
> One will observe the max latency that happened on event/0 as an exemple but
> he will only be able to feel a silent FUD because he has no way to find
> which work caused this max latency.
May I explain my expected usage scenario?
firstly, the developer check this stastics and nortice strage result. secondly
the developer monitor workqueue activety by event-tracer.
(it provide per work activety, maybe event filter feature is useful)
Yes, I have to agree my last patch description is too poor.
but I think my expected scenario is't so insane.
Next, I hope to explain why I don't choice adding per work stastics.
struct work can put in stack and it's short lived object.
then, it isn't proper "stastics" target.
I like my approach or histogram approach (suggested by ingo).
May I ask your feeling to my usage scenario?
>
>
> Especially the events/%d latency measurement seems to me very important
> because a single work from a random driver can propagate its latency
> all over the system.
>
> A single work that consumes too much cpu time, waits for long coming
> events, sleeps too much, tries to take too often contended locks, or
> whatever... such single work may delay all pending works in the queue
> and the only max latency for a given workqueue is not helpful to find
> these culprits.
>
> Having this max latency snapshot per work and not per workqueue thread
> would be useful for every kind of workqueue latency instrumentation:
>
> - workqueues with single works
> - workqueue with random works
>
> A developer will also be able to measure its own worklet action and
> find if it takes too much time, even if it isn't the worst worklet in
> the workqueue to cause latencies.
>
> The end result would be to have a descending latency sort of works
> per cpu workqueue threads (or better: per workqueue group).
>
> What do you think?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists