Message-ID: <20090426104747.GA5983@elte.hu>
Date: Sun, 26 Apr 2009 12:47:47 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@...il.com>, zhaolei@...fujitsu.com,
kosaki.motohiro@...fujitsu.com, rostedt@...dmis.org,
tzanussi@...il.com, linux-kernel@...r.kernel.org, oleg@...hat.com
Subject: Re: [PATCH 0/4] workqueue_tracepoint: Add worklet tracepoints for
worklet lifecycle tracing

* Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Sat, 25 Apr 2009 02:37:03 +0200
> Frederic Weisbecker <fweisbec@...il.com> wrote:
>
> > I discovered it with this tracer. Then it brought me to
> > write this patch:
> >
> > http://lkml.org/lkml/2009/1/31/184
> >
> > ...
> >
> > Still with these same observations, I wrote this another one:
> >
> > http://lkml.org/lkml/2009/1/26/363
>
> OK, it's great that you're working to improve the workqueue code.
> But does this justify permanently adding debug code to the core
> workqueue code? [...]

Andrew - but this is not what you asked originally. Here's the
exchange, uncropped:

> > > So this latest patchset provides all this required
> > > information at the events tracing level.
> > Well.. required by who?
> >
> > I don't recall ever seeing any problems of this nature, nor
> > patches to solve any such problems.

And Frederic replied with three recent examples of patches and
problem reports that resulted from the workqueue tracepoints.

Now you argue 'yes, there might have been an advantage, but it's
not permanent' - which appears to be a somewhat shifting position,
really. I don't think _our_ position has shifted in any way -
please correct me if I'm wrong ;-)

And I, as the original author of the kernel/workqueue.c code (heck,
I even coined the 'workqueue' term, if that matters), agree with
Frederic here: more transparency into what goes on in a subsystem
brings certain advantages:

- it spurs development
- it helps with fixing bugs
- and generally it helps people understand the kernel better

These benefits have to be weighed against the cost of maintaining
(and seeing) those tracepoints.

In the scheduler we have more than 60 distinct points of
instrumentation.

The patches we are discussing here add 6 new tracepoints to
kernel/workqueue.c - and I'd argue they are pretty much the maximum
we'd ever want to have there.

I've been maintaining the scheduler instrumentation for years, and
its overhead is, in hindsight, rather low - while the advantage is
significant. As long as tracing and statistics instrumentation has
a very standard and low-key "function call" visual form, I don't
even notice it most of the time.
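
To illustrate that "function call" form: the hook at an
instrumentation site is a single, easy-to-read-past line. (A sketch
only - the exact event names and prototypes have varied between
kernel versions:)

	/*
	 * Sketch of an instrumentation site in the scheduler: the
	 * tracepoint reads like any other function call.
	 */
	static inline void
	context_switch(struct rq *rq, struct task_struct *prev,
		       struct task_struct *next)
	{
		trace_sched_switch(rq, prev, next);	/* tracepoint hook */

		/* ... the actual mm and register-state switching ... */
	}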

And the thing is, the workqueue code has been pretty problematic
lately - with lockups and other regressions. It's a pretty 'opaque'
facility that _hides_ what goes on inside it - so more transparency
might be a good answer on that basis alone.

> [...] In fact, because you've discovered these problem, the
> reasons for adding the debug code have lessened!
>
> What we need are curious developers looking into how well
> subsystems are performing and how well callers are using them.
> Adding fairly large amounts of permanent debug code into the core
> subsystems is a peculiar way of encouraging such activity.

But this - which you call peculiar - is exactly what happened when
the first set of tracepoints was added.

Secondly, if we discount the (fairly standard) off-site tracepoint
definitions, this is not a "large amount of debug code" - the
tracepoints are completely off-site and are no maintenance worry as
long as the tracepoint arguments are kept intact. The bits in
kernel/workqueue.c amount to 26 lines of flux:

workqueue.c | 26 ++++++++++++++++++++------
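
For reference, this is roughly what the off-site/on-site split looks
like, modeled on the existing TRACE_EVENT() machinery. (The event
name and fields below are illustrative, not the actual patch
contents:)

	/*
	 * Off-site, in a trace header such as
	 * include/trace/events/workqueue.h - the bulk of the code
	 * lives here, not in kernel/workqueue.c:
	 */
	TRACE_EVENT(worklet_execute,

		TP_PROTO(struct task_struct *wq_thread,
			 struct work_struct *work),

		TP_ARGS(wq_thread, work),

		TP_STRUCT__entry(
			__array(char,		thread_comm, TASK_COMM_LEN)
			__field(pid_t,		thread_pid)
			__field(work_func_t,	func)
		),

		TP_fast_assign(
			memcpy(__entry->thread_comm, wq_thread->comm,
			       TASK_COMM_LEN);
			__entry->thread_pid	= wq_thread->pid;
			__entry->func		= work->func;
		),

		TP_printk("thread=%s:%d func=%pF", __entry->thread_comm,
			  __entry->thread_pid, __entry->func)
	);

	/*
	 * On-site, in kernel/workqueue.c: the only flux is a single
	 * function-call-like line per event:
	 */
		trace_worklet_execute(cwq->thread, work);

As long as the TP_PROTO()/TP_ARGS() arguments stay intact, all the
off-site details can change without touching kernel/workqueue.c at
all.
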
Ingo