Date:	Fri, 24 Apr 2009 22:00:20 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Frederic Weisbecker <fweisbec@...il.com>, zhaolei@...fujitsu.com,
	mingo@...e.hu, kosaki.motohiro@...fujitsu.com, tzanussi@...il.com,
	linux-kernel@...r.kernel.org, oleg@...hat.com
Subject: Re: [PATCH 0/4] workqueue_tracepoint: Add worklet tracepoints for
 worklet lifecycle tracing


On Fri, 24 Apr 2009, Andrew Morton wrote:

> On Sat, 25 Apr 2009 02:37:03 +0200
> Frederic Weisbecker <fweisbec@...il.com> wrote:
> 
> > I discovered it with this tracer. Then it brought me to
> > write this patch:
> > 
> > http://lkml.org/lkml/2009/1/31/184
> > 
> > ...
> > 
> > Still with these same observations, I wrote this another one:
> > 
> > http://lkml.org/lkml/2009/1/26/363
> 
> OK, it's great that you're working to improve the workqueue code.  But
> does this justify permanently adding debug code to the core workqueue
> code?  In fact, because you've discovered these problem, the reasons
> for adding the debug code have lessened!
> 
> What we need are curious developers looking into how well subsystems
> are performing and how well callers are using them.  Adding fairly
> large amounts of permanent debug code into the core subsystems is a
> peculiar way of encouraging such activity.
> 
> If a developer is motivated to improve (say) workqueues then they will
> write a bit of ad-hoc code, or poke at it with systemtap or will
> maintain a private ftrace patch - that's all pretty simple stuff for
> such people.
> 
> So what is the remaining case for adding these patches?  What I see is
> 
> a) that their presence will entice people to run them and maybe find
>    some problems and
> 
> b) the workqueue-maintainer's task is lessened a bit by not having
>    to forward-port his debugging patch.
> 
> I dunno, it all seems a bit thin to me.  Especially when you multiply
> it all by nr_core_subsystems?

I agree that we need to be frugal with the addition of trace points. But 
I don't think the bugs that can be solved with this are always 
reproducible by the developer.

If you have a distribution kernel running at a customer's location, you 
may not have the privilege of shutting down that kernel, patching the 
code, recompiling, and booting a temporary kernel. It would be nice to 
have strategic locations in the kernel where we can easily enable a 
trace point and monitor what is going on.
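As a minimal sketch of what that looks like in the field, assuming a kernel built with the workqueue events and tracefs available (the mount point and exact event names vary by kernel version):

```shell
# tracefs mount point; newer kernels also expose /sys/kernel/tracing
TRACING=/sys/kernel/debug/tracing

# Enable all workqueue tracepoints at runtime; no rebuild or reboot needed
echo 1 > $TRACING/events/workqueue/enable

# Let the workload run, then inspect what the workqueues were doing
head -n 40 $TRACING/trace

# Turn the events back off; overhead drops back to (near) zero
echo 0 > $TRACING/events/workqueue/enable
```

The whole point is that the disabled tracepoints sit as nops until someone flips that switch on a live production box.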

If the customer calls and tells you there are strange performance 
issues when running such and such a load, it would be nice to look at 
things like workqueues to analyze the situation.

Point being, the events are not for me on the boxes I run myself.
Hell, I had logdev doing that for me for 10 years. But having 
something running at a customer's site, with extremely low overhead, 
that we can enable when problems arise: that is what makes this 
worthwhile.

Note, when I was contracting, I even had logdev prints inside the 
production (custom) kernel that I could turn on and off. That was 
exactly for this purpose: to monitor what is happening inside the 
kernel in the field.

-- Steve
