Message-ID: <20091210183508.GB17986@elte.hu>
Date:	Thu, 10 Dec 2009 19:35:08 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	"Frank Ch. Eigler" <fche@...hat.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Tim Bird <tim.bird@...sony.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/4] ftrace - add function_duration tracer

* Frank Ch. Eigler <fche@...hat.com> wrote:

> Hi -
> 
> > > FWIW, those who want to collect such measurements today can do so 
> > > with a few lines of systemtap script for each of the above.
> > 
> > Well, i dont think stap can do workload instrumentation. It can do 
> > system-wide (and user local / task local) - but can it do per task 
> > hierarchies?
> 
> It can track the evolution of task hierarchies by listening to process 
> forking events, and filter other kernel/user events according to 
> then-current hierarchy data.  One primitive implementation of this is 
> in the target_set.stp tapset, but it's easy to script up other 
> policies.

target_set.stp is not really adequate. Have you actually _tried_ to use 
it on something real like hackbench, which runs thousands (or tens of 
thousands) of tasks? You'll soon find that associative arrays are not 
really adequate for that ...

Another problem I can see is that target_set.stp starts with:

   global _target_set # map: target-set-pid -> ancestor-pid

see that 'global' thing? It's a system global variable - i.e. you cannot 
measure two task hierarchies at once.
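
The concern can be illustrated outside of SystemTap. What follows is a
minimal Python sketch, not SystemTap script and not the tapset's actual
implementation: it keeps the pid -> ancestor-pid map per tracker object
instead of in one global, so two hierarchies can be followed at the same
time. The names TargetSet, on_fork and dispatch_fork are hypothetical,
and the fork events are fed in by hand rather than coming from real
probes.

```python
class TargetSet:
    """Tracks one task hierarchy: maps member pid -> ancestor (root) pid."""

    def __init__(self, root_pid):
        self.root_pid = root_pid
        # Per-instance map, the analogue of the tapset's single global map.
        self.members = {root_pid: root_pid}

    def on_fork(self, parent_pid, child_pid):
        # A child joins this hierarchy only if its parent already belongs to it.
        if parent_pid in self.members:
            self.members[child_pid] = self.root_pid

    def on_exit(self, pid):
        # Forget tasks that have exited so the map does not grow without bound.
        self.members.pop(pid, None)

    def __contains__(self, pid):
        return pid in self.members


def dispatch_fork(trackers, parent_pid, child_pid):
    # Every tracker sees every fork event and filters it on its own state.
    for tracker in trackers:
        tracker.on_fork(parent_pid, child_pid)


# Two independent hierarchies, rooted at hypothetical pids 100 and 200.
a = TargetSet(100)
b = TargetSet(200)
trackers = [a, b]

dispatch_fork(trackers, 100, 101)  # child of hierarchy a
dispatch_fork(trackers, 200, 201)  # child of hierarchy b
dispatch_fork(trackers, 101, 102)  # grandchild of hierarchy a
```

With the state held per tracker, pid 102 ends up in a but not in b. With
a single shared map there would be no way to tell which hierarchy a
given pid was recorded for.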

> > Also, i dont think stap supports proper separation of per workload 
> > measurements either. I.e. can you write a script that will work 
> > properly even if multiple monitoring tools are running, each trying 
> > to measure latencies?
> 
> Sure, always has.  You can run many scripts concurrently, each with 
> its own internal state.  (Overheads accumulate, sadly & naturally.)

To measure latencies you need two probes, a start and a stop one. How do 
you define a local variable that is visible to those two probes? You 
have to create a global variable - but that will/can clash with other 
instances.
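
The start/stop pairing described above can be sketched in Python. This is
an illustration only, with hypothetical names (LatencySession, on_entry,
on_return) and timestamps passed in explicitly instead of being read from
real probes. The start times that both probes need to see live in a
session object, so two measurements of the same task do not overwrite
each other's state the way a single shared global would.

```python
class LatencySession:
    """One independent latency measurement with its own start/stop state."""

    def __init__(self):
        self._start = {}    # tid -> timestamp recorded by the start probe
        self.samples = []   # completed durations, in the caller's time unit

    def on_entry(self, tid, now):
        # Start probe: remember when this task entered the measured region.
        self._start[tid] = now

    def on_return(self, tid, now):
        # Stop probe: pair with the matching start, if one was recorded.
        started = self._start.pop(tid, None)
        if started is not None:
            self.samples.append(now - started)


# Two monitoring sessions observing the same task (tid 1) concurrently.
s1 = LatencySession()
s2 = LatencySession()

s1.on_entry(1, 10)
s2.on_entry(1, 12)
s1.on_return(1, 25)
s2.on_return(1, 40)
```

Here s1 records a duration of 15 and s2 a duration of 28. If both
sessions wrote their start time into one shared variable, the second
on_entry would clobber the first and s1 would report 13 instead.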

( Also, you don't offer per application channels/state from the same 
  script. Each app has to define its own probes, duplicating the 
  script and increasing probe chaining overhead. )

The whole state sharing and eventing model of SystemTap is poorly 
thought out.

> > Also, i personally find built-in kernel functionality more trustable 
> > than dynamically built stap kernel modules that get inserted.
> 
> I understand.  In the absence of a suitable bytecode engine in the 
> kernel, this was the only practical way to do everything we needed.

You seem to be under the mistaken assumption that your course of action 
with SystemTap is somehow limited by what is available (or not) in the 
upstream kernel.

In reality you can implement anything you want (in fact you did 
precisely that - _against_ repeated advice of upstream kernel 
developers), and if it's good, it will be merged - simple as that. It 
might take years, but once you deliver the proof (which comes in the 
form of lots of happy users/developers), it happens.

So saying 'but the kernel does not have a bytecode interpreter' (or any 
other excuse) is pretty lame.

	Ingo