linux-kernel - Re: [patch for 2.6.26 0/7] Architecture Independent Markers

Message-ID: <20080328133301.GA21660@elte.hu>
Date:	Fri, 28 Mar 2008 14:33:01 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [patch for 2.6.26 0/7] Architecture Independent Markers

* Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca> wrote:

>  6a5:   89 5c 24 14             mov    %ebx,0x14(%esp)
>  6a9:   8b 55 d0                mov    -0x30(%ebp),%edx
>  6ac:   89 54 24 10             mov    %edx,0x10(%esp)
>  6b0:   89 4c 24 0c             mov    %ecx,0xc(%esp)
>  6b4:   c7 44 24 08 f7 04 00    movl   $0x4f7,0x8(%esp)
>  6bb:   00 
>  6bc:   c7 44 24 04 00 00 00    movl   $0x0,0x4(%esp)
>  6c3:   00 
>  6c4:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
>  6cb:   ff 15 0c 00 00 00       call   *0xc
>  6d1:   e9 c3 fc ff ff          jmp    399 <schedule+0x130>
> 
> Which adds an extra 50 bytes.

you talk about 32-bit while i talk about 64-bit. All these costs go up 
on 64-bit and you should know that. I measured 44 bytes in the fastpath 
and 52 bytes in the slowpath, which gives 96 bytes. (with a distro 
.config and likely with a different gcc)

96 bytes _per marker_ sprinkled throughout the kernel. This blows up the 
cache footprint of the kernel quite substantially, because it's all 
fragmented - even if this is in the 'slowpath'.

so yes, that is the bloat i'm talking about.

don't just compare it to ftrace-sched-switch, compare it to dyn-ftrace 
which gives us more than 78,000 trace points in the kernel _here and 
today_ at no measurable runtime cost, with a 5 byte NOP per trace point 
and _zero_ instruction stream (register scheduling, etc.) intrusion. No 
slowpath cost.
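
To make the size comparison concrete, here is a small back-of-the-envelope sketch. It uses only the figures quoted above (44 bytes fastpath plus 52 bytes slowpath per marker on 64-bit, versus a 5 byte NOP per dyn-ftrace trace point). The helper names and the 1,000-site count in the usage note are hypothetical, chosen purely for illustration; real numbers depend on the .config and the compiler.

```python
# Per-site code size figures taken from the message above (64-bit build).
FASTPATH_BYTES = 44   # marker code emitted inline in the hot path
SLOWPATH_BYTES = 52   # marker argument setup and call, out of line
NOP_BYTES = 5         # dyn-ftrace patch site

def marker_bytes(sites):
    """Total text added by `sites` markers (fastpath + slowpath)."""
    return sites * (FASTPATH_BYTES + SLOWPATH_BYTES)

def nop_bytes(sites):
    """Total text added by `sites` dyn-ftrace NOP patch sites."""
    return sites * NOP_BYTES

if __name__ == "__main__":
    for sites in (1, 1000, 78000):  # site counts are illustrative only
        print(sites, marker_bytes(sites), nop_bytes(sites))
```

For a hypothetical 1,000 instrumentation sites this gives 96,000 bytes for markers against 5,000 bytes of NOPs, roughly a 19x difference in added text, before counting any effect on register scheduling.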

and the basic API approach of markers is flawed as well - the coupling to 
the kernel is too strong. The correct and long-term maintainable 
coupling is via ASCII symbol names, not via any binding built into the 
kernel.

With dyn-ftrace (see sched-devel.git/latest) tracing filters can be 
installed trivially by users, via function _symbols_, via:

  /debugfs/tracing/available_filter_functions
  /debugfs/tracing/set_ftrace_filter

wildcards are recognized as well, so if you do:

  echo '*lock' > /debugfs/tracing/set_ftrace_filter

all functions whose names end in 'lock' will have their tracepoints 
activated transparently from that point on.

even multiple names can be passed in at once:

  echo 'schedule wake_up* *acpi*' > /debugfs/tracing/set_ftrace_filter
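
The selection behaviour of such a filter string can be approximated in user space. The sketch below is not the kernel's matcher; it assumes ordinary shell-glob semantics via Python's `fnmatch` module and a whitespace-separated pattern list, and the function list is a made-up sample standing in for the contents of available_filter_functions.

```python
from fnmatch import fnmatchcase

def select_functions(available, filter_spec):
    """Return the names in `available` matching any glob in `filter_spec`.

    `filter_spec` is a whitespace-separated list of patterns, like the
    string echoed into set_ftrace_filter. Matching is case-sensitive.
    """
    patterns = filter_spec.split()
    return [name for name in available
            if any(fnmatchcase(name, pattern) for pattern in patterns)]

# Hypothetical sample; a real list would be read from
# /debugfs/tracing/available_filter_functions.
SAMPLE = [
    "schedule",
    "wake_up_process",
    "spin_lock",
    "spin_unlock",
    "mutex_lock",
    "lock_kernel",
    "acpi_evaluate_object",
]

if __name__ == "__main__":
    print(select_functions(SAMPLE, "*lock"))
    print(select_functions(SAMPLE, "schedule wake_up* *acpi*"))
```

Under these assumed semantics '*lock' selects only names that end in 'lock' (spin_lock, spin_unlock, mutex_lock in the sample); a name such as lock_kernel would need '*lock*' or 'lock*'.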

so it's trivial to use it, very powerful and we've only begun exposing 
it towards users. I see no good reason why we'd patch any marker into 
the kernel - it's a maintenance cost from that point on.

so yes, my argument is: tens of thousands of lightweight tracepoints in 
the kernel here and today, which are configurable via function names, 
each of which can be turned on and off individually, and none of which 
needs any source code level changes - is an obviously superior approach.

	Ingo