lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090220185625.GA3104@redhat.com>
Date:	Fri, 20 Feb 2009 13:56:25 -0500
From:	Jason Baron <jbaron@...hat.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Avi Kivity <avi@...hat.com>,
	Randy Dunlap <randy.dunlap@...cle.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	linux-kernel@...r.kernel.org, linux-kbuild@...r.kernel.org
Subject: Re: [PATCH] tracing/markers: make markers select tracepoints

On Fri, Feb 20, 2009 at 06:48:11PM +0100, Ingo Molnar wrote:
> * Frederic Weisbecker <fweisbec@...il.com> wrote:
> 
> > > > >  config MARKERS
> > > > >  	bool "Activate markers"
> > > > > -	depends on TRACEPOINTS
> > > > > +	select TRACEPOINTS
> > > > >  	help
> > > > >  	  Place an empty function call at each marker site. Can be
> > > > >  	  dynamically changed for a probe function.
> > > > 
> > > > but using "select" instead of "depends on" just causes the 
> > > > kind of problem that you described, whereas using "depends on" 
> > > > does follow dependency chains.
> > > 
> > > Well, as long as the secondary selects are 'expanded' along the 
> > > line of dependencies, it should still be fine. With an 
> > > increasing number of dependencies it quickly becomes ugly 
> > > though. This might be one of the cases where it works.
> > > 
> > > Eventually we should eliminate markers, their uses can either be 
> > > converted to new-style tracepoints, or to ftrace_printk().
> > > 
> > > 	Ingo
> > 
> > ftrace_printk adds more overhead if not used since it 
> > inconditionally send the trace, unless the related 
> > TRACE_ITER_PRINTK flag on ftrace is unset.
> > 
> > I don't know how markers work, but the documentation describes 
> > that a single branch check is done in case the probe is 
> > disabled.
> > 
> > With ftrace_printk, even if TRACE_ITER_PRINTK is unset, this 
> > is still one call and one branch check. So for hot callsite 
> > it's unappropriate.
> > 
> > IMHO, tracepoints are more suited to replace markers if they 
> > have to.
> 
> there's i think the KVM usecase where markers are used 
> essentially a printk()-alike flexible tracing facility.
> 
> This is how KVMTRACE looks like at the moment:
> 
> ./vmx.c:	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_0D(PEND_INTR, vcpu, handler);
> ./vmx.c:	KVMTRACE_3D(VMEXIT, vcpu, exit_reason, (u32)kvm_rip_read(vcpu),
> ./vmx.c:		KVMTRACE_0D(NMI, vcpu, handler);
> ./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
> ./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
> ./svm.c:	KVMTRACE_2D(DR_READ, vcpu, (u32)dr, (u32)val, handler);
> ./svm.c:		KVMTRACE_3D(PAGE_FAULT, &svm->vcpu, error_code,
> ./svm.c:		KVMTRACE_3D(TDP_FAULT, &svm->vcpu, error_code,
> ./svm.c:	KVMTRACE_0D(NMI, &svm->vcpu, handler);
> 
> I think this could easily be converted to a wrapper around 
> ftrace_printk() plus a "kvmtrace" ftrace plugin which activates 
> those wrapped ftrace_printk() sites via a single global 
> __read_mostly flag.
> 
> This means KVM tracing is activated via:
> 
>  echo kvmtrace > /debug/tracing/current_tracer
> 
> that's all.
> 
> Basically the conversion would look like this:
> 
> before:
> 
>         KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
>                     (u32)(data >> 32), handler);
> 
> after:
> 
> 	kvm_trace("MSR_READ: %p, %08lx, %016Lx\n", &svm->vcpu, ecx, data);
> 
> As a result all these traces would become a lot more readable 
> (and a lot more flexible) both in the source code, and in the 
> trace output stage.
> 
> And any ad-hoc tracepoint can be added, without worrying about 
> the name of the macro or the number of type of arguments. Note 
> that in this specific example we didnt need to split up the u64 
> 'data' into two 32-bit values, nor do we have to pass in the 
> 'handler' name, nor do we have to provide a MSR_READ 
> enumeration.
> 
> The tracing-disabled case would still be as fast - a single 
> branch check.
> 
> Avi, what do you think, any objections against an RFC patchset 
> that shows this off?
> 

hi,

this is the kind of case that i've designed 'dynamic debug' for. It
implicitly allows fined grained control via - source file and line
number, function name, module name, and/or fomat string. If we are going
to have a bunch of these type of interfaces sprinkled throughout the
kernel, I think this fine grained control becomes important. this is all
controlled and visible via: <debugfs>/dynamic_debug/control file. In the
above case i believe matching 'kvm' module would turn all these
statements on/off. We can always add 'tags' to tie specific debug
statements together as well.

In terms of usage, currently, if you set 'CONFIG_DYNAMIC_DEBUG' all
'pr_debug' and 'dev_dbg' statements are picked up. Thus, simply
converting the above statements to use 'pr_debug()' will make them
run-time tunable if CONFIG_DYNAMIC_DEBUG is set. That's all that needs
to be done...we don't need to write an ftracer debug statements.

We could also make subsystem specific macros as you mentioned
conditional on subsystem specific CONFIG parameters. For example, if 
'CONFIG_KVM_DEBUG' is set we tie it into this infrastructure otherwise its a
no-op.

In terms of the 'backend', currently we're using 'printk'. However, i've
posted patches in to tie into ftrace_printk(). This could easily be made
toggle switch.

That said, markers do allow for callbacks, so that different parts of
the kernel can register and get called when the marker is hit. This
behavior might be more appropriate if its not simple 'printk' style
debugging.

thanks,

-Jason


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ