lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 13 Nov 2007 21:20:53 +0100
From:	Andi Kleen <ak@...e.de>
To:	"Siddha, Suresh B" <suresh.b.siddha@...el.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, hpa@...or.com,
	tglx@...utronix.de, markus.t.metzger@...el.com,
	akpm@...ux-foundation.org
Subject: Re: [patch] x86, ptrace: support for branch trace store(BTS)

On Tuesday 13 November 2007 20:59:54 Siddha, Suresh B wrote:
> Support for Intel's last branch recording to ptrace. This gives debuggers
> access to this hardware feature and allows them to show an execution trace
> of the debugged application.

Cool. Finally someone gets around to supporting this properly.

Ok mostly properly, the patch needs some improvements.

But can you please add a mode for tracing kernel code too? I guess
this means a way to forbid it to user space or allocate the register.
That would be probably needed by kernel debuggers too anyways, so
might be better to add it from the begging.

I'm sure some people will suggest now that should be all only
supported with utrace. My suggestion is to ignore them.

Also there was some discussion recently that complex kernel
interfaces like this need manpages on proposal so that proper review
of the interfaces is possible. I would suggest to write manpages
(or manpages patches) and point to them on the next submission.

> Index: linux-2.6/arch/x86/kernel/process_32.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/process_32.c	2007-11-07 09:08:32.%N +0100
> +++ linux-2.6/arch/x86/kernel/process_32.c	2007-11-07 09:09:36.%N +0100
> @@ -735,6 +735,18 @@
>  		__switch_to_xtra(prev_p, next_p, tss);
>  
>  	/*
> +	 * Last branch recording recofiguration of trace hardware and
> +	 * disentangling of trace data per task.
> +	 */
> +	if (unlikely(task_thread_info(prev_p)->flags & _TIF_BTS_TRACE ||
> +		     task_thread_info(prev_p)->flags & _TIF_BTS_TRACE_TS))
> +		ptrace_bts_task_departs(prev_p);
> +
> +	if (unlikely(task_thread_info(next_p)->flags & _TIF_BTS_TRACE ||
> +		     task_thread_info(next_p)->flags & _TIF_BTS_TRACE_TS))
> +		ptrace_bts_task_arrives(next_p);
> +

Congratulations. You managed to pick exactly the wrong place.  
Why do you think __switch_to_xtra was invented?  The bits should be in 
CTXSW_PREV/NEXT and then the checks moved there. Then it would be zero
cost for the nobody using it case.


> +#ifdef __i386__
> +static int ptrace_bts_disable_netburst(void)
> +{
> +	unsigned long long debugctl_msr = 0;
> +	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);
> +	debugctl_msr &=
> +		~((1 << 2) | /* TR */
> +		  (1 << 3));  /* BTS */
> +	wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);
> +
> +	return 0;
> +}
> +
> +static int ptrace_bts_disable_pentium_m(void)
> +{
> +	unsigned long long debugctl_msr = 0;
> +	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);

etc. Surely it would be faster to cache the value of the MSR in RAM?
Might matter for the context switch.

> +	if (size_in_records > 4000)
> +		return -EINVAL;

Nasty undocumented undefined magic numbers.


> +
> +	if (!ptrace_bts_ops.allocate_bts)
> +		return -EOPNOTSUPP;
> +	return (*ptrace_bts_ops.allocate_bts)(child, size_in_records);
> +}
> +
> +#ifdef __i386__

etc. The code duplication for i386 is indeed quite ugly. Would
be good to find some better way to do this. A lot of it seems
to be just different sizeof(), surely that could be passed around too
or gotten from the ops structure? 

Also it's unclear why you got different MSR access functions for 32bit 
and 64bit.

> +static int ptrace_bts_init_intel(void)

This should be probably in the standard CPU initialization code,
possibly using synthesized feature bits too.

Also do you have user space to use this?

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ