lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 03 Mar 2010 16:11:51 -0500
From:	Masami Hiramatsu <mhiramat@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	mingo@...e.hu, linux-kernel@...r.kernel.org, paulus@...ba.org,
	eranian@...gle.com, robert.richter@....com, fweisbec@...il.com
Subject: Re: [RFC][PATCH 10/11] perf, x86: use LBR for PEBS IP+1 fixup

Peter Zijlstra wrote:
> On Wed, 2010-03-03 at 13:05 -0500, Masami Hiramatsu wrote:
>> Peter Zijlstra wrote:
>>> PEBS always reports the IP+1, that is the instruction after the one
>>> that got sampled, cure this by using the LBR to reliably rewind the
>>> instruction stream.
>>
>> Hmm, does PEBS always report one byte after the end address of the
>> sampled instruction? Or the instruction which will be executed next
>> step?
> 
> The next instruction, its trap like.
> 
>> [...]
>>> +#include <asm/insn.h>
>>> +
>>> +#define MAX_INSN_SIZE	16
>>
>> Hmm, we'd better integrate these kinds of definitions into
>> asm/insn.h... (several features define it)
> 
> Agreed, I'll look at doing a patch to collect them all into asm/insn.h
> if nobody beats me to it :-)

At least kprobes doesn't :)

>>> +
>>> +static void intel_pmu_pebs_fixup_ip(struct pt_regs *regs)
>>> +{
>>> +#if 0
>>> +	/*
>>> +	 * Borken, makes the machine expode at times trying to
>>> +	 * derefence funny userspace addresses.
>>> +	 *
>>> +	 * Should we always fwd decode from @to, instead of trying
>>> +	 * to rewind as implemented?
>>> +	 */
>>> +
>>> +	struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
>>> +	unsigned long from = cpuc->lbr_entries[0].from;
>>> +	unsigned long to = cpuc->lbr_entries[0].to;
>>
>> Ah, I see. For branch instruction case, we can use LBR to
>> find previous IP...
> 
> Right, we use the LBR to find the basic block.

Hm, that's a good idea :)

>>> +	unsigned long ip = regs->ip;
>>> +	u8 buf[2*MAX_INSN_SIZE];
>>> +	u8 *kaddr;
>>> +	int i;
>>> +
>>> +	if (from && to) {
>>> +		/*
>>> +		 * We sampled a branch insn, rewind using the LBR stack
>>> +		 */
>>> +		if (ip == to) {
>>> +			regs->ip = from;
>>> +			return;
>>> +		}
>>> +	}
>>> +
>>> +	if (user_mode(regs)) {
>>> +		int bytes = copy_from_user_nmi(buf,
>>> +				(void __user *)(ip - MAX_INSN_SIZE),
>>> +				2*MAX_INSN_SIZE);
>>> +
>>
>> maybe, you'd better check the source address range is within
>> the user address range. e.g. ip < MAX_INSN_SIZE. 
> 
> Not only that, I realized user_mode() checks regs->cs, which is not set
> by the PEBS code, so I added some helpers.
> 
>>> +
>>> +	/*
>>> +	 * Try to find the longest insn ending up at the given IP
>>> +	 */
>>> +	for (i = MAX_INSN_SIZE; i > 0; i--) {
>>> +		struct insn insn;
>>> +
>>> +		kernel_insn_init(&insn, kaddr + MAX_INSN_SIZE - i);
>>> +		insn_get_length(&insn);
>>> +		if (insn.length == i) {
>>> +			regs->ip -= i;
>>> +			return;
>>> +		}
>>> +	}
>>
>> Hmm, this will not work correctly on x86, since the decoder can
>> miss-decode the tail bytes of previous instruction as prefix bytes. :(
>>
>> Thus, if you want to rewind instruction stream, you need to decode
>> a function (or basic block) entirely.
> 
> Something like the below?

Great! it looks good to me.
Yeah, LBR.to may always smaller than current ip (if no one disabled LBR).

Thank you,

> 
> #ifdef CONFIG_X86_32
> static bool kernel_ip(unsigned long ip)
> {
>         return ip > TASK_SIZE;
> }
> #else
> static bool kernel_ip(unsigned long ip)
> {
>         return (long)ip < 0;
> }
> #endif
> 
> static int intel_pmu_pebs_fixup_ip(unsigned long *ipp)
> {
>         struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
>         unsigned long from = cpuc->lbr_entries[0].from;
>         unsigned long old_to, to = cpuc->lbr_entries[0].to;
>         unsigned long ip = *ipp;
>         int i;
> 
>         /*
>          * We don't need to fixup if the PEBS assist is fault like
>          */
>         if (!x86_pmu.intel_perf_capabilities.pebs_trap)
>                 return 0;
> 
>         if (!cpuc->lbr_stack.nr || !from || !to)
>                 return 0;
> 
>         if (ip < to)
>                 return 0;
> 
>         /*
>          * We sampled a branch insn, rewind using the LBR stack
>          */
>         if (ip == to) {
>                 *ipp = from;
>                 return 1;
>         }
> 
>         do {
>                 struct insn insn;
>                 u8 buf[MAX_INSN_SIZE];
>                 void *kaddr;
> 
>                 old_to = to;
>                 if (!kernel_ip(ip)) {
>                         int bytes = copy_from_user_nmi(buf, (void __user *)to,
>                                         MAX_INSN_SIZE);
> 
>                         if (bytes != MAX_INSN_SIZE)
>                                 return 0;
> 
>                         kaddr = buf;
>                 } else kaddr = (void *)to;
> 
>                 kernel_insn_init(&insn, kaddr);
>                 insn_get_length(&insn);
>                 to += insn.length;
>         } while (to < ip);
> 
>         if (to == ip) {
>                 *ipp = old_to;
>                 return 1;
>         }
> 
>         return 0;
> }
> 
> I thought about exposing the success of this fixup as a PERF_RECORD_MISC
> bit.
> 

-- 
Masami Hiramatsu
e-mail: mhiramat@...hat.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ