Message-ID: <547F94B0.1000902@linaro.org>
Date:	Wed, 03 Dec 2014 17:54:40 -0500
From:	David Long <dave.long@...aro.org>
To:	William Cohen <wcohen@...hat.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Steve Capper <steve.capper@...aro.org>
CC:	"Jon Medhurst (Tixy)" <tixy@...aro.org>,
	Russell King <linux@....linux.org.uk>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Sandeepa Prabhu <sandeepa.prabhu@...aro.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will.deacon@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Anil S Keshavamurthy <anil.s.keshavamurthy@...el.com>,
	David Miller <davem@...emloft.net>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v3 0/5] ARM64: Add kernel probes(Kprobes) support

On 12/03/14 09:54, William Cohen wrote:
> On 12/01/2014 04:37 AM, Masami Hiramatsu wrote:
>> (2014/11/29 1:01), Steve Capper wrote:
>>> On 27 November 2014 at 06:07, Masami Hiramatsu
>>> <masami.hiramatsu.pt@...achi.com> wrote:
>>>> (2014/11/27 3:59), Steve Capper wrote:
>>>>> The crash is extremely easy to reproduce.
>>>>>
>>>>> I've not observed any missed events on a kprobe on an arm64 system
>>>>> that's still alive.
>>>>> My (limited!) understanding is that this suggests there could be a
>>>>> problem with how missed events from a recursive call to memcpy are
>>>>> being handled.
>>>>
>>>> I think so too. BTW, could you bisect that? :)
>>>>
>>>
>>> I can't bisect, but the following functions look suspicious to me
>>> (again I'm new to kprobes...):
>>> kprobes_save_local_irqflag
>>> kprobes_restore_local_irqflag
>>>
>>> I think these are breaking somehow when nested (i.e. from a recursive probe).
>>
>> Agreed. On x86, prev_kprobe has old_flags and saved_flags, this
>> at least must have saved_irqflag and save/restore it in
>> save/restore_previous_kprobe().
>>
>> What about adding this?
>>
>>   struct prev_kprobe {
>>   	struct kprobe *kp;
>>   	unsigned int status;
>> +	unsigned long saved_irqflag;
>>   };
>>
>> and
>>
>>   static void __kprobes save_previous_kprobe(struct kprobe_ctlblk *kcb)
>>   {
>>   	kcb->prev_kprobe.kp = kprobe_running();
>>   	kcb->prev_kprobe.status = kcb->kprobe_status;
>> +	kcb->prev_kprobe.saved_irqflag = kcb->saved_irqflag;
>>   }
>>
>>   static void __kprobes restore_previous_kprobe(struct kprobe_ctlblk *kcb)
>>   {
>>   	__this_cpu_write(current_kprobe, kcb->prev_kprobe.kp);
>>   	kcb->kprobe_status = kcb->prev_kprobe.status;
>> +	kcb->saved_irqflag = kcb->prev_kprobe.saved_irqflag;
>>   }
>>
>>
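The effect of the proposed change can be sketched in a small user-space model. This is only an illustration under stated assumptions, not the arm64 kprobes code: `struct regs`, the `PSR_I_BIT` value, the two flag helpers, and `outer_irqs_masked_after_nesting()` are hypothetical stand-ins, and only the `saved_irqflag` and `prev_kprobe` names are taken from the snippet above. It models an outer probe hit with IRQs enabled and a nested probe hit with IRQs masked, then reports whether the outer frame is left with IRQs masked.

```c
#include <assert.h>
#include <stdbool.h>

#define PSR_I_BIT 0x80UL	/* assumed stand-in for the IRQ mask bit */

struct regs { unsigned long pstate; };	/* hypothetical, heavily reduced */

struct prev_kprobe { unsigned long saved_irqflag; };

struct kprobe_ctlblk {
	unsigned long saved_irqflag;
	struct prev_kprobe prev_kprobe;
};

/* Remember the probed context's IRQ mask bit, then mask IRQs. */
static void save_local_irqflag(struct kprobe_ctlblk *kcb, struct regs *regs)
{
	kcb->saved_irqflag = regs->pstate & PSR_I_BIT;
	regs->pstate |= PSR_I_BIT;
}

/* Put back whatever IRQ mask bit is currently recorded in kcb. */
static void restore_local_irqflag(struct kprobe_ctlblk *kcb, struct regs *regs)
{
	regs->pstate = (regs->pstate & ~PSR_I_BIT) | kcb->saved_irqflag;
}

/*
 * Model one outer probe hit (IRQs enabled) with one nested hit
 * (IRQs masked). Returns true if the outer frame ends up with
 * IRQs masked, i.e. the outer state was lost.
 */
bool outer_irqs_masked_after_nesting(bool with_fix)
{
	struct kprobe_ctlblk kcb = { 0 };
	struct regs outer = { .pstate = 0 };
	struct regs nested = { .pstate = PSR_I_BIT };

	save_local_irqflag(&kcb, &outer);
	assert(outer.pstate & PSR_I_BIT);	/* masked while the probe runs */

	/* Nested hit: save_previous_kprobe() with or without the new field. */
	if (with_fix)
		kcb.prev_kprobe.saved_irqflag = kcb.saved_irqflag;
	save_local_irqflag(&kcb, &nested);	/* overwrites kcb.saved_irqflag */
	restore_local_irqflag(&kcb, &nested);
	/* restore_previous_kprobe() with or without the new field. */
	if (with_fix)
		kcb.saved_irqflag = kcb.prev_kprobe.saved_irqflag;

	restore_local_irqflag(&kcb, &outer);
	return (outer.pstate & PSR_I_BIT) != 0;
}
```

In this model, calling `outer_irqs_masked_after_nesting(false)` returns true, because the nested hit overwrites the single `saved_irqflag` slot and the outer frame is restored with IRQs masked. Calling it with `true` returns false. Whether the real crash follows this exact path is not established in this thread.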
>
> With the aarch64 kprobe patches applied to a recent kernel, I can get the machine to hang while printing an endless stream of:
>
> [187694.855843] Unexpected kernel single-step exception at EL1
> [187694.861385] Unexpected kernel single-step exception at EL1
> [187694.866926] Unexpected kernel single-step exception at EL1
> [187694.872467] Unexpected kernel single-step exception at EL1
> [187694.878009] Unexpected kernel single-step exception at EL1
> [187694.883550] Unexpected kernel single-step exception at EL1
>
> I can reproduce this pretty easily on my machine with functioncallcount.stp from https://sourceware.org/systemtap/examples/profiling/functioncallcount.stp and the following steps:
>
> # stap -p4 -k -m mm_probes -w functioncallcount.stp "*@...*.c" -c "sleep 1"
> # staprun mm_probes.ko -c "sleep 1"
>
> -Will

I did a fresh checkout and build of systemtap and tried the above.  So far 
I'm not seeing this problem.  It does remind me of the problem we saw 
before the debug exception handling in entry.S was patched in v3.18-rc1, 
but you say you are using recent kernel sources.

>>
>>
>>> That would explain why the state of play of the interrupts is in an
>>> unexpected state in the crash I reported:
>>> "The point of failure in the panic was:
>>> fs/buffer.c:1257
>>>
>>> static inline void check_irqs_on(void)
>>> {
>>> #ifdef irqs_disabled
>>>          BUG_ON(irqs_disabled());
>>> #endif
>>> }
>>> "
>>>
>>> This is all new to me so I'm still at the head-scratching stage.
>>
>> Ah, I see.
>>
>> Thank you,
>>
>>>
>>> David,
>>> Does the above make sense to you? Have you managed to reproduce the crash I get?
>>>
>>> Cheers,
>>> --
>>> Steve

I have easily produced a crash, although it doesn't look to me like the 
same one: I'm getting a NULL pointer dereference.  The PMU stuff (used 
by perf record|stat -e) should be quite independent of kprobes, though.

-dl

