Message-ID: <5551F693.8050100@redhat.com>
Date:	Tue, 12 May 2015 08:48:19 -0400
From:	William Cohen <wcohen@...hat.com>
To:	David Long <dave.long@...aro.org>,
	Will Deacon <will.deacon@....com>
CC:	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	Russell King <linux@....linux.org.uk>,
	"sandeepa.s.prabhu@...il.com" <sandeepa.s.prabhu@...il.com>,
	Steve Capper <steve.capper@...aro.org>,
	Catalin Marinas <Catalin.Marinas@....com>,
	"Jon Medhurst (Tixy)" <tixy@...aro.org>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Anil S Keshavamurthy <anil.s.keshavamurthy@...el.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v6 0/6] arm64: Add kernel probes (kprobes) support

On 05/12/2015 01:54 AM, David Long wrote:
> On 05/05/15 11:48, Will Deacon wrote:
>> On Tue, May 05, 2015 at 06:14:51AM +0100, David Long wrote:
>>> On 05/01/15 21:44, William Cohen wrote:
>>>> Dave Long and I did some additional experimentation to better
>>>> understand what condition causes the kernel to sometimes spew:
>>>>
>>>> Unexpected kernel single-step exception at EL1
>>>>
>>>> The functioncallcount.stp test instruments the entry and return of
>>>> every function in the mm files, including kfree.  In most cases the
>>>> arm64 trampoline_probe_handler just determines which return probe
>>>> instance matches the current conditions, runs the associated handler,
>>>> and recycles the return probe instance for another use by placing it
>>>> on a hlist.  However, it is possible that a return probe instance has
>>>> been set up on function entry and the return probe is unregistered
>>>> before the return probe instance fires.  In this case kfree is called
>>>> by the trampoline handler to remove the return probe instances related
>>>> to the unregistered kretprobe.  This case, where the kprobed kfree
>>>> is called within the arm64 trampoline_probe_handler function, triggers
>>>> the problem.
>>>>
>>>> The kprobe breakpoint for the kfree call from within the
>>>> trampoline_probe_handler is encountered and started, but things go
>>>> wrong when attempting the single step on the instruction.
>>>>
>>>> It took a while to trigger this problem with the systemtap testsuite.
>>>> Dave Long came up with steps that reproduce this more quickly with a
>>>> probed function that is always called within the trampoline handler.
>>>> Trying the same on x86_64 doesn't trigger the problem.  It appears
>>>> that the x86_64 code can handle a single step from within the
>>>> trampoline_handler.
>>>>
>>>
>>> I'm assuming there are no plans for supporting software breakpoint debug
>>> exceptions during processing of single-step exceptions, any time soon on
>>> arm64.  Given that, the only solution that I can come up with for this is,
>>> instead of making this orphaned kretprobe instance list exist only
>>> temporarily (in the scope of the kretprobe trampoline handler), make it
>>> always exist and kfree any items found on it as part of a periodic
>>> cleanup running outside of the handler context.  I think these changes
>>> would still all be in architecture-specific code.  This doesn't feel to
>>> me like a bad solution.  Does anyone think there is a simpler way out of
>>> this?
>>
>> Just to clarify, is the problem here the software breakpoint exception,
>> or trying to step the faulting instruction whilst we were already handling
>> a step?
>>
> 
> Sorry for the delay, I got tripped up with some global optimizations that happened when I made more testing changes.  When the kprobes software breakpoint handler for kretprobes is reentered it sets up the single-step and that ends up hitting inside entry.S, apparently in el1_undef.
> 
>> I think I'd be inclined to keep the code run in debug context to a minimum.
>> We already can't block there, and the more code we add the more black spots
>> we end up with in the kernel itself. The alternative would be to make your
>> kprobes code re-entrant, but that sounds like a nightmare.
>>
>> You say this works on x86. How do they handle it? Is the nested probe
>> on kfree ignored or handled?
>>
> 
> Will Cohen's email pointing out that x86 does not use a breakpoint for the trampoline handler explains a lot.  I'm experimenting, starting with his proposed new trampoline code.  I can't see a reason this can't be made to work, so, given everything, it doesn't seem interesting to try to understand the failure in reentering the kprobe break handler in any more detail.
> 
> -dave long
> 
> 

Hi Dave,

In some of the previous diagnostic output it looked like things would go wrong in entry.S when the D bit was cleared and the debug interrupts were unmasked.  I wonder if some of the issue might be due to starting the kprobe for the trampoline but leaving things in an odd state when another set of kprobes/kretprobes is hit while the trampoline is running.  As Dave mentioned, the proposed trampoline patch avoids using a kprobe in the trampoline and directly calls the trampoline handler.  Attached is the current version of the patch, which was able to run the systemtap testsuite.  Systemtap does exercise the kprobe/kretprobe infrastructure, but it would be good to have additional raw kprobe tests to check that kprobe reentry works as expected.

-Will Cohen

View attachment "avoid_bkpt_tramp_v2.diff" of type "text/x-patch" (4739 bytes)
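
For readers following the thread, the deferred-cleanup idea quoted above (keep the orphaned return probe instance list alive and free its entries outside the handler) can be pictured with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code and not the attached patch, and every name in it (ri_model, model_trampoline_handler, model_periodic_cleanup, model_free, model_demo) is hypothetical. free() stands in for kfree, and the in_handler flag stands in for "running inside the trampoline handler".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a return probe instance; not the kernel struct. */
struct ri_model {
    int orphaned;               /* 1 if the owning kretprobe was unregistered */
    struct ri_model *next;
};

static struct ri_model *free_pool;    /* instances recycled for another use */
static struct ri_model *orphan_list;  /* persistent list, drained later */
static int in_handler;                /* models running in handler context */
static int frees_in_handler;          /* counts frees made inside the handler */

static void push(struct ri_model **head, struct ri_model *ri)
{
    ri->next = *head;
    *head = ri;
}

/* Stand-in for kfree(): records whether it ran in handler context. */
static void model_free(struct ri_model *ri)
{
    if (in_handler)
        frees_in_handler++;
    free(ri);
}

/* Models the trampoline handler: recycle or park the instance, never free. */
static void model_trampoline_handler(struct ri_model *ri)
{
    in_handler = 1;
    push(ri->orphaned ? &orphan_list : &free_pool, ri);
    in_handler = 0;
}

/* Models the periodic cleanup run outside the handler; returns the count freed. */
static int model_periodic_cleanup(void)
{
    int freed = 0;

    while (orphan_list) {
        struct ri_model *ri = orphan_list;

        orphan_list = ri->next;
        model_free(ri);
        freed++;
    }
    return freed;
}

/* Demo: one live instance and two orphaned ones pass through the handler. */
static int model_demo(void)
{
    const int flags[] = { 0, 1, 1 };
    int freed;

    for (size_t i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
        struct ri_model *ri = malloc(sizeof(*ri));

        if (!ri)
            return -1;
        ri->orphaned = flags[i];
        ri->next = NULL;
        model_trampoline_handler(ri);
    }

    freed = model_periodic_cleanup();

    /* Release the recycled pool as well so the demo does not leak. */
    while (free_pool) {
        struct ri_model *ri = free_pool;

        free_pool = ri->next;
        free(ri);
    }
    return freed;
}
```

Calling model_demo() pushes three instances through the modelled handler and then drains the orphan list; frees_in_handler stays at zero because the only free calls happen in the cleanup step. In a real kernel the cleanup would need locking and a safe execution context, which this model deliberately leaves out.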
