Message-ID: <20131008075816.GA6346@gmail.com>
Date:	Tue, 8 Oct 2013 09:58:16 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Fengguang Wu <fengguang.wu@...el.com>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	xen-devel@...ts.xenproject.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [xen] double fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Mon, Oct 7, 2013 at 1:35 AM, Fengguang Wu <fengguang.wu@...el.com> wrote:
> > On Mon, Oct 07, 2013 at 01:12:17AM -0700, Linus Torvalds wrote:
> >
> > My pleasure! Here are 100 randomly selected call traces. Also attached
> > several full dmesgs and the kconfig.
> 
> Ok, they may be randomly selected, but they are all the same. Which is
> good, I guess, we're only talking about one bug.
> 
> Anyway, they all have RIP:run_timer_softirq+0x12c/0x1b8, and the code is
> 
>    0: 8b 65 c8             mov    -0x38(%rbp),%esp
>    3: 4d 39 ec             cmp    %r13,%r12
>    6: 0f 84 2f ff ff ff     je     0xffffffffffffff3b
>    c: 41 8b 4c 24 18       mov    0x18(%r12),%ecx
>   11: 4d 8b 74 24 20       mov    0x20(%r12),%r14
>   16: 4d 8b 7c 24 28       mov    0x28(%r12),%r15
>   1b: 4c 89 63 38           mov    %r12,0x38(%rbx)
>   1f: 49 8b 44 24 08       mov    0x8(%r12),%rax
>   24: 49 8b 14 24           mov    (%r12),%rdx
>   28: 83 e1 02             and    $0x2,%ecx
>   2b:* 48 89 42 08           mov    %rax,0x8(%rdx) <-- trapping instruction
>   2f: 48 89 10             mov    %rdx,(%rax)
>   32: 48 b8 00 02 20 00 00 movabs $0xdead000000200200,%rax
> 
> where that constant is LIST_POISON2 and the "and $2" seems to be 
> TIMER_IRQSAFE. So the trapping instruction *looks* like it's doing 
> __list_del() on the timer, and timer->next is NULL.
> 
> So somebody added a timer, and then deallocated/cleared the structure 
> before it triggered. The problem is, I can't see a way to figure out 
> _who_ did that.

I think CONFIG_DEBUG_OBJECTS_TIMERS=y should be able to detect that?

Debugobjects hooks into deallocation paths and complains immediately if a 
live timer is zapped that way.
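
For reference, a minimal config fragment for this kind of run might look as
follows. This is a sketch, not taken from the attached kconfig: as far as I
know, CONFIG_DEBUG_OBJECTS_FREE is the option that adds the checks in the
free paths, and CONFIG_DEBUG_OBJECTS is the prerequisite for both of the
others.

```
# Core debugobjects infrastructure
CONFIG_DEBUG_OBJECTS=y
# Check freed memory for objects that are still active
CONFIG_DEBUG_OBJECTS_FREE=y
# Track timer_list objects
CONFIG_DEBUG_OBJECTS_TIMERS=y
```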

If the corruption does not involve deallocation then it might be more 
difficult to detect, but not impossible either: for example, if an object 
is not freed but is reused incorrectly, then the next call to any timer 
function on it will cause debugobjects (and/or the timer code) to complain.
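
To make the two cases concrete (a live timer being freed, and a live timer
being reused without being freed), here is a toy userspace model of the kind
of state tracking involved. It is only a sketch: every toy_* name is
hypothetical, and it is not the kernel's debugobjects implementation. The
one property it tries to mirror is that the state is kept outside the
tracked object, keyed by address, so it survives the object being cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Toy userspace model; all toy_* names are hypothetical and this is
 * not the kernel's debugobjects code. */
enum toy_state { TOY_NONE, TOY_INIT, TOY_ACTIVE, TOY_INACTIVE };

/* State is kept outside the tracked object, keyed by its address. */
struct toy_track {
	void *addr;		/* NULL marks an unused slot */
	enum toy_state state;
};

#define TOY_MAX_TRACKED 16
static struct toy_track toy_table[TOY_MAX_TRACKED];
static int toy_warnings;	/* number of complaints so far */

static void toy_warn(const char *what, void *addr)
{
	toy_warnings++;
	fprintf(stderr, "toy debugobjects: %s: %p\n", what, addr);
}

/* Find the tracking slot for addr, optionally claiming an unused one. */
static struct toy_track *toy_lookup(void *addr, bool create)
{
	struct toy_track *unused = NULL;

	assert(addr != NULL);
	for (size_t i = 0; i < TOY_MAX_TRACKED; i++) {
		if (toy_table[i].addr == addr)
			return &toy_table[i];
		if (!toy_table[i].addr && !unused)
			unused = &toy_table[i];
	}
	if (create && unused) {
		unused->addr = addr;
		unused->state = TOY_NONE;
	}
	return create ? unused : NULL;
}

/* Returns false (and complains) if the timer is still pending. */
bool toy_timer_init(void *timer)
{
	struct toy_track *t = toy_lookup(timer, true);

	if (!t)
		return false;	/* table full: cannot track */
	if (t->state == TOY_ACTIVE) {
		toy_warn("init of active timer", timer);
		return false;
	}
	t->state = TOY_INIT;
	return true;
}

/* Arm the timer; complains if it was never initialized. */
bool toy_timer_activate(void *timer)
{
	struct toy_track *t = toy_lookup(timer, false);

	if (!t || t->state == TOY_NONE) {
		toy_warn("activating uninitialized timer", timer);
		return false;
	}
	t->state = TOY_ACTIVE;
	return true;
}

/* The timer fired or was deleted. */
void toy_timer_deactivate(void *timer)
{
	struct toy_track *t = toy_lookup(timer, false);

	if (t)
		t->state = TOY_INACTIVE;
}

/* Free-path hook: complain about each still-active timer inside
 * [start, start + len), stop tracking that range, return the count. */
int toy_check_free(void *start, size_t len)
{
	uintptr_t lo = (uintptr_t)start, hi = lo + len;
	int active = 0;

	for (size_t i = 0; i < TOY_MAX_TRACKED; i++) {
		uintptr_t a = (uintptr_t)toy_table[i].addr;

		if (!toy_table[i].addr || a < lo || a >= hi)
			continue;
		if (toy_table[i].state == TOY_ACTIVE) {
			toy_warn("freeing active timer", toy_table[i].addr);
			active++;
		}
		toy_table[i].addr = NULL;
	}
	return active;
}
```

In this model, calling toy_timer_init() on a timer that was armed with
toy_timer_activate() and never deactivated produces a complaint (the reuse
case), and so does passing the enclosing buffer to toy_check_free() (the
free case). In both cases the complaint comes at the point of misuse rather
than later, when the timer would have fired.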

So I'd suggest trying debugobjects; it should catch a fair number of 
non-exotic object corruption patterns.

Thanks,

	Ingo