linux-kernel - Re: Kernel Linux 2.6.23.16 hangs when run updatedb

Message-ID: <alpine.LFD.1.00.0803110803280.3781@apollo.tec.linutronix.de>
Date:	Tue, 11 Mar 2008 09:11:32 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	"Renato S. Yamane" <renatoyamane@...dic.com.br>
cc:	linux-kernel@...r.kernel.org, alan-jenkins@...fmail.co.uk,
	devzero@....de, mingo@...e.hu
Subject: Re: Kernel Linux 2.6.23.16 hangs when run updatedb

On Mon, 10 Mar 2008, Renato S. Yamane wrote:
> > > How can I fix this? Is safe run reiserfsck?
> > 
> > 
> > I think he's wrong.
> > 
> > Looking at the call trace, the BUG happens during an interrupt.  It
> > could be a coincidence that the interrupt happened during this
> > particular system call.
> > 
> > It looks like a timer callback has been corrupted / set to an invalid
> > value.  The BUG is due to accessing the invalid address 61060fe0
> > within enqueue_hrtimer, and the EIP (instruction pointer) is also
> > equal to 61060fe0.  This would be consistent with the source code of
> > enqueue_hrtimer.  It's not an obvious reiserfs issue.

It's pretty inconsistent with the source code of enqueue_hrtimer().

The only way to get a callback invoked from enqueue_hrtimer() is via
hrtimer_enqueue_reprogram() in the HRTIMER_CB_IRQSAFE_NO_RESTART
case. Such timers cannot be requeued in interrupt context, hence the
name HRTIMER_CB_IRQSAFE_NO_RESTART :)

Also, in hrtimer_interrupt context, enqueue_hrtimer() is called with
reprogram = 0, which ensures that we do not call
hrtimer_enqueue_reprogram().
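
To make the two points above concrete, here is a rough user-space model
of that control flow. It is a sketch for illustration only, not the
kernel source: every model_* name is hypothetical, the already_expired
flag stands in for the real expiry check, and the handling of the other
callback modes is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the callback modes; not the kernel enum. */
enum model_cb_mode {
	MODEL_CB_SOFTIRQ,
	MODEL_CB_IRQSAFE_NO_RESTART,
};

struct model_timer {
	enum model_cb_mode cb_mode;
	bool already_expired;	/* assumption: replaces the real expiry check */
	void (*function)(struct model_timer *timer);
	int calls;		/* how often the callback was invoked */
	bool queued;		/* whether the timer ended up in the queue */
};

/* Example callback: only counts its invocations. */
static void model_count_call(struct model_timer *timer)
{
	timer->calls++;
}

/*
 * Models the reprogram step. Returns true when the timer was handled
 * here and must not be queued. Only the NO_RESTART mode runs the
 * callback directly; this model treats every other mode as "no callback
 * from here" and lets the caller queue the timer.
 */
static bool model_enqueue_reprogram(struct model_timer *timer)
{
	if (!timer->already_expired)
		return false;

	if (timer->cb_mode == MODEL_CB_IRQSAFE_NO_RESTART) {
		assert(timer->function != NULL);
		timer->function(timer);
		return true;
	}
	return false;
}

/* Models the enqueue step: reprogram == 0 skips the callback path. */
static void model_enqueue(struct model_timer *timer, int reprogram)
{
	if (reprogram && model_enqueue_reprogram(timer))
		return;

	timer->queued = true;
}
```

In this model, calling model_enqueue() with reprogram = 0 never reaches
the callback, whatever the mode. With reprogram = 1 the callback runs
only for an already expired MODEL_CB_IRQSAFE_NO_RESTART timer.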

> > I don't know how to find out where this corruption is happening, but
> > it's worth asking the hrtimers people.

Let's gather some more information. 

Renato, some questions:

1) Is this fully reproducible with updatedb?

2) Are you sure that this is the first stack trace you captured? There
might be some BUG before it which scrolled out of sight. Any chance
of using a serial console?

3) Can you please recompile the kernel with CONFIG_DEBUG_INFO set
and then run the following addresses from the backtrace through
addr2line with the new vmlinux:

# addr2line -e vmlinux 0xc013dad9 0xc0107c3b

Please provide the output.

4) Looking at your .config, it seems you have some more patches applied
besides the .16 stable release. Can you please upload the full patch
queue somewhere?

Thanks,

	tglx