linux-kernel - Re: linux-next: Tree for December 11

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 7 Jan 2009 18:17:30 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Alexey Zaytsev <alexey.zaytsev@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: linux-next: Tree for December 11


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Wed, 7 Jan 2009, Alexey Zaytsev wrote:
> >
> > Almost a month later, the warning is still there, and now also in 
> > Linus' git. Am I the only one who sees it?
> 
> Possibly. But that may be because most people don't have DEBUG_PREEMPT.

hm, i never saw this warning and i run tons of different kernels (on 
different hw with different build environments) so if this was a more 
generic BKL init problem i'd expect to have seen it one way or another.

But i think the bug that Alexey is seeing is genuine:

> > [    0.004150] WARNING: at kernel/sched.c:4435 sub_preempt_count+0xae/0xc0()
> > [    0.004247] Hardware name: HP Compaq nx7300 (GB848ES#ACB)
> > [    0.004342] Modules linked in:
> > [    0.004477] Pid: 0, comm: swapper Not tainted 2.6.28-06859-gede6f5a #179
> > [    0.004575] Call Trace:
> > [    0.004672]  [<c012fbe6>] warn_slowpath+0x86/0xa0
> > [    0.004770]  [<c014e99b>] ? tick_check_oneshot_change+0x4b/0x100
> > [    0.004868]  [<c046dc00>] ? _spin_unlock_irq+0x10/0x30
> > [    0.004963]  [<c047045e>] sub_preempt_count+0xae/0xc0
> > [    0.005060]  [<c0134787>] _local_bh_enable+0x27/0xa0
> 
> Hmm. _local_bh_enable() would make the preempt_count go negative or fall 
> below 1 (with kernel lock held).
> 
> > [    0.005155]  [<c0134ac7>] __do_softirq+0xf7/0x150
> > [    0.005250]  [<c01349d0>] ? __do_softirq+0x0/0x150
> > [    0.005345]  <IRQ>  [<c014ec2e>] ? tick_nohz_update_jiffies+0xe/0x50
> > [    0.005488]  [<c013494f>] ? irq_exit+0x7f/0x90
> > [    0.005584]  [<c0104f23>] ? do_IRQ+0xa3/0x120
> > [    0.005678]  [<c01038a7>] ? common_interrupt+0x27/0x2c
> > [    0.005773]  [<c013007b>] ? try_acquire_console_sem+0x1b/0x30
> > [    0.005872]  [<c05f7378>] ? check_bugs+0xb8/0xe0
> > [    0.005967]  [<c05ef98a>] ? start_kernel+0x25a/0x2f0
> 
> .. and it happens early on, when we take an interrupt in check_bugs.
> 
> Are we ready to enable interrupts there? Maybe the page fault we took 
> (on purpose) enabled interrupts and we now take the irq much too early.
> 
> Or maybe the initial kernel lock didn't set preempt_count to 1.
> 
> Ingo, any ideas?

i dont think it's a BKL problem - i think something in linux-next which 
went upstream recently leaked a preempt count.

What is weird is that we have detection for softirq preemption leaks:

                        if (unlikely(prev_count != preempt_count())) {
                                printk(KERN_ERR "huh, entered softirq %td %p"
                                       "with preempt_count %08x,"
                                       " exited with %08x?\n", h - softirq_vec,
                                       h->action, prev_count, preempt_count());
                                preempt_count() = prev_count;

And this did not appear to have triggered for Alexey. So this probably 
cannot be a softirq handler leak, nor an initcall handler leak.

The only possibility seems to be for this to be a preempt leak in hardirq 
handler - i.e. locking inbalance by some driver. It did not get detected 
there (because the preempt count was kept elevated by the BKL and by the 
softirq handler), but when the softirq got released. The stack has some 
hardirq signs, but no specific blame for which driver might be 
responsible.

This would be a candidate for bisection i suspect, barring any better, 
more specific ideas (like a specific timeframe for when this hit upstream, 
or a better stacktrace of the hardirq context that triggered this softirq 
instance).

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/