Message-ID: <CA+55aFx4zjAVHN5DrSCh1_MJjqf4fxoAVT3RmC+1QGP6bq7b0Q@mail.gmail.com>
Date:	Fri, 19 Dec 2014 15:55:23 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Thomas Gleixner <tglx@...utronix.de>
Cc:	Dave Jones <davej@...hat.com>, Chris Mason <clm@...com>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Dâniel Fraga <fragabr@...il.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Suresh Siddha <sbsiddha@...il.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Peter Anvin <hpa@...ux.intel.com>
Subject: Re: frequent lockups in 3.18rc4

On Fri, Dec 19, 2014 at 3:14 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>
> Now that all looks correct. So there is something else going on. After
> staring some more at it, I think we are looking at it from the wrong
> angle.
>
> The watchdog always detects CPU1 as stuck and we got completely
> fixated on the csd_wait() in the stack trace on CPU1. Now we have
> stack traces which show a different picture, i.e. CPU1 makes progress
> after a gazillion of seconds.

.. but that doesn't explain why CPU0 ends up always being at that
*exact* same instruction in the NMI backtrace.

A fairly tight loop, on the other hand, together with "MMIO read is
very expensive and synchronizing", would explain it. An MMIO read can
easily be as expensive as several thousand instructions.
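
As a back-of-the-envelope check (purely illustrative, not something
from this thread), the cost gap is easy to see from user space: the
sketch below mmaps the HPET main counter through /dev/mem, assuming
the conventional x86 base of 0xFED00000, and compares the per-read
cost of the uncached MMIO load against an ordinary cached load. It
needs root and a kernel that still allows /dev/mem access to that
range; the names HPET_BASE, HPET_COUNTER and ITERS are made up for
the example.

/*
 * Illustrative sketch: time an uncached HPET MMIO read vs a cached
 * memory read.  Assumes the usual x86 HPET base and /dev/mem access.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <x86intrin.h>            /* __rdtsc() */

#define HPET_BASE     0xFED00000UL
#define HPET_COUNTER  0xF0        /* main counter offset per the HPET spec */
#define ITERS         100000

int main(void)
{
    int fd = open("/dev/mem", O_RDONLY);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    volatile uint32_t *hpet = mmap(NULL, 4096, PROT_READ, MAP_SHARED,
                                   fd, HPET_BASE);
    if (hpet == MAP_FAILED) { perror("mmap"); return 1; }

    volatile uint32_t dummy = 0;                  /* ordinary cached memory */
    uint64_t t0, t1, t2;
    uint32_t sink = 0;

    t0 = __rdtsc();
    for (int i = 0; i < ITERS; i++)
        sink += hpet[HPET_COUNTER / 4];           /* uncached MMIO read */
    t1 = __rdtsc();
    for (int i = 0; i < ITERS; i++)
        sink += dummy;                            /* cached read, for contrast */
    t2 = __rdtsc();

    printf("MMIO read: ~%llu cycles, cached read: ~%llu cycles (sink %u)\n",
           (unsigned long long)((t1 - t0) / ITERS),
           (unsigned long long)((t2 - t1) / ITERS), sink);
    return 0;
}

On typical hardware the HPET read comes out at several hundred to a
few thousand cycles per access, while the cached load is in the
single digits, which is why an NMI sampling a loop built around such
a read keeps landing on the same instruction.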

> I think we really need to look at CPU1 itself.

Not so fast. Take another look at CPU0.

[24998.083577]  [<ffffffff810e0d3e>] ktime_get+0x3e/0xa0
[24998.084450]  [<ffffffff810e9cd3>] tick_sched_timer+0x23/0x160
[24998.085315]  [<ffffffff810daf96>] __run_hrtimer+0x76/0x1f0
[24998.086173]  [<ffffffff810e9cb0>] ? tick_init_highres+0x20/0x20
[24998.087025]  [<ffffffff810db2e7>] hrtimer_interrupt+0x107/0x260
[24998.087877]  [<ffffffff81031a4b>] local_apic_timer_interrupt+0x3b/0x70
[24998.088732]  [<ffffffff8179bca5>] smp_apic_timer_interrupt+0x45/0x60
[24998.089583]  [<ffffffff8179a0df>] apic_timer_interrupt+0x6f/0x80
[24998.090435]  <EOI>
[24998.091279]  [<ffffffff810da66e>] ? __remove_hrtimer+0x4e/0xa0
[24998.092118]  [<ffffffff812c7c7a>] ? ipcget+0x8a/0x1e0
[24998.092951]  [<ffffffff812c7c6c>] ? ipcget+0x7c/0x1e0
[24998.093779]  [<ffffffff812c8d6d>] SyS_msgget+0x4d/0x70


Really. None of that changed. NONE. The likelihood that we hit the
exact same instruction every time? Over a timeframe of more than a
minute?

The only way I see that happening is (a) NMI is completely buggered,
and the backtrace is just random crap that is always the same.  Or (b)
it's really a fairly tight loop.

The fact that you had a hrtimer interrupt happen in the *middle* of
__remove_hrtimer() is really another fairly strong hint. That smells
like "__remove_hrtimer() has a race with hrtimer interrupts".

And that race results in a basically endless loop (which perhaps ends
when the hrtimer overflows, in what, a few minutes?)
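
To make that suspicion concrete, here is a deliberately schematic
sketch (illustrative names only: fake_timer_base, next_expiry and the
two functions are invented, not the real hrtimer code). The shape of
the race would be: the interrupt lands in the window where the
remover has already taken the timer off the queue but has not yet
updated the expiry state the handler reprograms the event device
from, so the handler keeps re-arming for a time that is already in
the past and immediately takes the next interrupt.

#include <stdio.h>

struct fake_timer_base {
    unsigned long long next_expiry;   /* what the interrupt reprograms from */
};

/* Stand-in for programming the clockevent hardware. */
static void reprogram_event_device(unsigned long long expiry)
{
    (void)expiry;
}

/*
 * Interrupt side: nothing is left to expire (the timer was removed),
 * but the handler reprograms from the stale next_expiry, which is
 * already in the past, so the hardware raises the interrupt again
 * straight away.
 */
static int timer_interrupt(struct fake_timer_base *base,
                           unsigned long long now)
{
    if (base->next_expiry <= now) {
        reprogram_event_device(base->next_expiry);   /* stale value */
        return 1;                                    /* fires again */
    }
    return 0;
}

int main(void)
{
    struct fake_timer_base base = { .next_expiry = 100 };
    unsigned long long now = 1000;
    unsigned long storms = 0;

    /*
     * Remover side (simulated): the timer is already off the queue,
     * but the interrupt lands *before* next_expiry is updated -- the
     * window the CPU0 trace (apic_timer_interrupt in the middle of
     * __remove_hrtimer) points at.  Capped at 5 here; in the real
     * scenario it keeps going until the remover's store finally lands.
     */
    while (timer_interrupt(&base, now) && storms < 5)
        storms++;                      /* back-to-back interrupts */

    base.next_expiry = now + 1000;     /* remover finally updates it */
    printf("interrupt re-fired %lu times before the update\n", storms);
    return 0;
}

Whatever the real mechanism turns out to be, that is the shape of
loop that would keep CPU0 parked on the same instruction for over a
minute.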

I really don't think you should look at CPU1. Not when CPU0 has such
an interesting pattern that you dismissed just because the HPET is
making progress.

                            Linus
