Message-ID: <CA+55aFw7vJkuJ9RtVS3yhPsqDos+ii1kdJBZEeoxhb9c2=rStQ@mail.gmail.com>
Date:	Fri, 12 Dec 2014 11:14:06 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Dave Jones <davej@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Chris Mason <clm@...com>,
	Mike Galbraith <umgwanakikbuti@...il.com>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Dâniel Fraga <fragabr@...il.com>,
	Sasha Levin <sasha.levin@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: frequent lockups in 3.18rc4

On Fri, Dec 12, 2014 at 10:54 AM, Dave Jones <davej@...hat.com> wrote:
>
> Something that's still making me wonder if it's some kind of hardware
> problem is the non-deterministic nature of this bug.

I'd expect it to be a race condition, though. Which can easily cause
these kinds of issues, and the timing will be pretty random even if
the load is very regular.

And we know that the scheduler has an integer overflow under Sasha's
loads, although I didn't hear anything from Ingo and friends about it.
Ingo/Peter, you were cc'd on that report, where at least one of the
multiplications in wake_affine() ended up overflowing..
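
As a rough illustration of that class of bug (this is not the actual wake_affine() code; the helper name and the values are made up for the sketch), a product of two large signed 64-bit load-like quantities can exceed the range of the type. The GCC/Clang builtin __builtin_mul_overflow() reports when that happens:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, NOT kernel code: multiply two signed 64-bit
 * "load"-style values and report whether the product overflowed.
 * Relies on the GCC/Clang builtin __builtin_mul_overflow(), which
 * stores the (possibly wrapped) result in *out and returns true
 * when the mathematical product does not fit in the result type.
 */
static bool load_product_overflows(int64_t load, int64_t weight, int64_t *out)
{
    return __builtin_mul_overflow(load, weight, out);
}
```

With small inputs such as 1000 and 1024 the helper returns false and stores 1024000; with inputs on the order of 2^40 and 2^30 the product would be 2^70, which does not fit in 64 bits, so it returns true.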

Some scheduler thing that overflows only under heavy load, and screws
up scheduling could easily account for the RCU thread thing. I see it
*less* easily accounting for DaveJ's case, though, because the
watchdog is running at RT priority, and the scheduler would have to
screw up much more to then not schedule an RT task, but..

I'm also not sure if the bug ever happens with preemption disabled.
Sasha, was that you who reported that you cannot reproduce it without
preemption? It strikes me that there's a race condition in
__cond_resched() wrt preemption, for example: we do

        __preempt_count_add(PREEMPT_ACTIVE);
        __schedule();
        __preempt_count_sub(PREEMPT_ACTIVE);

and in between the __schedule() and __preempt_count_sub(), if an
interrupt comes in and wakes up some important process, it won't
reschedule (because preemption is active), but then we enable
preemption again and don't check whether we should reschedule (again),
and we just go on our merry ways.
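
The window described above can be modelled with a small single-threaded user-space toy. Everything here is an assumption made for illustration: the toy_* names, the struct, the value of TOY_PREEMPT_ACTIVE, and the optional recheck are not kernel code and not a proposed patch. The "interrupt" is simply injected between the schedule call and the preempt-count decrement:

```c
#include <stdbool.h>

#define TOY_PREEMPT_ACTIVE 0x1000u  /* arbitrary bit chosen for this model */

/* Toy per-CPU state; the field names are hypothetical. */
struct toy_cpu {
    unsigned int preempt_count;
    bool need_resched;
    unsigned int schedules;         /* how many times toy_schedule() ran */
};

/* Stand-in for __schedule(): clears the pending flag and counts the call. */
static void toy_schedule(struct toy_cpu *c)
{
    c->need_resched = false;
    c->schedules++;
}

/*
 * Models an interrupt that wakes a task: it marks a reschedule as needed,
 * but only preempts on the way out if preempt_count is zero.
 */
static void toy_irq_wakeup(struct toy_cpu *c)
{
    c->need_resched = true;
    if (c->preempt_count == 0)
        toy_schedule(c);
}

/*
 * The three-step sequence from the mail, with the wakeup landing in the
 * window. With recheck == false nothing looks at need_resched after the
 * count is dropped; recheck == true is a hypothetical contrast case.
 */
static void toy_cond_resched(struct toy_cpu *c, bool recheck)
{
    c->preempt_count += TOY_PREEMPT_ACTIVE;
    toy_schedule(c);
    toy_irq_wakeup(c);              /* arrives while the count is nonzero */
    c->preempt_count -= TOY_PREEMPT_ACTIVE;
    if (recheck && c->need_resched)
        toy_schedule(c);
}
```

Starting from a zeroed struct toy_cpu, toy_cond_resched(&cpu, false) leaves need_resched set with only one schedule recorded, which is the "go on our merry ways" case; with recheck set to true the pending flag is consumed by a second schedule.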

Now, I don't see how that could really matter for a long time -
returning to user space will check need_resched, and sleeping will
obviously force a reschedule anyway, so these kinds of races should at
most delay things by just a tiny amount, but maybe there is some case
where we screw up in a bigger way. So I do *not* believe that the one
in __cond_resched() matters, but I'm giving it as an example of the
kind of things that could go wrong.

                        Linus