[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrVOW1U8uDKZhChP_PgG40jsJE4F0Jeyfj1BT=qJSFY6yw@mail.gmail.com>
Date: Wed, 19 Nov 2014 11:03:48 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Frederic Weisbecker <fweisbec@...il.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Jones <davej@...hat.com>, Don Zickus <dzickus@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel <linux-kernel@...r.kernel.org>,
"the arch/x86 maintainers" <x86@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: frequent lockups in 3.18rc4
On Wed, Nov 19, 2014 at 11:02 AM, Frederic Weisbecker
<fweisbec@...il.com> wrote:
> On Wed, Nov 19, 2014 at 09:40:26AM -0800, Linus Torvalds wrote:
>> On Wed, Nov 19, 2014 at 9:22 AM, Linus Torvalds
>> <torvalds@...ux-foundation.org> wrote:
>> >
>> > So it hasn't actually done the "push %rbx; popfq" part - there must be
>> > a label at the return part, and context_tracking_user_exit() never
>> > actually did the local_irq_save/restore at all. Which means that it
>> > took one of the early exits instead:
>> >
>> > if (!context_tracking_is_enabled())
>> > return;
>> >
>> > if (in_interrupt())
>> > return;
>>
>> Ho humm. Interesting. Neither of those should possibly have happened.
>>
>> We "know" that "context_tracking_is_enabled()" must be true, because
>> the only way we get to context_tracking_user_exit() in the first place
>> is through "user_exit()", which does:
>>
>> if (context_tracking_is_enabled())
>> context_tracking_user_exit();
>>
>> and we know we shouldn't be in_interrupt(), because the backtrace is
>> the system call entry path, for chrissake!
>>
>> So we definitely have some corruption going on. A few possibilities:
>>
>> - either the register contents are corrupted (%rbx in your dump said
>> "0x0000000100000046", but the eflags we restored was 0x246)
>>
>> - in_interrupt() is wrong, and we've had some irq_count() corruption.
>> I'd expect that to result in "scheduling while atomic" messages,
>> though, especially if it goes on long enough that you get a watchdog
>> event..
>>
>> - there is something rotten in the land of
>> context_tracking_is_enabled(), which uses a static key.
>>
>> - I have misread the whole trace, and am a moron. But your earlier
>> report really had some very similar things, just in
>> context_tracking_user_enter() instead of exit.
>>
>> In your previous oops, the registers that was allegedly used to
>> restore %eflags was %r12:
>>
>> 28: 41 54 push %r12
>> 2a: 9d popfq
>> 2b:* 5b pop %rbx <-- trapping instruction
>> 2c: 41 5c pop %r12
>> 2e: 5d pop %rbp
>> 2f: c3 retq
>>
>> but:
>>
>> R12: ffff880101ee3ec0
>> EFLAGS: 00000282
>>
>> so again, it looks like we never actually did that "popfq"
>> instruction, and it would have exited through the (same) early exits.
>>
>> But what an odd coincidence that it ended up in both of your reports
>> being *exactly* at that instruction after the "popf". If it had
>> actually *taken* the popf, I'd not be so surprised ("ok, popf enabled
>> interrupts, and there was an interrupt pending"), but since everything
>> seems to say that it came there through some control flow that did
>> *not* go through the popf, that's just a very odd coincidence.
>>
>> And both context_tracking_user_enter() and exit() have that exact same
>> issue with the early returns. They shouldn't have happened in the
>> first place.
>
> I got a report lately involving context tracking. Not sure if it's
> the same here but the issue was that context tracking uses per cpu data
> and per cpu allocation use vmalloc and vmalloc'ed area can fault due to
> lazy paging.
Wait, what? If something like kernel_stack ends with an unmapped pmd,
we are well and truly screwed.
--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists