Message-ID: <CALCETrX_ys9dJZOkN7QBjWz9Y1iukQDaJaASPAkGtWvwNkOESw@mail.gmail.com>
Date: Sat, 6 Apr 2019 06:54:43 -0700
From: Andy Lutomirski <luto@...nel.org>
To: unlisted-recipients:; (no To-header on input)
Cc: Andy Lutomirski <luto@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: 8b275b3754 ("x86/irq/64: Remap the IRQ stack with guard pages"):
BUG: unable to handle kernel paging request at ffffb659000a1000
On Fri, Apr 5, 2019 at 11:38 PM kernel test robot <lkp@...el.com> wrote:
>
> Greetings,
>
> 0day kernel testing robot got the below dmesg and the first bad commit is
>
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/stackguards
>
> commit 8b275b3754465d502d393f8ae8dd355b7067e73f
> Author: Andy Lutomirski <luto@...nel.org>
> AuthorDate: Fri Jul 13 19:01:23 2018 -0700
> Commit: Thomas Gleixner <tglx@...utronix.de>
> CommitDate: Fri Apr 5 17:04:10 2019 +0200
>
> x86/irq/64: Remap the IRQ stack with guard pages
>
> The IRQ stack lives in percpu space, so an IRQ handler that overflows it
> will overwrite other data structures.
>
> Use vmap() to remap the IRQ stack so that it will have the usual guard
> pages that vmap/vmalloc allocations have. With this the kernel will panic
> immediately on an IRQ stack overflow.
>
> [ tglx: Move the map code to a proper place and invoke it only when a CPU
> is about to be brought online. No point in installing the map at
> early boot for all possible CPUs. Fail the CPU bringup if the vmap
> fails as done for all other preparatory stages in cpu hotplug. ]
>
> Signed-off-by: Andy Lutomirski <luto@...nel.org>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
I haven't spotted the actual bug yet, but the faulting instruction is:
    2a: 65 8b 35 09 ca 75 63    mov %gs:*0x6375ca09(%rip),%esi    # 0x6375ca3a <-- trapping instruction
This seems to be faulting just above the top of the stack (the thing
in RSP), so I suspect that there is some path that is shoving the
remapped value into GSBASE, which is wrong.
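For anyone reading along, here is a rough sketch of how those seven bytes decode. It is not part of the original report, and the helper name decode_gs_rip_relative_mov is made up for illustration. It only handles this one encoding (GS prefix, opcode 8b, RIP-relative ModRM). Note that the 0x6375ca3a in the disassembly comment is computed from the offset within the decoded Code: dump (0x2a), not from the real RIP, so it should not be read as the real per-CPU offset. At run time the CPU adds GSBASE to the RIP-relative result, which is why a bad GSBASE would move the faulting address.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def decode_gs_rip_relative_mov(code: bytes, offset: int):
    """Decode `65 8b /r` with a RIP-relative operand (hypothetical helper).

    `offset` is where the instruction sits in the decoded dump, not the real RIP.
    Returns (reg field, signed disp32, target relative to the dump base).
    """
    if len(code) != 7:
        raise ValueError("expected a 7-byte instruction")
    prefix, opcode, modrm = code[0], code[1], code[2]
    if prefix != 0x65:  # GS segment-override prefix
        raise ValueError("not a %gs-prefixed instruction")
    if opcode != 0x8B:  # mov r32, r/m32
        raise ValueError("not the mov opcode handled here")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if (mod, rm) != (0, 5):  # mod=00, rm=101 means disp32(%rip) in 64-bit mode
        raise ValueError("operand is not RIP-relative")
    (disp,) = struct.unpack_from("<i", code, 3)  # little-endian signed 32-bit
    # RIP-relative addressing is relative to the *next* instruction.
    target = (offset + len(code) + disp) & MASK64
    return reg, disp, target


insn = bytes.fromhex("658b3509ca7563")
reg, disp, target = decode_gs_rip_relative_mov(insn, 0x2A)
print(f"reg={reg} disp={disp:#x} target={target:#x}")
# The address actually accessed is (GSBASE + real_next_rip + disp) mod 2**64.
```

The reg field comes out as 6, which is %esi in this encoding, and the target matches the value in the disassembly comment.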
Also, FWIW, there was some reason that I initialized all the virtual
mappings for all possible CPUs early. I don't remember what it was,
and it may not have been a good reason, but I put at least some
nonzero amount of thought into it :)
--Andy