[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210304201906.GM3014244@iweiny-DESK2.sc.intel.com>
Date: Thu, 4 Mar 2021 12:19:06 -0800
From: Ira Weiny <ira.weiny@...el.com>
To: Andy Lutomirski <luto@...nel.org>
Cc: x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
Mark Rutland <mark.rutland@....com>,
Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH v3 04/11] x86/kthread,dumpstack: Set
task_pt_regs->cs.RPL=3 for kernel threads
On Thu, Mar 04, 2021 at 11:05:57AM -0800, Andy Lutomirski wrote:
> For kernel threads, task_pt_regs is currently all zeros, a valid user state
> (if kernel_execve() has been called), or some combination thereof during
> execution of kernel_execve(). If a stack trace is printed, the unwinder
> might get confused and treat task_pt_regs as a kernel state. Indeed,
> forcing a stack dump results in an attempt to display _kernel_ code bytes
> from a bogus address at the very top of kernel memory.
>
> Fix the confusion by setting cs=3 so that user_mode(task_pt_regs(...)) ==
> true for kernel threads.
>
> Also improve the error when failing to fetch code bytes to show which type
> of fetch failed. This makes it much easier to understand what's happening.
>
> Cc: Josh Poimboeuf <jpoimboe@...hat.com>
> Signed-off-by: Andy Lutomirski <luto@...nel.org>
> ---
> arch/x86/kernel/dumpstack.c | 4 ++--
> arch/x86/kernel/process.c | 13 +++++++++++++
> 2 files changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
> index 55cf3c8325c6..9b7b69bb12e5 100644
> --- a/arch/x86/kernel/dumpstack.c
> +++ b/arch/x86/kernel/dumpstack.c
> @@ -128,8 +128,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
> /* No access to the user space stack of other tasks. Ignore. */
> break;
> default:
> - printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
> - loglvl, prologue);
> + printk("%sCode: Unable to access %s opcode bytes at RIP 0x%lx.\n",
> + loglvl, user_mode(regs) ? "user" : "kernel", prologue);
> break;
> }
> }
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 145a7ac0c19a..f6f16df04cb9 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -163,6 +163,19 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
> /* Kernel thread ? */
> if (unlikely(p->flags & PF_KTHREAD)) {
> memset(childregs, 0, sizeof(struct pt_regs));
> +
> + /*
> + * Even though there is no real user state here, these
> + * are were user regs belong, and kernel_execve() will
^^^^^
where?
Ira
> + * overwrite them with user regs. Put an obviously
> + * invalid value that nonetheless causes user_mode(regs)
> + * to return true in CS.
> + *
> + * This also prevents the unwinder from thinking that there
> + * is invalid kernel state at the top of the stack.
> + */
> + childregs->cs = 3;
> +
> kthread_frame_init(frame, sp, arg);
> return 0;
> }
> --
> 2.29.2
>
Powered by blists - more mailing lists