Message-ID: <759cd319-990f-af23-2f1c-aba55d0768b8@bytedance.com>
Date: Fri, 19 Nov 2021 18:02:50 +0800
From: Qi Zheng <zhengqi.arch@...edance.com>
To: Peter Zijlstra <peterz@...radead.org>,
Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Holger Hoffstätte <holger@...lied-asynchrony.com>,
Kees Cook <keescook@...omium.org>,
Thomas Gleixner <tglx@...utronix.de>,
Justin Forbes <jmforbes@...uxtx.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Guenter Roeck <linux@...ck-us.net>,
Shuah Khan <shuah@...nel.org>, patches@...nelci.org,
lkft-triage@...ts.linaro.org, Pavel Machek <pavel@...x.de>,
Jon Hunter <jonathanh@...dia.com>,
Florian Fainelli <f.fainelli@...il.com>,
stable <stable@...r.kernel.org>
Subject: Re: [PATCH] x86: Pin task-stack in __get_wchan()
On 11/19/21 5:29 PM, Peter Zijlstra wrote:
> On Thu, Nov 18, 2021 at 06:04:27PM -0800, Josh Poimboeuf wrote:
>> On Thu, Nov 18, 2021 at 01:11:09PM +0100, Peter Zijlstra wrote:
>
>>> I now have the below, the only thing missing is that there's a
>>> user_mode() call on a stack based regs. Now on x86_64 we can
>>> __get_kernel_nofault() regs->cs and call it a day, but on i386 we have
>>> to also fetch regs->flags.
>>>
>>> Is this really the way to go?
>>
>> Please no. Can we just add a check in unwind_start() to ensure the
>> caller did try_get_task_stack()?
>
> I tried; but at best it's fundamentally racy and in practice it's worse
> because init_task doesn't seem to believe in refcounts and kthreads are
> odd for some raisin. Now those are fixable, but given the fundamental
> races, I don't see how it's ever going to be reliable.
>
> I don't mind the __get_kernel_nofault() usage and think I can do a
> better implementation that will allow us to get rid of the
> pagefault_{dis,en}able() sprinkling, but that's for another day. It's
> just the user_mode(regs) usage that's going to be somewhat ugleh.
>
> Anyway, below is the minimal fix for the situation at hand. I'm not
> going to be around much today, so if Linus wants to pick that up instead
> of mass revert things that's obviously fine too.
>
> ---
> Subject: x86: Pin task-stack in __get_wchan()
>
> When commit 5d1ceb3969b6 ("x86: Fix __get_wchan() for !STACKTRACE")
> moved from stacktrace to native unwind_*() usage, the
> try_get_task_stack() got lost, leading to use-after-free issues for
> dying tasks.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> ---
> arch/x86/kernel/process.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index e9ee8b526319..04143a653a8a 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -964,6 +964,9 @@ unsigned long __get_wchan(struct task_struct *p)
>  	struct unwind_state state;
>  	unsigned long addr = 0;
>  
> +	if (!try_get_task_stack(p))
> +		return 0;
> +
>  	for (unwind_start(&state, p, NULL, NULL); !unwind_done(&state);
>  	     unwind_next_frame(&state)) {
>  		addr = unwind_get_return_address(&state);
> @@ -974,6 +977,8 @@ unsigned long __get_wchan(struct task_struct *p)
>  		break;
>  	}
>  
> +	put_task_stack(p);
> +
>  	return addr;
>  }
>
>
This implementation is very similar to stack_trace_save_tsk(). Maybe we
can just move stack_trace_save_tsk() out of CONFIG_STACKTRACE and reuse
it?
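
For illustration, the pin/walk/unpin shape of the patch above can be
sketched in user space roughly as below. This is not kernel code: every
fake_* name is made up for the example, the "stack" is just an array of
pretend return addresses, and the plain atomic counter only stands in
for the real task-stack refcount.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task with a reference-counted stack. */
struct fake_task {
	atomic_int stack_refcount;	/* 0 means the stack is already gone */
	const unsigned long *frames;	/* pretend return addresses */
	size_t nr_frames;
};

/* Take a reference only while the count is still non-zero. */
static bool fake_try_get_stack(struct fake_task *t)
{
	int old = atomic_load(&t->stack_refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&t->stack_refcount,
						 &old, old + 1))
			return true;
	}
	return false;
}

/* Drop the reference taken by fake_try_get_stack(). */
static void fake_put_stack(struct fake_task *t)
{
	atomic_fetch_sub(&t->stack_refcount, 1);
}

/* Assumption: low addresses play the role of scheduler functions. */
static bool fake_in_sched(unsigned long addr)
{
	return addr < 0x1000;
}

/* Same control flow as the patched __get_wchan(), on fake data. */
static unsigned long fake_get_wchan(struct fake_task *t)
{
	unsigned long addr = 0;

	/* Pin the stack first; bail out if the task is already dying. */
	if (!fake_try_get_stack(t))
		return 0;

	for (size_t i = 0; i < t->nr_frames; i++) {
		addr = t->frames[i];
		if (!addr)
			break;
		if (fake_in_sched(addr))
			continue;
		break;
	}

	/* Unpin only after the last access to the stack. */
	fake_put_stack(t);

	return addr;
}
```

The point of the sketch is only the ordering: the walk happens strictly
between a successful try-get and the matching put, and a task whose
count has already dropped to zero is never walked at all.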
--
Thanks,
Qi