lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yt+4qSeIM0WbjcJj@shell.armlinux.org.uk>
Date:   Tue, 26 Jul 2022 10:49:29 +0100
From:   "Russell King (Oracle)" <linux@...linux.org.uk>
To:     Li Huafei <lihuafei1@...wei.com>
Cc:     Linus Walleij <linus.walleij@...aro.org>, ardb@...nel.org,
        will@...nel.org, mark.rutland@....com, broonie@...nel.org,
        peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
        alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
        namhyung@...nel.org, arnd@...db.de, rostedt@...dmis.org,
        nick.hawkins@....com, john@...ozen.org, mhiramat@...nel.org,
        ast@...nel.org, linyujun809@...wei.com, ndesaulniers@...gle.com,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-perf-users@...r.kernel.org
Subject: Re: [PATCH 3/5] ARM: stacktrace: Allow stack trace saving for
 non-current tasks

On Tue, Jul 26, 2022 at 05:12:39PM +0800, Li Huafei wrote:
> 
> 
> On 2022/7/18 17:07, Linus Walleij wrote:
> > On Tue, Jul 12, 2022 at 4:18 AM Li Huafei <lihuafei1@...wei.com> wrote:
> > 
> >> The current ARM implementation of save_stack_trace_tsk() does not allow
> >> saving stack trace for non-current tasks, which may limit the scenarios
> >> in which stack_trace_save_tsk() can be used. Like other architectures,
> >> or like ARM's unwind_backtrace(), we can leave it up to the caller to
> >> ensure that the task that needs to be unwound is not running.
> >>
> >> Signed-off-by: Li Huafei <lihuafei1@...wei.com>
> > 
> > That sounds good, but:
> > 
> >>         if (tsk != current) {
> >> -#ifdef CONFIG_SMP
> >> -               /*
> >> -                * What guarantees do we have here that 'tsk' is not
> >> -                * running on another CPU?  For now, ignore it as we
> >> -                * can't guarantee we won't explode.
> >> -                */
> >> -               return;
> >> -#else
> >> +               /* task blocked in __switch_to */
> > 
> > The commit text is not consistent with the comment you are removing.
> > 
> > The commit is talking about "non-current" tasks which is one thing,
> > but the code is avoiding any tasks under SMP because they may be
> > running on another CPU. So you need to update the commit
> > message to say something like "non-current or running on another CPU".
> > 
> > If this condition will be checked at call sites in following patches,
> > then mention
> > that in the commit as well, so we know the end result is that we do
> > not break it,
> 
> The generic code stack_trace_save_tsk() does not have this check, and by
> 'caller' I mean the caller of stack_trace_save_tsk(), expecting the
> 'caller' to ensure that the task is not running. So in effect this check
> has been dropped and there is no more guarantee. Sorry for not
> clarifying the change here.

Can you prove in every case that the thread we're being asked to unwind
is not running? I don't think you can.

There are things like proc_pid_stack() in procfs and the stack traces
in sysrq-t which have attempted to unwind everything whether it's
running or not.

So no, there is no guarantee that the thread is blocked in
__switch_to().

> But can we assume that the user should know that the stacktrace is
> unreliable for a task that is running on another CPU? If not, I should
> remove this patch and keep the check.

It's not about "unreliable" stack traces, it's about the unwinder
killing the kernel.

The hint is this:

                frame.fp = thread_saved_fp(tsk);
                frame.sp = thread_saved_sp(tsk);
                frame.lr = 0;           /* recovered from the stack */
                frame.pc = thread_saved_pc(tsk);

These access the context saved by the scheduler when the task is
sleeping. When the thread is running, these saved values will be
the state when the thread last slept. However, with the thread
running, the stack could now contain any data what so ever, and
could change at any moment.

Whether the unwind-table unwinder is truely safe in such a
situation is unknown - we try to ensure that it won't do anything
stupid, but proving that is a hard task, and we've recently had
issues with the unwinder even without that.

So, allowing this feels like we're opening the door to DoS attacks
from userspace, where userspace sits there reading /proc/*/stack of
some thread running on a different CPU waiting for the kernel to
oops itself, possibly holding a lock, resulting in the system
dying.

These decisions need to be made by architecture code not generic
code, particularly where the method of unwinding is architecture
specific and thus may have criteria defining when its safe to do so.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ