linux-kernel - Re: [PATCH 4/6] scs: Move scs_overflow

Message-ID: <20200518133231.GC2787@C02TD0UTHF1T.local>
Date:   Mon, 18 May 2020 14:32:31 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Will Deacon <will@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        Sami Tolvanen <samitolvanen@...gle.com>,
        Kees Cook <keescook@...omium.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Mark Rutland <mark.rutland@...com>,
        Jann Horn <jannh@...gle.com>, Ard Biesheuvel <ardb@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>, kernel-team@...roid.com
Subject: Re: [PATCH 4/6] scs: Move scs_overflow_check() out of architecture
 code

On Mon, May 18, 2020 at 02:23:47PM +0100, Will Deacon wrote:
> On Mon, May 18, 2020 at 01:12:10PM +0100, Mark Rutland wrote:
> > On Fri, May 15, 2020 at 06:27:54PM +0100, Will Deacon wrote:
> > > There is nothing architecture-specific about scs_overflow_check() as
> > > it's just a trivial wrapper around scs_corrupted().
> > > 
> > > For parity with task_stack_end_corrupted(), rename scs_corrupted() to
> > > task_scs_end_corrupted() and call it from schedule_debug() when
> > > CONFIG_SCHED_STACK_END_CHECK is enabled. Finally, remove the unused
> > > scs_overflow_check() function entirely.
> > > 
> > > This has absolutely no impact on architectures that do not support SCS
> > > (currently arm64 only).
> > > 
> > > Signed-off-by: Will Deacon <will@...nel.org>
> > 
> > Pulling this out of arch code seems sane to me, and the arch-specific
> > changes look sound. However, I have a concern with the changes within the
> > scheduler context-switch.
> > 
> > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > index a35d3318492c..56be4cbf771f 100644
> > > --- a/arch/arm64/kernel/process.c
> > > +++ b/arch/arm64/kernel/process.c
> > > @@ -52,7 +52,6 @@
> > >  #include <asm/mmu_context.h>
> > >  #include <asm/processor.h>
> > >  #include <asm/pointer_auth.h>
> > > -#include <asm/scs.h>
> > >  #include <asm/stacktrace.h>
> > >  
> > >  #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
> > > @@ -516,7 +515,6 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
> > >  	entry_task_switch(next);
> > >  	uao_thread_switch(next);
> > >  	ssbs_thread_switch(next);
> > > -	scs_overflow_check(next);
> > 
> > Prior to this patch, we'd never switch to a task whose SCS had already
> > been corrupted.
> > 
> > With this patch, we only check that when switching away from a task, and
> > only when CONFIG_SCHED_STACK_END_CHECK is selected, which at first
> > glance seems to weaken that.
> 
> Yes, ignoring vmap'd stacks, this patch brings the SCS checking in line with
> the main stack checking when CONFIG_SCHED_STACK_END_CHECK=y.
> 
> > Arguably:
> > 
> > * If the next task's SCS was corrupted by that task while it was
> >   running, we had already lost at that point.
> 
> With this change, we'll at least catch this one sooner, and that might be
> useful if a bug has caused us to overflow the SCS but not the main stack.

Sure, but only if CONFIG_SCHED_STACK_END_CHECK is selected.

> > * If the next task's SCS was corrupted by another task, then that could
> >   also happen immediately after the check (though timing to avoid the
> >   check but affect the process could be harder).
> 
> We're only checking the magic end value, so the cross-task case is basically
> if you overrun your own SCS as above, but then continue to overrun entire
> SCSs for other tasks as well. It's probably not very useful in that case.
> 
> > ... and a VMAP'd SCS would be much nicer in this regard.
> > 
> > Do we think this is weakening the check, or do we think it wasn't all
> > that helpful to begin with?
> 
> I see it as a debug check to catch SCS overflow, rather than a hardening
> feature, and I agree that using something like vmap stack for the SCS would
> be better because we could have a guard page instead.

Fair enough. Could we put something into the commit message that more
explicitly calls out debug-not-hardening? I agree that under that model
this patch looks fine, and with something to that effect:

Reviewed-by: Mark Rutland <mark.rutland@....com>

Mark.

> This is something I would like to revisit, but we need more
> information from Sami about why Android rejected the larger allocation
> size, since I don't think there's an awful lot of point merging this
> series if Android doesn't pick it up.

Indeed. I'd certainly prefer the robustness of a VMAP'd SCS if we can do
that.

Mark.
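
For anyone following the thread, here is a minimal user-space sketch of the
end-magic check discussed above. It is not the kernel implementation: the
struct, the slot count, the magic value, and every sketch_* name are
hypothetical and chosen only for illustration. It shows why the check behaves
as a debug aid: an overflow is only noticed the next time the check runs on
the task being switched away from, and only when the (assumed) debug option
is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the kernel's definitions. */
#define SKETCH_SCS_SLOTS     8
#define SKETCH_SCS_END_MAGIC ((uintptr_t)0x5ca1ab1eUL)

/* A toy shadow call stack: the last slot holds the end magic. */
struct sketch_scs {
	uintptr_t slot[SKETCH_SCS_SLOTS];
	size_t top;	/* index of the next free slot */
};

static void sketch_scs_init(struct sketch_scs *s)
{
	assert(s != NULL);
	for (size_t i = 0; i < SKETCH_SCS_SLOTS; i++)
		s->slot[i] = 0;
	s->slot[SKETCH_SCS_SLOTS - 1] = SKETCH_SCS_END_MAGIC;
	s->top = 0;
}

/*
 * Push a return address. Nothing here prevents the push from landing on
 * the magic slot; the assert only keeps the sketch itself in bounds.
 */
static void sketch_scs_push(struct sketch_scs *s, uintptr_t ret)
{
	assert(s->top < SKETCH_SCS_SLOTS);
	s->slot[s->top++] = ret;
}

/* True if the end magic has been overwritten. */
static bool sketch_scs_end_corrupted(const struct sketch_scs *s)
{
	return s->slot[SKETCH_SCS_SLOTS - 1] != SKETCH_SCS_END_MAGIC;
}

/*
 * Stand-in for a scheduler debug hook: checks the task being switched
 * away from, and only when the debug option is enabled.
 */
static bool sketch_schedule_debug(const struct sketch_scs *prev,
				  bool end_check_enabled)
{
	return end_check_enabled && sketch_scs_end_corrupted(prev);
}
```

Filling all SKETCH_SCS_SLOTS slots overwrites the magic, after which
sketch_schedule_debug() reports corruption only if end_check_enabled is true.
A guard page would instead fault at the offending write, which is the
robustness argument for a VMAP'd SCS.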