[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200518133231.GC2787@C02TD0UTHF1T.local>
Date: Mon, 18 May 2020 14:32:31 +0100
From: Mark Rutland <mark.rutland@....com>
To: Will Deacon <will@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Sami Tolvanen <samitolvanen@...gle.com>,
Kees Cook <keescook@...omium.org>,
Catalin Marinas <catalin.marinas@....com>,
Mark Rutland <mark.rutland@...com>,
Jann Horn <jannh@...gle.com>, Ard Biesheuvel <ardb@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, kernel-team@...roid.com
Subject: Re: [PATCH 4/6] scs: Move scs_overflow_check() out of architecture
code
On Mon, May 18, 2020 at 02:23:47PM +0100, Will Deacon wrote:
> On Mon, May 18, 2020 at 01:12:10PM +0100, Mark Rutland wrote:
> > On Fri, May 15, 2020 at 06:27:54PM +0100, Will Deacon wrote:
> > > There is nothing architecture-specific about scs_overflow_check() as
> > > it's just a trivial wrapper around scs_corrupted().
> > >
> > > For parity with task_stack_end_corrupted(), rename scs_corrupted() to
> > > task_scs_end_corrupted() and call it from schedule_debug() when
> > > CONFIG_SCHED_STACK_END_CHECK_is enabled. Finally, remove the unused
> > > scs_overflow_check() function entirely.
> > >
> > > This has absolutely no impact on architectures that do not support SCS
> > > (currently arm64 only).
> > >
> > > Signed-off-by: Will Deacon <will@...nel.org>
> >
> > Pulling this out of arch code seems sane to me, and the arch-specific
> > chanes look sound. However, I have a concern with the changes within the
> > scheduler context-switch.
> >
> > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > > index a35d3318492c..56be4cbf771f 100644
> > > --- a/arch/arm64/kernel/process.c
> > > +++ b/arch/arm64/kernel/process.c
> > > @@ -52,7 +52,6 @@
> > > #include <asm/mmu_context.h>
> > > #include <asm/processor.h>
> > > #include <asm/pointer_auth.h>
> > > -#include <asm/scs.h>
> > > #include <asm/stacktrace.h>
> > >
> > > #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_STACKPROTECTOR_PER_TASK)
> > > @@ -516,7 +515,6 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
> > > entry_task_switch(next);
> > > uao_thread_switch(next);
> > > ssbs_thread_switch(next);
> > > - scs_overflow_check(next);
> >
> > Prior to this patch, we'd never switch to a task whose SCS had already
> > been corrupted.
> >
> > With this patch, we only check that when switching away from a task, and
> > only when CONFIG_SCHED_STACK_END_CHECK is selected, which at first
> > glance seems to weaken that.
>
> Yes, ignoring vmap'd stacks, this patch brings the SCS checking in-line with
> the main stack checking when CONFIG_SCHED_STACK_END_CHECK=y.
>
> > Arguably:
> >
> > * If the next task's SCS was corrupted by that task while it was
> > running, we had already lost at that point.
>
> With this change, we'll at least catch this one sooner, and that might be
> useful if a bug has caused us to overflow the SCS but not the main stack.
Sure, but only if CONFIG_SCHED_STACK_END_CHECK is selected.
> > * If the next task's SCS was corrupted by another task, then that could
> > also happen immediately after the check (though timing to avoid the
> > check but affect the process could be harder).
>
> We're only checking the magic end value, so the cross-task case is basically
> if you overrun your own SCS as above, but then continue to overrun entire
> SCSs for other tasks as well. It's probably not very useful in that case.
>
> > ... and a VMAP'd SCS would be much nicer in this regard.
> >
> > Do we think this is weakening the check, or do we think it wasn't all
> > that helpful to begin with?
>
> I see it as a debug check to catch SCS overflow, rather than a hardening
> feature, and I agree that using something like vmap stack for the SCS would
> be better because we could have a guard page instead.
Fair enough. Could we put something into the commit message that more
explicitly calls out debug-not-hardening? I agree that under that model
this patch looks fine, and with something to that effect:
Reviewed-by: Mark Rutland <mark.rutland@....com>
Mark.
> This is something I would like to revisit, but we need more
> information from Sami about why Android rejected the larger allocation
> size, since I don't think there's an awful lot of point merging this
> series if Android doesn't pick it up.
Indeed. I'd certainly prefer the robustness of a VMAP'd SCS if we can do
that.
Mark.
Powered by blists - more mailing lists