Message-ID: <CAC_TJvcWkieK9XQeKi4-nB1ijUZs0csd3wAaWpRE9a375Zx=qw@mail.gmail.com>
Date: Wed, 20 Apr 2022 14:51:12 -0700
From: Kalesh Singh <kaleshsingh@...gle.com>
To: Marc Zyngier <maz@...nel.org>
Cc: Will Deacon <will@...nel.org>, Quentin Perret <qperret@...gle.com>,
Fuad Tabba <tabba@...gle.com>,
Suren Baghdasaryan <surenb@...gle.com>,
"Cc: Android Kernel" <kernel-team@...roid.com>,
James Morse <james.morse@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Catalin Marinas <catalin.marinas@....com>,
Andrew Walbran <qwandor@...gle.com>,
Mark Rutland <mark.rutland@....com>,
Ard Biesheuvel <ardb@...nel.org>,
Zenghui Yu <yuzenghui@...wei.com>,
Andrew Jones <drjones@...hat.com>,
Nathan Chancellor <nathan@...nel.org>,
Masahiro Yamada <masahiroy@...nel.org>,
"moderated list:ARM64 PORT (AARCH64 ARCHITECTURE)"
<linux-arm-kernel@...ts.infradead.org>,
kvmarm <kvmarm@...ts.cs.columbia.edu>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v7 5/6] KVM: arm64: Detect and handle hypervisor stack overflows
On Mon, Apr 18, 2022 at 7:41 PM Kalesh Singh <kaleshsingh@...gle.com> wrote:
>
> On Mon, Apr 18, 2022 at 3:09 AM Marc Zyngier <maz@...nel.org> wrote:
> >
> > On Fri, 08 Apr 2022 21:03:28 +0100,
> > Kalesh Singh <kaleshsingh@...gle.com> wrote:
> > >
> > > The hypervisor stacks (for both nVHE Hyp mode and nVHE protected mode)
> > > are aligned such that any valid stack address has PAGE_SHIFT bit as 1.
> > > This allows us to conveniently check for overflow in the exception entry
> > > without corrupting any GPRs. We won't recover from a stack overflow so
> > > panic the hypervisor.
> > >
> > > Signed-off-by: Kalesh Singh <kaleshsingh@...gle.com>
> > > Tested-by: Fuad Tabba <tabba@...gle.com>
> > > Reviewed-by: Fuad Tabba <tabba@...gle.com>
> > > ---
> > >
> > > Changes in v7:
> > > - Add Fuad's Reviewed-by and Tested-by tags.
> > >
> > > Changes in v5:
> > > - Valid stack addresses now have PAGE_SHIFT bit as 1 instead of 0
> > >
> > > Changes in v3:
> > > - Remove test_sp_overflow macro, per Mark
> > > - Add asmlinkage attribute for hyp_panic, hyp_panic_bad_stack, per Ard
> > >
> > >
> > > arch/arm64/kvm/hyp/nvhe/host.S | 24 ++++++++++++++++++++++++
> > > arch/arm64/kvm/hyp/nvhe/switch.c | 7 ++++++-
> > > 2 files changed, 30 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
> > > index 3d613e721a75..be6d844279b1 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/host.S
> > > +++ b/arch/arm64/kvm/hyp/nvhe/host.S
> > > @@ -153,6 +153,18 @@ SYM_FUNC_END(__host_hvc)
> > >
> > > .macro invalid_host_el2_vect
> > > .align 7
> > > +
> > > + /*
> > > + * Test whether the SP has overflowed, without corrupting a GPR.
> > > + * nVHE hypervisor stacks are aligned so that the PAGE_SHIFT bit
> > > + * of SP should always be 1.
> > > + */
> > > + add sp, sp, x0 // sp' = sp + x0
> > > + sub x0, sp, x0 // x0' = sp' - x0 = (sp + x0) - x0 = sp
> > > + tbz x0, #PAGE_SHIFT, .L__hyp_sp_overflow\@
> > > + sub x0, sp, x0 // x0'' = sp' - x0' = (sp + x0) - sp = x0
> > > + sub sp, sp, x0 // sp'' = sp' - x0 = (sp + x0) - x0 = sp
> > > +
> > > /* If a guest is loaded, panic out of it. */
> > > stp x0, x1, [sp, #-16]!
> > > get_loaded_vcpu x0, x1
> > > @@ -165,6 +177,18 @@ SYM_FUNC_END(__host_hvc)
> > > * been partially clobbered by __host_enter.
> > > */
> > > b hyp_panic
> > > +
> > > +.L__hyp_sp_overflow\@:
> > > + /*
> > > + * Reset SP to the top of the stack, to allow handling the hyp_panic.
> > > + * This corrupts the stack but is ok, since we won't be attempting
> > > + * any unwinding here.
> > > + */
> > > + ldr_this_cpu x0, kvm_init_params + NVHE_INIT_STACK_HYP_VA, x1
> > > + mov sp, x0
> > > +
> > > + bl hyp_panic_bad_stack
> >
> > Why bl? You clearly don't expect to return here, given that you have
> > an ASM_BUG() right below, and that you are calling a __no_return
> > function. I think we should be consistent with the rest of the code
> > and just do a simple branch.
>
> The idea was to use bl to give the hyp_panic_bad_stack() frame in the
> stack trace, which makes it easy to identify overflows. I can add a
> comment and drop the redundant ASM_BUG()
Sorry, my mistake here: bl will give us the current frame in the stack
trace (hyp_host_vector) so it doesn't affect hyp_panic_bad_stack (next
frame) being in the stack trace. Addressed in v8:
https://lore.kernel.org/r/20220420214317.3303360-6-kaleshsingh@google.com/
Thanks,
Kalesh
>
> Thanks,
> Kalesh
>
> >
> > It also gives us a chance to preserve an extra register from the
> > context.
> >
> > > + ASM_BUG()
> > > .endm
> > >
> > > .macro invalid_host_el1_vect
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> > > index 6410d21d8695..703a5d3f611b 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> > > @@ -347,7 +347,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> > > return exit_code;
> > > }
> > >
> > > -void __noreturn hyp_panic(void)
> > > +asmlinkage void __noreturn hyp_panic(void)
> > > {
> > > u64 spsr = read_sysreg_el2(SYS_SPSR);
> > > u64 elr = read_sysreg_el2(SYS_ELR);
> > > @@ -369,6 +369,11 @@ void __noreturn hyp_panic(void)
> > > unreachable();
> > > }
> > >
> > > +asmlinkage void __noreturn hyp_panic_bad_stack(void)
> > > +{
> > > + hyp_panic();
> > > +}
> > > +
> > > asmlinkage void kvm_unexpected_el2_exception(void)
> > > {
> > > return __kvm_unexpected_el2_exception();
> > > --
> > > 2.35.1.1178.g4f1659d476-goog
> > >
> > >
> >
> > Thanks,
> >
> > M.
> >
> > --
> > Without deviation from the norm, progress is not possible.
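
For readers following the add/sub sequence in the quoted host.S hunk, here
is a rough user-space C model of it. It is only a sketch under stated
assumptions, not the hypervisor code: struct regs and sp_overflowed() are
hypothetical names made up for illustration, PAGE_SHIFT is assumed to be
12 (4 KiB pages), and uint64_t wraparound stands in for 64-bit register
arithmetic. It mirrors the three instructions before the tbz and the two
restoring subs after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the two registers the macro touches. */
struct regs {
	uint64_t sp;
	uint64_t x0;
};

/*
 * Model of the check in invalid_host_el2_vect. Returns true when the
 * PAGE_SHIFT bit of the original SP is 0 (treated as an overflow).
 * On the non-overflow path both sp and x0 end up with their original
 * values, and no third register is needed.
 */
bool sp_overflowed(struct regs *r)
{
	const uint64_t sp0 = r->sp, x0_0 = r->x0; /* only for the assert */

	r->sp = r->sp + r->x0;	/* add sp, sp, x0  : sp' = sp + x0 */
	r->x0 = r->sp - r->x0;	/* sub x0, sp, x0  : x0' = sp      */

	if (!(r->x0 & (UINT64_C(1) << PAGE_SHIFT)))
		return true;	/* tbz x0, #PAGE_SHIFT, overflow   */

	r->x0 = r->sp - r->x0;	/* sub x0, sp, x0  : x0'' = x0     */
	r->sp = r->sp - r->x0;	/* sub sp, sp, x0  : sp'' = sp     */

	assert(r->sp == sp0 && r->x0 == x0_0);
	return false;
}
```

With these assumptions, an SP of 0x3ff0 (bit 12 set) is accepted and both
values are restored, while an SP of 0x2ff0 (bit 12 clear) takes the
overflow path with x0 holding the original SP. The intermediate sum may
wrap, which is harmless since the subtractions undo it modulo 2^64.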