Message-ID: <alpine.LSU.2.21.2003161642450.15518@pobox.suse.cz>
Date:   Mon, 16 Mar 2020 16:51:12 +0100 (CET)
From:   Miroslav Benes <mbenes@...e.cz>
To:     jpoimboe@...hat.com,
        Jürgen Groß <jgross@...e.com>
cc:     boris.ostrovsky@...cle.com, sstabellini@...nel.org,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
        x86@...nel.org, xen-devel@...ts.xenproject.org,
        linux-kernel@...r.kernel.org, live-patching@...r.kernel.org,
        jslaby@...e.cz
Subject: Re: [RFC PATCH 2/2] x86/xen: Make the secondary CPU idle tasks
 reliable

On Fri, 13 Mar 2020, Miroslav Benes wrote:

> On Fri, 13 Mar 2020, Jürgen Groß wrote:
> 
> > On 12.03.20 15:20, Miroslav Benes wrote:
> > > The unwinder reports the secondary CPU idle tasks' stack on XEN PV as
> > > unreliable, which affects at least live patching.
> > > cpu_initialize_context() sets up the context of the CPU through
> > > VCPUOP_initialise hypercall. After it is woken up, the idle task starts
> > > in cpu_bringup_and_idle() function and its stack starts at the offset
> > > right below pt_regs. The unwinder correctly detects the end of stack
> > > there but it is confused by NULL return address in the last frame.
> > > 
> > > > RFC: I haven't found a way to teach the unwinder about the state of
> > > > the stack there, hence the ugly hack using assembly. It is similar to
> > > > what startup_xen() has for the boot CPU.
> > > > 
> > > > It introduces an objtool "unreachable instruction" warning right after
> > > > the jump to cpu_bringup_and_idle(). It should show the idea of what
> > > > needs to be done though, I think. Ideas welcome.
> > > 
> > > Signed-off-by: Miroslav Benes <mbenes@...e.cz>
> > > ---
> > >   arch/x86/xen/smp_pv.c   |  3 ++-
> > >   arch/x86/xen/xen-head.S | 10 ++++++++++
> > >   2 files changed, 12 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
> > > index 802ee5bba66c..6b88cdcbef8f 100644
> > > --- a/arch/x86/xen/smp_pv.c
> > > +++ b/arch/x86/xen/smp_pv.c
> > > @@ -53,6 +53,7 @@ static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work)
> > > = { .irq = -1 };
> > >   static DEFINE_PER_CPU(struct xen_common_irq, xen_pmu_irq) = { .irq = -1 };
> > >   
> > >   static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id);
> > > +extern unsigned char asm_cpu_bringup_and_idle[];
> > >   
> > >   static void cpu_bringup(void)
> > >   {
> > 
> > Would adding this here work?
> > 
> > +	asm volatile (UNWIND_HINT(ORC_REG_UNDEFINED, 0, ORC_TYPE_CALL, 1));
> 
> I tried something similar. It did not work, because then the hint is 
> "bound" to the closest next call in the function, which is cr4_init() in 
> this case. The unwinder would not take it into account.
> 
> In my case, I placed it at the beginning of cpu_bringup_and_idle(). I also 
> open coded it and played with the offset in the orc entry, but that did 
> not work for some other reason.
> 
> However, now I tried this
> 
> diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
> index 6b88cdcbef8f..39afd88309cb 100644
> --- a/arch/x86/xen/smp_pv.c
> +++ b/arch/x86/xen/smp_pv.c
> @@ -92,6 +92,7 @@ asmlinkage __visible void cpu_bringup_and_idle(void)
>  {
>         cpu_bringup();
>         boot_init_stack_canary();
> +       asm volatile (UNWIND_HINT(ORC_REG_UNDEFINED, 0, ORC_TYPE_CALL, 1));
>         cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
>  }
> 
> and that seems to work. I need to properly verify and test, but the 
> explanation is that as opposed to the above, cpu_startup_entry() is on the 
> idle task's stack and the hint is then taken into account. The unwound 
> stack seems to be complete, so it could indeed be the fix.

Not the correct one though. Objtool rightfully complains with

arch/x86/xen/smp_pv.o: warning: objtool: cpu_bringup_and_idle()+0x6a: undefined stack state

and all the other hacks I tried ended up at the same dead end. It seems 
to me the correct fix is that all orc entries for cpu_bringup_and_idle() 
should have "end" property set to 1, since it is the first function on the 
stack. I don't know how to achieve that without the assembly hack in the 
patch I sent. If I am not missing something, of course.
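
To illustrate the "end" idea only: below is a toy, user-space model of the
check, not kernel code. All names (toy_orc_entry, toy_frame,
toy_unwind_reliable) are hypothetical, and the rule it encodes is an
assumption based on the discussion above: when the unwinder reaches an
entry whose stack pointer register is undefined, the stack counts as
reliable only if that entry is also marked as the end of the stack.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant bits of an ORC entry. */
enum toy_sp_reg { TOY_REG_SP, TOY_REG_UNDEFINED };

struct toy_orc_entry {
	enum toy_sp_reg sp_reg; /* how to find the previous frame */
	bool end;               /* true if this is the first function on the stack */
};

/* One frame of a simulated stack, innermost frame first. */
struct toy_frame {
	const char *func;
	struct toy_orc_entry orc;
};

/*
 * Walk the simulated frames from innermost to outermost.
 * Returns true if the walk terminates at an entry explicitly marked as
 * the end of the stack, false otherwise (treated as unreliable).
 */
bool toy_unwind_reliable(const struct toy_frame *frames, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const struct toy_orc_entry *orc = &frames[i].orc;

		/* Undefined stack state: acceptable only at a marked end. */
		if (orc->sp_reg == TOY_REG_UNDEFINED)
			return orc->end;
	}

	/* Ran out of frames without ever seeing an end marker. */
	return false;
}
```

In this model, a stack whose outermost frame (standing in for
cpu_bringup_and_idle()) has sp_reg undefined and end set is reported as
reliable, while the same stack without the end marker is not.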

Josh, any idea?

Thanks
Miroslav
