lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170928135514.7wrsp4u53y3jkc3w@treble>
Date:   Thu, 28 Sep 2017 08:55:14 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Juergen Gross <jgross@...e.com>
Cc:     Andy Lutomirski <luto@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Levin, Alexander (Sasha Levin)" <alexander.levin@...izon.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "live-patching@...r.kernel.org" <live-patching@...r.kernel.org>,
        Jiri Slaby <jslaby@...e.cz>, Ingo Molnar <mingo@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Mike Galbraith <efault@....de>
Subject: Re: [PATCH v4 1/2] x86/unwind: add ORC unwinder

On Thu, Sep 28, 2017 at 08:03:26AM +0200, Juergen Gross wrote:
> On 27/09/17 23:08, Josh Poimboeuf wrote:
> > On Tue, Aug 08, 2017 at 01:09:08PM -0700, Andy Lutomirski wrote:
> >> On Tue, Aug 8, 2017 at 12:13 PM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >>> On Tue, Aug 08, 2017 at 12:03:51PM -0700, Linus Torvalds wrote:
> >>>> On Tue, Aug 8, 2017 at 11:58 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >>>>>
> >>>>> Take for example the lock_is_held_type() function.  In vmlinux, it has
> >>>>> the following instruction:
> >>>>>
> >>>>>   callq *0xffffffff85a94880 (pv_irq_ops.save_fl)
> >>>>>
> >>>>> At runtime, that instruction is patched and replaced with a fast inline
> >>>>> version of arch_local_save_flags() which eliminates the call:
> >>>>>
> >>>>>   pushfq
> >>>>>   pop %rax
> >>>>>
> >>>>> The problem is when an interrupt hits after the push:
> >>>>>
> >>>>>   pushfq
> >>>>>   --- irq ---
> >>>>>   pop %rax
> >>>>
> >>>> That should actually be something easily fixable, for an odd reason:
> >>>> the instruction boundaries are different.
> >>>>
> >>>>> I'm not sure what the solution should be.  It will probably need to be
> >>>>> one of the following:
> >>>>>
> >>>>>   a) either don't allow runtime "alternative" patches to mess with the
> >>>>>      stack pointer (objtool could enforce this); or
> >>>>>
> >>>>>   b) come up with some way to register such patches with the ORC
> >>>>>      unwinder at runtime.
> >>>>
> >>>> c) just add ORC data for the alternative statically and _unconditionally_.
> >>>>
> >>>> No runtime registration. Just an unconditional entry for the
> >>>> particular IP that comes after the "pushfq". It cannot match the
> >>>> "callq" instruction, since it would be in the middle of that
> >>>> instruction.
> >>>>
> >>>> Basically, just do a "union" of the ORC data for all the alternatives.
> >>>>
> >>>> Now, objtool should still verify that the instruction pointers for
> >>>> alternatives are unique - or that they share the same ORC unwinder
> >>>> information if they are not.
> >>>>
> >>>> But in cases like this, when the instruction boundaires are different,
> >>>> things should "just work", with no need for any special cases.
> >>>>
> >>>> Hmm?
> >>>
> >>> Yeah, that might work.  Objtool already knows about alternatives, so it
> >>> might not be too hard.  I'll try it.
> >>
> >> But this one's not an actual alternative, right?  It's a pv op.
> >>
> >> I would advocate that we make it an alternative after all.  I frickin'
> >> hate the PV irq ops.  It would like roughly like this:
> >>
> >> ALTERNATIVE "pushfq; popq %rax", "callq *pv_irq_ops.save_fl",
> >> X86_FEATURE_GODDAMN_PV_IRQ_OPS
> >>
> >> (The obvious syntax error and the naming should probably be fixed.
> >> Also, this needs to live in an #ifdef because it needs to build on
> >> kernels with pv support.  It should also properly register itself as a
> >> pv patch site.)
> > 
> > I've got a prototype of the above working, where vmlinux shows:
> > 
> >   pushfq
> >   pop    %rax
> >   nop
> >   nop
> >   nop
> >   nop
> >   nop
> > 
> > instead of:
> > 
> >   callq  *0xffffffff81e3a400 (pv_irq_ops.save_fl)
> > 
> > Which is nice because the vmlinux disassembly now matches the most
> > common runtime cases (everything except Xen and vsmp).  And it also
> > fixes the upthread objtool issue.
> > 
> > The only slight issue with the patches is that hypervisors need access
> > to the pv ops much earlier than when alternatives are applied.  So I had
> > to add a new .pv_alternatives section for these pv ops alternatives, so
> > they can be patched very early, if running in a hypervisor.
> > 
> > Will clean up the code and post something relatively soon.
> > 
> 
> Are you combining alternatives and pvops then? I'm asking because in an
> up and running system under Xen the callq *... will be replaced with a
> much faster "call xen_save_fl". This should still be the case after
> your patch.

Right, it's not combining alternatives and pv ops, it's just adding
another step.  So first the

  pushfq; pop %rax

is replaced with

  callq *pv_irq_ops.save_fl

and then later, after the xen ops structs are finalized, it's replaced
with

  callq xen_save_fl

-- 
Josh

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ