[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1c189bd3-e604-3ba1-6827-21510e4f3cfe@suse.com>
Date: Wed, 9 Aug 2017 19:52:51 +0200
From: Juergen Gross <jgross@...e.com>
To: Andy Lutomirski <luto@...nel.org>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"Levin, Alexander (Sasha Levin)" <alexander.levin@...izon.com>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"live-patching@...r.kernel.org" <live-patching@...r.kernel.org>,
Jiri Slaby <jslaby@...e.cz>, Ingo Molnar <mingo@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Mike Galbraith <efault@....de>
Subject: Re: [PATCH v4 1/2] x86/unwind: add ORC unwinder
On 09/08/17 18:11, Andy Lutomirski wrote:
> On Wed, Aug 9, 2017 at 1:49 AM, Juergen Gross <jgross@...e.com> wrote:
>> On 08/08/17 22:09, Andy Lutomirski wrote:
>>> On Tue, Aug 8, 2017 at 12:13 PM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
>>>> On Tue, Aug 08, 2017 at 12:03:51PM -0700, Linus Torvalds wrote:
>>>>> On Tue, Aug 8, 2017 at 11:58 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
>>>>>>
>>>>>> Take for example the lock_is_held_type() function. In vmlinux, it has
>>>>>> the following instruction:
>>>>>>
>>>>>> callq *0xffffffff85a94880 (pv_irq_ops.save_fl)
>>>>>>
>>>>>> At runtime, that instruction is patched and replaced with a fast inline
>>>>>> version of arch_local_save_flags() which eliminates the call:
>>>>>>
>>>>>> pushfq
>>>>>> pop %rax
>>>>>>
>>>>>> The problem is when an interrupt hits after the push:
>>>>>>
>>>>>> pushfq
>>>>>> --- irq ---
>>>>>> pop %rax
>>>>>
>>>>> That should actually be something easily fixable, for an odd reason:
>>>>> the instruction boundaries are different.
>>>>>
>>>>>> I'm not sure what the solution should be. It will probably need to be
>>>>>> one of the following:
>>>>>>
>>>>>> a) either don't allow runtime "alternative" patches to mess with the
>>>>>> stack pointer (objtool could enforce this); or
>>>>>>
>>>>>> b) come up with some way to register such patches with the ORC
>>>>>> unwinder at runtime.
>>>>>
>>>>> c) just add ORC data for the alternative statically and _unconditionally_.
>>>>>
>>>>> No runtime registration. Just an unconditional entry for the
>>>>> particular IP that comes after the "pushfq". It cannot match the
>>>>> "callq" instruction, since it would be in the middle of that
>>>>> instruction.
>>>>>
>>>>> Basically, just do a "union" of the ORC data for all the alternatives.
>>>>>
>>>>> Now, objtool should still verify that the instruction pointers for
>>>>> alternatives are unique - or that they share the same ORC unwinder
>>>>> information if they are not.
>>>>>
>>>>> But in cases like this, when the instruction boundaires are different,
>>>>> things should "just work", with no need for any special cases.
>>>>>
>>>>> Hmm?
>>>>
>>>> Yeah, that might work. Objtool already knows about alternatives, so it
>>>> might not be too hard. I'll try it.
>>>
>>> But this one's not an actual alternative, right? It's a pv op.
>>>
>>> I would advocate that we make it an alternative after all. I frickin'
>>> hate the PV irq ops. It would like roughly like this:
>>>
>>> ALTERNATIVE "pushfq; popq %rax", "callq *pv_irq_ops.save_fl",
>>> X86_FEATURE_GODDAMN_PV_IRQ_OPS
>>
>> You are aware that at least some of the Xen irq pvops functionality is
>> patched inline? Your modification would slow down pv guests quite a
>> bit, I guess.
>
> Yes, but what I had in mind was having both the alternative *and* the
> paravirt patch entry. We'd obviously have to make sure to run
> alternatives before paravirt patching, but that's possibly already the
> case.
So instead of having the "callq *pv_irq_ops.save_fl" as initial code you
would end up with the "pushfq; popq %rax" until the alternatives are
applied.
I don't think this will work. In the end you are not allowed to use any
irq ops in a Xen guest until that happens. And I think it happens rather
late compared to the usage of any irq ops.
And in case you are swapping oldinstr and newinstr in above ALTERNATIVE
usage you will end up with exactly the same as with today's pvops, just
the patching mechanism for the bare metal case would be different. And
you would need more table entries (pvops and alternatives) for the same
functionality.
Or do I miss something here?
Juergen
Powered by blists - more mailing lists