[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7afed057-111a-9a21-c114-44987184b683@suse.com>
Date: Sat, 14 Nov 2020 10:16:18 +0100
From: Jürgen Groß <jgross@...e.com>
To: Andy Lutomirski <luto@...nel.org>,
Andrew Cooper <andrew.cooper3@...rix.com>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Nicholas Piggin <npiggin@...il.com>,
Damien Le Moal <Damien.LeMoal@....com>,
X86 ML <x86@...nel.org>
Subject: Re: WARNING: can't access registers at asm_common_interrupt
On 13.11.20 18:34, Andy Lutomirski wrote:
> On Wed, Nov 11, 2020 at 12:25 PM Andrew Cooper
> <andrew.cooper3@...rix.com> wrote:
>>
>> On 11/11/2020 20:15, Josh Poimboeuf wrote:
>>> On Wed, Nov 11, 2020 at 09:07:30PM +0100, Peter Zijlstra wrote:
>>>> On Wed, Nov 11, 2020 at 01:59:00PM -0600, Josh Poimboeuf wrote:
>>>>> On Wed, Nov 11, 2020 at 08:42:06PM +0100, Peter Zijlstra wrote:
>>>>>>> Would objtool have an easier time coping if this were implemented in
>>>>>>> terms of a static call?
>>>>>> I doubt it, the big problem is that there is no visibility into the
>>>>>> actual alternative text. Runtime patching fragments into static call
>>>>>> would have the exact same problem.
>>>>>>
>>>>>> Something that _might_ maybe work is trying to morph the immediate
>>>>>> fragments into an alternative. That is, instead of this:
>>>>>>
>>>>>> static inline notrace unsigned long arch_local_save_flags(void)
>>>>>> {
>>>>>> return PVOP_CALLEE0(unsigned long, irq.save_fl);
>>>>>> }
>>>>>>
>>>>>> Write it something like:
>>>>>>
>>>>>> static inline notrace unsigned long arch_local_save_flags(void)
>>>>>> {
>>>>>> PVOP_CALL_ARGS;
>>>>>> PVOP_TEST_NULL(irq.save_fl);
>>>>>> asm_inline volatile(ALTERNATIVE(paravirt_alt(PARAVIRT_CALL),
>>>>>> "PUSHF; POP _ASM_AX",
>>>>>> X86_FEATURE_NATIVE)
>>>>>> : CLBR_RET_REG, ASM_CALL_CONSTRAINT
>>>>>> : paravirt_type(irq.save_fl.func),
>>>>>> paravirt_clobber(PVOP_CALLEE_CLOBBERS)
>>>>>> : "memory", "cc");
>>>>>> return __eax;
>>>>>> }
>>>>>>
>>>>>> And then we have to teach objtool how to deal with conflicting
>>>>>> alternatives...
>>>>>>
>>>>>> That would remove most (all, if we can figure out a form that deals with
>>>>>> the spinlock fragments) of paravirt_patch.c
>>>>>>
>>>>>> Hmm?
>>>>> I was going to suggest something similar. Though I would try to take it
>>>>> further and replace paravirt_patch_default() with static calls.
>>>> Possible, we just need to be _really_ careful to not allow changing
>>>> those static_call()s. So maybe we need DEFINE_STATIC_CALL_RO() which
>>>> does a __ro_after_init on the whole thing.
>>> But what if you want to live migrate to another hypervisor ;-)
>>
>> The same as what happens currently. The user gets to keep all the
>> resulting pieces ;)
>>
>>>>> Either way it doesn't make objtool's job much easier. But it would be
>>>>> nice to consolidate runtime patching mechanisms and get rid of
>>>>> .parainstructions.
>>>> I think the above (combining alternative and paravirt/static_call) does
>>>> make objtool's job easier, since then we at least have the actual
>>>> alternative instructions available to inspect, or am I mis-understanding
>>>> things?
>>> Right, it makes objtool's job a _little_ easier, since it already knows
>>> how to read alternatives. But it still has to learn to deal with the
>>> conflicting stack layouts.
>>
>> I suppose the needed abstraction is "these blocks will start and end
>> with the same stack layout", while allowing the internals to diverge.
>>
>
> How much of this stuff is actually useful anymore? I'm wondering if
> we can move most or all of this crud to
> cpu_feature_enabled(X86_FEATURE_XENPV) and its asm equivalent. The
> full list, annotated, appears to be:
>
> const unsigned char irq_irq_disable[1];
>
> This is CLI or CALL, right?
Yes.
>
> const unsigned char irq_irq_enable[1];
>
> STI or CALL.
Yes.
>
> const unsigned char irq_save_fl[2];
>
> PUSHF; POP %r/eax. I *think* I read the paravirt mess correctly and
> this also turns into CALL.
It does.
>
> const unsigned char mmu_read_cr2[3];
> const unsigned char mmu_read_cr3[3];
> const unsigned char mmu_write_cr3[3];
>
> The write CR3 is so slow that I can't imagine us caring. Reading CR3
> should already be fairly optimized because it's slow on old non-PV
> hypervisors, too. Reading CR2 is rare and lives in asm. These also
> appear to just switch between MOV and CALL, anyway.
Correct.
>
> const unsigned char irq_restore_fl[2];
>
> Ugh, this one sucks. IMO it should be, for native and PV:
>
> if (flags & X86_EFLAGS_IF) {
> local_irq_enable(); /* or raw? */
> } else {
> if (some debugging option) {
> WARN_ON_ONCE(save_fl() & X86_EFLAGS_IF);
> }
> }
Seems sensible.
>
> POPF is slooooow.
>
> const unsigned char cpu_wbinvd[2];
>
> This is hilariously slow no matter what. static_call() or even just a
> plain old indirect call should be fine.
I'd go with the static_call().
>
> const unsigned char cpu_usergs_sysret64[6];
>
> This is in the asm and we shouldn't be doing it at all for Xen PV.
> IOW we should just drop this patch site entirely. I can possibly find
> some time to get rid of it, and maybe someone from Xen land can help.
> I bet that we can gain a lot of perf on Xen PV by cleaning this up,
> and I bet it will simplify everything.
>
> const unsigned char cpu_swapgs[3];
>
> This is SWAPGS or nop, unless I've missed some subtlety.
>
> const unsigned char mov64[3];
>
> This is some PTE magic, and I haven't deciphered it yet.
Either a mov or a call.
>
> So I think there is at most one of these that wants anything more
> complicated than a plain ALTERNATIVE. Any volunteers to make it so?
> Juergen, if you do all of them except USERGS_SYSRET64, I hereby
> volunteer to do that one.
Why is a plain alternative (either swapgs; sysretq or a jmp xen_sysret64
depending on X86_FEATURE_XENPV) no option?
Its not as if this code would run before alternative patching.
>
> BTW, if y'all want to live migrate between Xen PV and anything else,
> you are nuts.
>
That's no option. Xen PV is a guest property, not one of the hypervisor.
Juergen
Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3092 bytes)
Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)
Powered by blists - more mailing lists