[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrXcTKB_j9MQC1mcZobKGt_cZ5ivDPjU3zwRBmj7DAUCsA@mail.gmail.com>
Date: Fri, 13 Nov 2020 09:34:57 -0800
From: Andy Lutomirski <luto@...nel.org>
To: Andrew Cooper <andrew.cooper3@...rix.com>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Nicholas Piggin <npiggin@...il.com>,
Damien Le Moal <Damien.LeMoal@....com>,
Juergen Gross <jgross@...e.com>, X86 ML <x86@...nel.org>
Subject: Re: WARNING: can't access registers at asm_common_interrupt
On Wed, Nov 11, 2020 at 12:25 PM Andrew Cooper
<andrew.cooper3@...rix.com> wrote:
>
> On 11/11/2020 20:15, Josh Poimboeuf wrote:
> > On Wed, Nov 11, 2020 at 09:07:30PM +0100, Peter Zijlstra wrote:
> >> On Wed, Nov 11, 2020 at 01:59:00PM -0600, Josh Poimboeuf wrote:
> >>> On Wed, Nov 11, 2020 at 08:42:06PM +0100, Peter Zijlstra wrote:
> >>>>> Would objtool have an easier time coping if this were implemented in
> >>>>> terms of a static call?
> >>>> I doubt it, the big problem is that there is no visibility into the
> >>>> actual alternative text. Runtime patching fragments into static call
> >>>> would have the exact same problem.
> >>>>
> >>>> Something that _might_ maybe work is trying to morph the immediate
> >>>> fragments into an alternative. That is, instead of this:
> >>>>
> >>>> static inline notrace unsigned long arch_local_save_flags(void)
> >>>> {
> >>>> return PVOP_CALLEE0(unsigned long, irq.save_fl);
> >>>> }
> >>>>
> >>>> Write it something like:
> >>>>
> >>>> static inline notrace unsigned long arch_local_save_flags(void)
> >>>> {
> >>>> PVOP_CALL_ARGS;
> >>>> PVOP_TEST_NULL(irq.save_fl);
> >>>> asm_inline volatile(ALTERNATIVE(paravirt_alt(PARAVIRT_CALL),
> >>>> "PUSHF; POP _ASM_AX",
> >>>> X86_FEATURE_NATIVE)
> >>>> : CLBR_RET_REG, ASM_CALL_CONSTRAINT
> >>>> : paravirt_type(irq.save_fl.func),
> >>>> paravirt_clobber(PVOP_CALLEE_CLOBBERS)
> >>>> : "memory", "cc");
> >>>> return __eax;
> >>>> }
> >>>>
> >>>> And then we have to teach objtool how to deal with conflicting
> >>>> alternatives...
> >>>>
> >>>> That would remove most (all, if we can figure out a form that deals with
> >>>> the spinlock fragments) of paravirt_patch.c
> >>>>
> >>>> Hmm?
> >>> I was going to suggest something similar. Though I would try to take it
> >>> further and replace paravirt_patch_default() with static calls.
> >> Possible, we just need to be _really_ careful to not allow changing
> >> those static_call()s. So maybe we need DEFINE_STATIC_CALL_RO() which
> >> does a __ro_after_init on the whole thing.
> > But what if you want to live migrate to another hypervisor ;-)
>
> The same as what happens currently. The user gets to keep all the
> resulting pieces ;)
>
> >>> Either way it doesn't make objtool's job much easier. But it would be
> >>> nice to consolidate runtime patching mechanisms and get rid of
> >>> .parainstructions.
> >> I think the above (combining alternative and paravirt/static_call) does
> >> make objtool's job easier, since then we at least have the actual
> >> alternative instructions available to inspect, or am I mis-understanding
> >> things?
> > Right, it makes objtool's job a _little_ easier, since it already knows
> > how to read alternatives. But it still has to learn to deal with the
> > conflicting stack layouts.
>
> I suppose the needed abstraction is "these blocks will start and end
> with the same stack layout", while allowing the internals to diverge.
>
How much of this stuff is actually useful anymore? I'm wondering if
we can move most or all of this crud to
cpu_feature_enabled(X86_FEATURE_XENPV) and its asm equivalent. The
full list, annotated, appears to be:
const unsigned char irq_irq_disable[1];
This is CLI or CALL, right?
const unsigned char irq_irq_enable[1];
STI or CALL.
const unsigned char irq_save_fl[2];
PUSHF; POP %r/eax. I *think* I read the paravirt mess correctly and
this also turns into CALL.
const unsigned char mmu_read_cr2[3];
const unsigned char mmu_read_cr3[3];
const unsigned char mmu_write_cr3[3];
The write CR3 is so slow that I can't imagine us caring. Reading CR3
should already be fairly optimized because it's slow on old non-PV
hypervisors, too. Reading CR2 is rare and lives in asm. These also
appear to just switch between MOV and CALL, anyway.
const unsigned char irq_restore_fl[2];
Ugh, this one sucks. IMO it should be, for native and PV:
if (flags & X86_EFLAGS_IF) {
local_irq_enable(); /* or raw? */
} else {
if (some debugging option) {
WARN_ON_ONCE(save_fl() & X86_EFLAGS_IF);
}
}
POPF is slooooow.
const unsigned char cpu_wbinvd[2];
This is hilariously slow no matter what. static_call() or even just a
plain old indirect call should be fine.
const unsigned char cpu_usergs_sysret64[6];
This is in the asm and we shouldn't be doing it at all for Xen PV.
IOW we should just drop this patch site entirely. I can possibly find
some time to get rid of it, and maybe someone from Xen land can help.
I bet that we can gain a lot of perf on Xen PV by cleaning this up,
and I bet it will simplify everything.
const unsigned char cpu_swapgs[3];
This is SWAPGS or nop, unless I've missed some subtlety.
const unsigned char mov64[3];
This is some PTE magic, and I haven't deciphered it yet.
So I think there is at most one of these that wants anything more
complicated than a plain ALTERNATIVE. Any volunteers to make it so?
Juergen, if you do all of them except USERGS_SYSRET64, I hereby
volunteer to do that one.
BTW, if y'all want to live migrate between Xen PV and anything else,
you are nuts.
Powered by blists - more mailing lists