Message-ID: <20201111194206.GK2628@hirez.programming.kicks-ass.net>
Date: Wed, 11 Nov 2020 20:42:06 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Andrew Cooper <andrew.cooper3@...rix.com>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Nicholas Piggin <npiggin@...il.com>,
Damien Le Moal <Damien.LeMoal@....com>, jgross@...e.com,
x86@...nel.org
Subject: Re: WARNING: can't access registers at asm_common_interrupt
On Wed, Nov 11, 2020 at 06:46:37PM +0000, Andrew Cooper wrote:
> Well...
>
> static_calls are a newer, and more generic, form of pvops. Most of the
> magic is to do with inlining small fragments, but static calls can do
> that now too, IIRC?
If you're referring to this glorious hack:
https://lkml.kernel.org/r/20201110101307.GO2651@hirez.programming.kicks-ass.net
that only 'works' because it's a single instruction. That is,
static_call can only poke single instructions. They cannot replace a
call with "PUSHF; POP" / "PUSH; POPF" for example. They also cannot do
NOP padding for 'short' sequences.
Paravirt patching, like alternatives, is special in that it only happens
once, before SMP bringup.
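To make that distinction concrete, here is a toy model in C. It is not kernel code: the struct and both helper names are made up for illustration. It assumes, as a simplification, that a live patch has to be exactly one instruction filling the whole site, while a one-time patch before SMP bringup may be any sequence that fits, with the remainder NOP-padded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the code we want to poke into a site. */
struct patch {
	size_t n_insns;	/* number of instructions in the replacement */
	size_t len;	/* total encoded length in bytes */
};

/*
 * Toy rule for live patching (static_call style): one instruction only,
 * and no NOP padding, so it must fill the site exactly.
 */
static bool live_patch_ok(size_t site_len, struct patch p)
{
	assert(site_len > 0);
	return p.n_insns == 1 && p.len == site_len;
}

/*
 * Toy rule for one-time patching before SMP bringup (paravirt and
 * alternatives style): any sequence that fits; the rest becomes NOPs.
 */
static bool boot_patch_ok(size_t site_len, struct patch p)
{
	assert(site_len > 0);
	return p.n_insns >= 1 && p.len <= site_len;
}
```

Under this model a 5-byte CALL replacing a 5-byte CALL passes both checks, whereas the 2-byte, two-instruction "PUSHF; POP" only passes boot_patch_ok().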
> >> Something really disgusting we could do is recognise the indirect call
> >> offset and emit an extra ORC entry for RIP+1. So the cases are:
> >>
> >> CALL *pv_ops.save_fl -- 7 bytes IIRC
> >> CALL $imm; -- 5 bytes
> >> PUSHF; POP %[RE]AX -- 2 bytes
> >>
> >> so the RIP+1 (the POP insn) will only ever exist in this case. The
> >> indirect and direct call cases would never land on that IP.
> > I had a similar idea, and a bit of deja vu - we may have talked about
> > this before. At least I know we talked about doing something similar
> > for alternatives which muck with the stack.
Vague memories... luckily we managed to get alternatives to a state
where they match, which is much saner.
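For reference, the RIP+1 observation quoted above can be checked with a small C sketch. It is only an illustration: the enum and function names are invented here, the sizes are the ones from the quoted list (including the "7 bytes IIRC"), and NOP padding after the shorter forms is ignored because it does not touch the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SITE_LEN 7	/* CALL *pv_ops.save_fl, "7 bytes IIRC" */

/* The three forms the patch site can take (names are hypothetical). */
enum site_form { FORM_INDIRECT_CALL, FORM_DIRECT_CALL, FORM_NATIVE };

/* Does a non-NOP instruction start at byte offset 'off' of the site? */
static bool insn_starts_at(enum site_form form, size_t off)
{
	assert(off < SITE_LEN);
	switch (form) {
	case FORM_INDIRECT_CALL:	/* one 7-byte CALL */
		return off == 0;
	case FORM_DIRECT_CALL:		/* 5-byte CALL, then padding */
		return off == 0;
	case FORM_NATIVE:		/* PUSHF at 0, POP at 1, then padding */
		return off == 0 || off == 1;
	}
	return false;
}

/* True if 'off' can only be an interrupted IP in the native form. */
static bool offset_is_native_only(size_t off)
{
	return insn_starts_at(FORM_NATIVE, off) &&
	       !insn_starts_at(FORM_INDIRECT_CALL, off) &&
	       !insn_starts_at(FORM_DIRECT_CALL, off);
}
```

Here offset_is_native_only(1) holds while offset_is_native_only(0) does not, which is why an extra ORC entry at RIP+1 would never be consulted for either CALL form.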
> The main complexity with pvops is that the
>
> CALL *pv_ops.save_fl
>
> form needs to be usable from extremely early in the day (pre general
> patching), hence the use of function pointers and some non-standard ABIs.
The performance reasons mentioned below are a large part of the
non-standard ABI (e.g. CALLEE_SAVE).
> For performance reasons, the end result of this pvop wants to be `pushf;
> pop %[re]ax` in the native case, and `call xen_pv_save_fl` in the Xen
> case, but this doesn't mean that the compiled instruction needs to be a
> function pointer to begin with.
Not sure emitting the native code would be feasible.. also
cpu_usergs_sysret64 is 6 bytes.
> Would objtool have an easier time coping if this were implemented in
> terms of a static call?
I doubt it, the big problem is that there is no visibility into the
actual alternative text. Runtime patching fragments into static call
would have the exact same problem.
Something that _might_ maybe work is trying to morph the immediate
fragments into an alternative. That is, instead of this:

	static inline notrace unsigned long arch_local_save_flags(void)
	{
		return PVOP_CALLEE0(unsigned long, irq.save_fl);
	}

Write it something like:

	static inline notrace unsigned long arch_local_save_flags(void)
	{
		PVOP_CALL_ARGS;
		PVOP_TEST_NULL(irq.save_fl);
		asm_inline volatile(ALTERNATIVE(paravirt_alt(PARAVIRT_CALL),
						"PUSHF; POP _ASM_AX",
						X86_FEATURE_NATIVE)
				    : CLBR_RET_REG, ASM_CALL_CONSTRAINT
				    : paravirt_type(irq.save_fl.func),
				      paravirt_clobber(PVOP_CALLEE_CLOBBERS)
				    : "memory", "cc");
		return __eax;
	}

And then we have to teach objtool how to deal with conflicting
alternatives...
That would remove most (all, if we can figure out a form that deals with
the spinlock fragments) of paravirt_patch.c
Hmm?