Message-ID: <20201111194206.GK2628@hirez.programming.kicks-ass.net>
Date:   Wed, 11 Nov 2020 20:42:06 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Andrew Cooper <andrew.cooper3@...rix.com>
Cc:     Josh Poimboeuf <jpoimboe@...hat.com>,
        Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        Damien Le Moal <Damien.LeMoal@....com>, jgross@...e.com,
        x86@...nel.org
Subject: Re: WARNING: can't access registers at asm_common_interrupt

On Wed, Nov 11, 2020 at 06:46:37PM +0000, Andrew Cooper wrote:

> Well...
> 
> static_calls are a newer, and more generic, form of pvops.  Most of the
> magic is to do with inlining small fragments, but static calls can do
> that now too, IIRC?

If you're referring to this glorious hack:

  https://lkml.kernel.org/r/20201110101307.GO2651@hirez.programming.kicks-ass.net

that only 'works' because it's a single instruction. That is,
static_call can only poke single instructions. They cannot replace a
call with "PUSHF; POP" / "PUSH; POPF" for example. They also cannot do
NOP padding for 'short' sequences.
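
To make the NOP padding point concrete, here is a minimal userspace sketch of what a patcher has to do when the replacement is shorter than the original site. patch_site() is a hypothetical helper for illustration, not a kernel API, and it pads with single-byte 0x90 NOPs for simplicity where a real patcher may well prefer longer NOP encodings.

```c
#include <stddef.h>
#include <string.h>

#define NOP1 0x90 /* single-byte x86 NOP */

/*
 * Hypothetical helper, not a kernel function: overwrite a call site with a
 * (possibly shorter) replacement sequence and fill the tail with NOPs.
 * Returns 0 on success, -1 if the arguments are invalid or the replacement
 * does not fit in the site.
 */
static int patch_site(unsigned char *site, size_t site_len,
                      const unsigned char *repl, size_t repl_len)
{
    if (!site || !repl || repl_len > site_len)
        return -1;

    memcpy(site, repl, repl_len);                       /* new instruction(s) */
    memset(site + repl_len, NOP1, site_len - repl_len); /* pad the remainder */
    return 0;
}
```

For a 7-byte indirect call site and the 2-byte "PUSHF; POP %rax" sequence (0x9c, 0x58), this leaves five NOP bytes behind the two native instructions.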

Paravirt patching, like alternatives, is special in that it only happens
once, before SMP bringup.

> >> Something really disgusting we could do is recognise the indirect call
> >> offset and emit an extra ORC entry for RIP+1. So the cases are:
> >>
> >> 	CALL *pv_ops.save_fl	-- 7 bytes IIRC
> >> 	CALL $imm;		-- 5 bytes
> >> 	PUSHF; POP %[RE]AX	-- 2 bytes
> >>
> >> so the RIP+1 (the POP insn) will only ever exist in this case. The
> >> indirect and direct call cases would never land on that IP.
> > I had a similar idea, and a bit of deja vu - we may have talked about
> > this before.  At least I know we talked about doing something similar
> > for alternatives which muck with the stack.

Vague memories... luckily we managed to get alternatives to a state
where they match, which is much saner.
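
For reference, the RIP+1 observation quoted above can be sketched as a small table lookup. The enum and function names are hypothetical and only model the three site variants listed in the quote. The lengths of 7 and 5 bytes are taken from that list, and PUSHF is a single-byte instruction (0x9c), which is why the POP lands at offset 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* The three forms a save_fl call site can take, per the quoted list. */
enum site_variant {
    SITE_INDIRECT_CALL, /* CALL *pv_ops.save_fl */
    SITE_DIRECT_CALL,   /* CALL $imm */
    SITE_NATIVE,        /* PUSHF; POP %[RE]AX */
};

/* Length in bytes of the first instruction at the site for each variant. */
static size_t first_insn_len(enum site_variant v)
{
    switch (v) {
    case SITE_INDIRECT_CALL:
        return 7;
    case SITE_DIRECT_CALL:
        return 5;
    case SITE_NATIVE:
        return 1; /* PUSHF alone; the POP follows at offset 1 */
    }
    return 0;
}

/*
 * An instruction can only start at site+1 if the first instruction is one
 * byte long, so an unwinder entry keyed on that address can only ever match
 * the native sequence.
 */
static bool rip_plus_one_is_insn_start(enum site_variant v)
{
    return first_insn_len(v) == 1;
}
```

Only SITE_NATIVE yields true here, which is the property the extra ORC entry would rely on.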

> The main complexity with pvops is that the
> 
>     CALL *pv_ops.save_fl
> 
> form needs to be usable from extremely early in the day (pre general
> patching), hence the use of function pointers and some non-standard ABIs.

The performance reasons mentioned below are a large part of the
non-standard ABI (e.g. CALLEE_SAVE).

> For performance reasons, the end result of this pvop wants to be `pushf;
> pop %[re]ax` in the native case, and `call xen_pv_save_fl` in the Xen
> case, but this doesn't mean that the compiled instruction needs to be a
> function pointer to begin with.

Not sure emitting the native code would be feasible.. also
cpu_usergs_sysret64 is 6 bytes.

> Would objtool have an easier time coping if this were implemented in
> terms of a static call?

I doubt it, the big problem is that there is no visibility into the
actual alternative text. Runtime patching fragments into static call
would have the exact same problem.

Something that _might_ maybe work is trying to morph the immediate
fragments into an alternative. That is, instead of this:

static inline notrace unsigned long arch_local_save_flags(void)
{
	return PVOP_CALLEE0(unsigned long, irq.save_fl);
}

Write it something like:

static inline notrace unsigned long arch_local_save_flags(void)
{
	PVOP_CALL_ARGS;
	PVOP_TEST_NULL(irq.save_fl);
	asm_inline volatile(ALTERNATIVE(paravirt_alt(PARAVIRT_CALL),
					"PUSHF; POP _ASM_AX",
					X86_FEATURE_NATIVE)
			    : CLBR_RET_REG, ASM_CALL_CONSTRAINT
			    : paravirt_type(irq.save_fl.func),
			      paravirt_clobber(PVOP_CALLEE_CLOBBERS)
			    : "memory", "cc");
	return __eax;
}

And then we have to teach objtool how to deal with conflicting
alternatives...

That would remove most (all, if we can figure out a form that deals with
the spinlock fragments) of paravirt_patch.c

Hmm?
