[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 11 Nov 2020 12:13:28 -0600
From: Josh Poimboeuf <jpoimboe@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Shinichiro Kawasaki <shinichiro.kawasaki@....com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Nicholas Piggin <npiggin@...il.com>,
Damien Le Moal <Damien.LeMoal@....com>,
andrew.cooper3@...rix.com, jgross@...e.com
Subject: Re: WARNING: can't access registers at asm_common_interrupt
On Wed, Nov 11, 2020 at 06:47:36PM +0100, Peter Zijlstra wrote:
> This is PARAVIRT_XXL only, which is a Xen special. My preference, as
> always, is to kill it... Sadly the Xen people have a different opinion.
That would be soooo nice... then we could get rid of paravirt patching
altogether and replace it with static calls.
> > Objtool doesn't know about the pushf/pop paravirt patch, so ORC gets
> > confused by the changed stack layout.
> >
> > I'm thinking we either need to teach objtool how to deal with
> > save_fl/restore_fl patches, or we need to just get rid of those nasty
> > patches somehow. Peter, any thoughts?
>
> Don't use Xen? ;-)
>
> So with PARAVIRT_XXL the compiler will emit something like:
>
> "CALL *pvops.save_fl"
>
> Which we then overwrite at runtime with "pushf; pop %[re]ax" and a few
> NOPs.
>
> Now, objtool understands alternatives, and ensures they have the same
> stack layout, it has no chance in hell of understanding this, simply
> because paravirt_patch.c is magic.
>
> I don't have any immediate clever ideas, but let me ponder it a wee bit.
>
> ....
>
> Something really disguisting we could do is recognise the indirect call
> offset and emit an extra ORC entry for RIP+1. So the cases are:
>
> CALL *pv_ops.save_fl -- 7 bytes IIRC
> CALL $imm; -- 5 bytes
> PUSHF; POP %[RE]AX -- 2 bytes
>
> so the RIP+1 (the POP insn) will only ever exist in this case. The
> indirect and direct call cases would never land on that IP.
I had a similar idea, and a bit of deja vu - we may have talked about
this before. At least I know we talked about doing something similar
for alternatives which muck with the stack.
> > It looks like 044d0d6de9f5 ("lockdep: Only trace IRQ edges") is making
> > the problem more likely, by adding the irqs_disabled() check for every
> > local_irq_disable().
> >
> > Also - Peter, Nicholas - is that irqs_disabled() check really necessary
> > in local_irq_disable()? Presumably irqs would typically be be enabled
> > before calling it?
>
> Yeah, so it's all a giant can of worms that; also see:
>
> https://lkml.kernel.org/r/20200821084738.508092956@infradead.org
>
> The basic idea is to only trace edges, ie. when the hardware state
> actually changes. Sadly this means doing a pushf/pop before the cli.
> Ideally CLI would store the old IF in CF or something like that, but
> alas.
Right, that makes sense for save/restore, but is the disabled check
really needed for local_irq_disable()? Wouldn't that always be an edge?
And anyway I don't see a similar check for local_irq_enable().
--
Josh
Powered by blists - more mailing lists