[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181129121307.12393c57@gandalf.local.home>
Date: Thu, 29 Nov 2018 12:13:07 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Lutomirski <luto@...nel.org>,
the arch/x86 maintainers <x86@...nel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Ard Biesheuvel <ard.biesheuvel@...aro.org>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, mhiramat@...nel.org,
jbaron@...mai.com, Jiri Kosina <jkosina@...e.cz>,
David.Laight@...lab.com, bp@...en8.de, julia@...com,
jeyu@...nel.org, Peter Anvin <hpa@...or.com>
Subject: Re: [PATCH v2 4/4] x86/static_call: Add inline static call
implementation for x86-64
On Thu, 29 Nov 2018 09:02:23 -0800
Andy Lutomirski <luto@...capital.net> wrote:
> > Instead, I'd suggest:
> >
> > - just restart the instruction (with the suggested "ptregs->rip --")
> >
> > - to avoid any "oh, we're not making progress" issues, just fix the
> > instruction yourself to be the right call, by looking it up in the
> > "what needs to be fixed" tables.
> >
> > No?
>
> I thought that too. I think it deadlocks. CPU A does
> text_poke_bp(). CPU B is waiting for a spinlock with IRQs off. CPU
> C holds the spinlock and hits the int3. The int3 never goes away
> because CPU A is waiting for CPU B to handle the sync_core IPI.
I agree that this can happen.
>
> Or do you think we can avoid the IPI while the int3 is there?
No, we really do need to sync after we change the second part of the
command with the int3 on it. Unless there's another way to guarantee
that the full instruction gets seen when we replace the int3 with the
finished command.
To refresh everyone's memory for why we have an IPI (as IPIs have an
implicit memory barrier for the CPU).
We start with:
e8 01 02 03 04
and we want to convert it to: e8 ab cd ef 01
And let's say the instruction crosses a cache line that breaks it into
e8 01 and 02 03 04.
We add the breakpoint:
cc 01 02 03 04
We do a sync (so now everyone should see the break point), because we
don't want to update the second part and another CPU happens to update
the second part of the cache, and might see:
e8 01 cd ef 01
Which would not be good.
And we need another sync after we change the code so all CPUs see
cc ab cd ef 01
Because when we remove the break point, we don't want other CPUs to see
e8 ab 02 03 04
Which would also be bad.
-- Steve
Powered by blists - more mailing lists