[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190507095103.GP2606@hirez.programming.kicks-ass.net>
Date: Tue, 7 May 2019 11:51:03 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Steven Rostedt <rostedt@...dmis.org>,
Andy Lutomirski <luto@...capital.net>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Nicolai Stange <nstange@...e.de>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
the arch/x86 maintainers <x86@...nel.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Jiri Kosina <jikos@...nel.org>,
Miroslav Benes <mbenes@...e.cz>,
Petr Mladek <pmladek@...e.com>,
Joe Lawrence <joe.lawrence@...hat.com>,
Shuah Khan <shuah@...nel.org>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Mimi Zohar <zohar@...ux.ibm.com>,
Juergen Gross <jgross@...e.com>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Nayna Jain <nayna@...ux.ibm.com>,
Masahiro Yamada <yamada.masahiro@...ionext.com>,
Joerg Roedel <jroedel@...e.de>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>, stable <stable@...r.kernel.org>,
Masami Hiramatsu <mhiramat@...nel.org>
Subject: Re: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions
On Mon, May 06, 2019 at 07:22:06PM -0700, Linus Torvalds wrote:
> We do *not* have very strict guarantees for D$-vs-I$ coherency on x86,
> but we *do* have very strict guarantees for D$-vs-D$ coherency. And so
> we could use the D$ coherency to give us atomicity guarantees for
> loading and storing the instruction offset for instruction emulation,
> in ways we can *not* use the D$-to-I$ guarantees and just executing it
> directly.
>
> So while we still need those nasty IPI's to guarantee the D$-vs-I$
> coherency in the "big picture" model and to get the serialization with
> the actual 'int3' exception right, we *could* just do all the other
> parts of the instruction emulation using the D$ coherency.
>
> So we could do the actual "call offset" write with a single atomic
> 4-byte locked cycle (just use "xchg" to write - it's always locked).
> And similarly we could do the call offset *read* with a single locked
> cycle (cmpxchg with a 0 value, for example). It would be atomic even
> if it crosses a cacheline boundary.
Very 'soon', x86 will start to #AC if you do unaligned LOCK prefixed
instructions. The problem is that while aligned LOCK instructions can do
the atomicity with the coherency protocol, unaligned (esp, line or page
boundary crossing ones) needs that bus-lock thing the SDM talks about.
For giggles, write yourself a while(1) loop that XCHGs across a
page-boundary and see what it does to the rest of the system.
So _please_, do not rely on unaligned atomic ops. We really want them to
do the way of the Dodo.
Powered by blists - more mailing lists