Message-ID: <CAHk-=whq=+LNHmsde8LaF4pdvKxqKt5GxW+Tq+U35_aDcV0ADg@mail.gmail.com>
Date: Sun, 8 Oct 2023 13:13:54 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Uros Bizjak <ubizjak@...il.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
Andy Lutomirski <luto@...nel.org>,
Ingo Molnar <mingo@...nel.org>, Nadav Amit <namit@...are.com>,
Brian Gerst <brgerst@...il.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
Josh Poimboeuf <jpoimboe@...hat.com>
Subject: Re: [PATCH 4/4] x86/percpu: Use C for percpu read/write accessors

On Sun, 8 Oct 2023 at 12:18, Uros Bizjak <ubizjak@...il.com> wrote:
>
> Let me see what happens here. I have changed *only* raw_cpu_read_8,
> but the GP fault is reported in cpu_init_exception_handling, which
> uses this_cpu_ptr. Please note that all per-cpu initializations go
> through existing {this|raw}_cpu_write.
I think it's an ordering issue, and I think you may hit some issue
with loading TR or the GDT or whatever.
For example, we have this

        set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);

followed by

        asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));

in native_load_tr_desc(), and I think we might want to add a "memory"
clobber to it to make sure it is serialized with any stores to the GDT
entries in question.
I don't think *that* particular thing is the issue (because you kept
the writes as-is and still hit things), but I think it's an example of
some lazy inline asm constraints that could possibly cause problems if
the ordering changes.
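
To make the "memory" clobber point concrete, here is a minimal user-space
sketch. It is not the kernel code: fake_gdt, fake_set_desc(),
fake_load_tr() and fake_install_and_load() are made-up names, the asm
template is empty because "ltr" is privileged, and the "q" constraint is
swapped for "r" so the sketch does not depend on x86. Only the clobber
list is the point.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_GDT_ENTRIES 16

/* Hypothetical stand-in for the GDT; not the kernel's data structure. */
static uint64_t fake_gdt[FAKE_GDT_ENTRIES];

/* Stand-in for set_tss_desc(): a plain C store into the table. */
static void fake_set_desc(unsigned int idx, uint64_t desc)
{
	assert(idx < FAKE_GDT_ENTRIES);
	fake_gdt[idx] = desc;
}

/*
 * Stand-in for the "ltr" asm. "volatile" alone only keeps the asm from
 * being deleted or reordered against other volatile asm. The "memory"
 * clobber additionally tells the compiler that the asm may read or
 * write arbitrary memory, so earlier plain stores (such as the one in
 * fake_set_desc()) have to be emitted before it.
 */
static void fake_load_tr(unsigned short sel)
{
	__asm__ __volatile__("" : : "r" (sel) : "memory");
}

/* Write the descriptor, then "load" it, in that order. */
static uint64_t fake_install_and_load(unsigned int idx, uint64_t desc)
{
	fake_set_desc(idx, desc);
	fake_load_tr((unsigned short)(idx * 8));
	return fake_gdt[idx];
}
```

Without the clobber the compiler is still allowed to sink the table store
below the asm, since nothing in the constraints says the asm looks at
that memory.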
And yes, this code ends up depending on things like
CONFIG_PARAVIRT_XXL for whether it uses the native TR loading or uses
some paravirt version, so config options can make a difference.
Again: I don't think it's that "ltr" instruction. I'm calling it out
just as a "that function does some funky things", and the load TR is
*one* of the funky things, and it looks like it could be the same type
of thing that then causes issues.
Things like CONFIG_SMP might also matter, because the percpu setup is
different. On UP, the *segment* use goes away, but I think the whole
"use inline asm vs regular memory ops" remains (admittedly I did *not*
verify that, I might be speaking out of my *ss).
Your dump does end up being close to a %gs access:
0: 4a 03 04 ed 40 19 15 add -0x7aeae6c0(,%r13,8),%rax
7: 85
8: 48 89 c7 mov %rax,%rdi
b: e8 9c bb ff ff call 0xffffffffffffbbac
10: 48 c7 c0 10 73 02 00 mov $0x27310,%rax
17: 48 ba 00 00 00 00 00 movabs $0xdffffc0000000000,%rdx
1e: fc ff df
21: 48 c1 e8 03 shr $0x3,%rax
25:* 80 3c 10 00 cmpb $0x0,(%rax,%rdx,1) <-- trapping instruction
29: 0f 85 21 05 00 00 jne 0x550
2f: 65 48 8b 05 45 26 f6 mov %gs:0x7ef62645(%rip),%rax # 0x7ef6267c
36: 7e
37: 48 8d 7b 24 lea 0x24(%rbx),%rdi
but I don't know what the "call" before is, so I wasn't able to match
it up with any obvious code in there.
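
For reference, the address arithmetic of the trapping instruction in the
dump above can be reproduced in plain C. This is a sketch, not something
from the thread: the helper names are made up, and reading the
0xdffffc0000000000 constant as a KASAN-style shadow offset is an
assumption based only on its value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective address of "cmpb $0x0,(%rax,%rdx,1)": base + index * 1. */
static uint64_t effective_address(uint64_t base, uint64_t index)
{
	return base + index;
}

/*
 * An address is canonical for va_bits of virtual address space if bits
 * 63..(va_bits - 1) are all zero or all one (48 bits with 4-level
 * paging, 57 bits with 5-level paging).
 */
static bool is_canonical(uint64_t addr, unsigned int va_bits)
{
	uint64_t top = addr >> (va_bits - 1);
	uint64_t all_ones = UINT64_MAX >> (va_bits - 1);

	return top == 0 || top == all_ones;
}

/* Values taken from the dump: %rax = 0x27310 >> 3, %rdx = the movabs constant. */
static uint64_t trapping_address(void)
{
	uint64_t rax = UINT64_C(0x27310) >> 3;
	uint64_t rdx = UINT64_C(0xdffffc0000000000);

	return effective_address(rax, rdx);
}
```

With the values from the dump, trapping_address() comes out as
0xdffffc0000004e62, which is_canonical() rejects for both 48 and 57 bits.
A non-canonical access is reported as a general protection fault rather
than a page fault, which would be consistent with the GP fault you saw.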
Linus