Message-ID: <CAHk-=wgSsfo89ESHcngvPCkQSh_YAJG-0g7fupb+Uv0E1d_EcQ@mail.gmail.com>
Date: Mon, 16 Oct 2023 12:24:37 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Uros Bizjak <ubizjak@...il.com>
Cc: Nadav Amit <namit@...are.com>,
"the arch/x86 maintainers" <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andy Lutomirski <luto@...nel.org>,
Brian Gerst <brgerst@...il.com>,
Denys Vlasenko <dvlasenk@...hat.com>,
"H . Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()
On Mon, 16 Oct 2023 at 11:53, Uros Bizjak <ubizjak@...il.com> wrote:
>
> Unfortunately, it does not work and dies early in the boot with:
Side note: build the kernel with debug info (the limited form is
sufficient), and then run oopses through
./scripts/decode_stacktrace.sh
to get much nicer oops information that has line numbers and inlining
information in the backtrace.
> [ 4.939358] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [ 4.940090] RIP: 0010:begin_new_exec+0x8f2/0xa30
> [ 4.940090] Code: 31 f6 e8 c1 49 f9 ff e9 3c fa ff ff 31 f6 4c 89
> ef e8 b2 4a f9 ff e9 19 fa ff ff 31 f6 4c 89 ef e8 23 4a f9 ff e9 ea
> fa ff ff <f0> 41 ff 0c 24 0f
> 85 55 fb ff ff 4c 89 e7 e8 4b 02 df ff e9 48 fb
That decodes to
   0:   31 f6                   xor    %esi,%esi
   2:   e8 c1 49 f9 ff          call   0xfffffffffff949c8
   7:   e9 3c fa ff ff          jmp    0xfffffffffffffa48
   c:   31 f6                   xor    %esi,%esi
   e:   4c 89 ef                mov    %r13,%rdi
  11:   e8 b2 4a f9 ff          call   0xfffffffffff94ac8
  16:   e9 19 fa ff ff          jmp    0xfffffffffffffa34
  1b:   31 f6                   xor    %esi,%esi
  1d:   4c 89 ef                mov    %r13,%rdi
  20:   e8 23 4a f9 ff          call   0xfffffffffff94a48
  25:   e9 ea fa ff ff          jmp    0xfffffffffffffb14
  2a:*  f0 41 ff 0c 24          lock decl (%r12)        <-- trapping instruction
  2f:   0f 85 55 fb ff ff       jne    0xfffffffffffffb8a
  35:   4c 89 e7                mov    %r12,%rdi
  38:   e8 4b 02 df ff          call   0xffffffffffdf0288
but without a nicer backtrace it's nasty to guess where this is.
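
The call and jmp targets in that decode look like huge negative addresses
because the bytes were disassembled as if they started at address 0, so each
target is just "offset of next instruction + signed 32-bit displacement". A
minimal sketch of that arithmetic, assuming a 5-byte "call rel32" (opcode
0xe8); the helper name call_rel32_target() is made up for illustration and is
not a kernel or binutils function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the target of a 5-byte "call rel32"
 * (opcode 0xe8) that sits at byte offset `off` in the decoded dump.
 */
static uint64_t call_rel32_target(uint64_t off, const uint8_t insn[5])
{
	assert(insn[0] == 0xe8);

	/* The displacement is stored little-endian after the opcode. */
	uint32_t raw = (uint32_t)insn[1] |
		       (uint32_t)insn[2] << 8 |
		       (uint32_t)insn[3] << 16 |
		       (uint32_t)insn[4] << 24;

	/* Sign-extend the 32-bit displacement to 64 bits. */
	int64_t rel = (int64_t)raw - ((raw & 0x80000000u) ? ((int64_t)1 << 32) : 0);

	/* The displacement is relative to the end of the instruction. */
	return off + 5 + (uint64_t)rel;
}
```

Feeding it the last line of the dump (offset 0x38, bytes e8 4b 02 df ff)
gives 0xffffffffffdf0288, matching the decode. The real target is only
known once the true address of the code is known, which is why the line
numbers from a debug-info build help so much.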
The "lock decl ; jne" is a good hint, though - that sequence is most
definitely "atomic_dec_and_test()".
And that in turn means that it's almost certainly mmdrop(), which is
        if (unlikely(atomic_dec_and_test(&mm->mm_count)))
                __mmdrop(mm);
where that
35: 4c 89 e7 mov %r12,%rdi
38: e8 4b 02 df ff call 0xffffffffffdf0288
is exactly the unlikely "__mmdrop(mm)" part (and gcc decided to make
the likely branch a branch-out for some reason - presumably with the
inlining the code around it meant that was the better layout - maybe
this was all inside another "unlikely()" branch).
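
To make the "decrement, then branch on zero" shape concrete, here is a
userspace sketch using C11 atomics. It is only a model under stated
assumptions: struct mm_model, dec_and_test() and mmdrop_model() are
hypothetical stand-ins, not the kernel's struct mm_struct,
atomic_dec_and_test() or mmdrop().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the refcount field of the kernel's mm. */
struct mm_model {
	atomic_int mm_count;
};

/*
 * Models the dec-and-test semantics: atomically decrement and report
 * whether the new value is zero. atomic_fetch_sub() returns the old
 * value, so "old == 1" means the counter just hit zero.
 */
static bool dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/*
 * Models the mmdrop() shape quoted above. Returns true when the last
 * reference was dropped, i.e. where the real code would call __mmdrop().
 * A NULL mm would fault on the atomic decrement, which is the situation
 * the oops shows.
 */
static bool mmdrop_model(struct mm_model *mm)
{
	assert(mm != NULL);
	return dec_and_test(&mm->mm_count);
}
```

With the count initialized to 2, the first mmdrop_model() call returns
false and the second returns true. The point of the model is that the
very first memory access is the atomic decrement itself, so a NULL 'mm'
traps right on the "lock decl" and never reaches the branch.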
And if I read that right, this has all been inlined from
begin_new_exec() -> exec_mmap() -> mmdrop_lazy_tlb().
Now, how and why 'mm' would be NULL in that path, and why any
'current' reloading optimization would matter in all this, I very much
can't see. The call site in begin_new_exec() is
        /*
         * Release all of the old mmap stuff
         */
        acct_arg_size(bprm, 0);
        retval = exec_mmap(bprm->mm);
        if (retval)
                goto out;
        bprm->mm = NULL;
and "bprm->mm" is most definitely non-NULL there because we earlier did
        retval = set_mm_exe_file(bprm->mm, bprm->file);
using it, and that would have oopsed had bprm->mm been NULL then.
So I suspect the problem happened much earlier, caused some nasty
internal corruption, and the odd 'mm is NULL' is just a symptom.
So there's some serious corruption there, but from the oops itself I
can't tell the source. I guess if we get 'current' wrong anywhere, all
bets are off.
Linus