Message-ID: <CAFULd4bLEU-tBC8dO1wf66UAxQ2d1HxQ=D6wvtHZfdQCKhnpkw@mail.gmail.com>
Date:   Wed, 18 Oct 2023 09:46:26 +0200
From:   Uros Bizjak <ubizjak@...il.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Nadav Amit <namit@...are.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andy Lutomirski <luto@...nel.org>,
        Brian Gerst <brgerst@...il.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>
Subject: Re: [PATCH v2 -tip] x86/percpu: Use C for arch_raw_cpu_ptr()

On Tue, Oct 17, 2023 at 11:53 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Tue, 17 Oct 2023 at 14:06, Uros Bizjak <ubizjak@...il.com> wrote:
> >
> > But adding the attached patch on top of both patches boots OK.
>
> Funky.
>
> Mind adding a
>
>         WARN_ON_ONCE(!active_mm);
>
> to there to give a nice backtrace for the odd NULL case.

[    4.907840] Call Trace:
[    4.908909]  <TASK>
[    4.909858]  ? __warn+0x7b/0x120
[    4.911108]  ? begin_new_exec+0x90f/0xa30
[    4.912602]  ? report_bug+0x164/0x190
[    4.913929]  ? handle_bug+0x3c/0x70
[    4.915179]  ? exc_invalid_op+0x17/0x70
[    4.916569]  ? asm_exc_invalid_op+0x1a/0x20
[    4.917969]  ? begin_new_exec+0x90f/0xa30
[    4.919303]  ? begin_new_exec+0x3ce/0xa30
[    4.920667]  ? load_elf_phdrs+0x67/0xb0
[    4.921935]  load_elf_binary+0x2bb/0x1770
[    4.923262]  ? __kernel_read+0x136/0x2d0
[    4.924563]  bprm_execve+0x277/0x630
[    4.925703]  kernel_execve+0x145/0x1a0
[    4.926890]  call_usermodehelper_exec_async+0xcb/0x180
[    4.928408]  ? __pfx_call_usermodehelper_exec_async+0x10/0x10
[    4.930515]  ret_from_fork+0x2f/0x50
[    4.931894]  ? __pfx_call_usermodehelper_exec_async+0x10/0x10
[    4.933941]  ret_from_fork_asm+0x1b/0x30
[    4.935371]  </TASK>
[    4.936212] ---[ end trace 0000000000000000 ]---

>
> That code *is* related to 'current', in how we do
>
>         tsk = current;
> ...
>         local_irq_disable();
>         active_mm = tsk->active_mm;
>         tsk->active_mm = mm;
>         tsk->mm = mm;
> ...
>         activate_mm(active_mm, mm);
> ...
>         mmdrop_lazy_tlb(active_mm);
>
> but I don't see how 'active_mm' could *possibly* be validly NULL
> here, and why caching 'current' would matter and change it.

I have also added "__attribute__((optimize(0)))" to exec_mmap() to
rule out compiler optimization bugs. The result was the same oops in
mmdrop_lazy_tlb().

Also, when using WARN_ON instead of WARN_ON_ONCE, it triggers only
once during the whole boot, with the above trace.

Another observation: adding WARN_ON to the top of exec_mmap:

    WARN_ON(!current->active_mm);
    /* Notify parent that we're no longer interested in the old VM */
    tsk = current;
    old_mm = current->mm;

also triggers the WARN, suggesting that current->active_mm is already
NULL on entry to the function.
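
The failure mode can be shown with a minimal userspace sketch. It is not
kernel code: model_mm, model_task and model_exec_mmap() are hypothetical
stand-ins, and the reference count is a simplification of what
mmdrop_lazy_tlb() releases. It only illustrates that a task entering the
swap with a NULL active_mm has nothing valid to drop afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's mm_struct. */
struct model_mm {
    int lazy_refs;               /* simplified count of lazy references */
};

/* Hypothetical stand-in for the two task_struct fields involved. */
struct model_task {
    struct model_mm *mm;         /* owned address space, NULL for kernel threads */
    struct model_mm *active_mm;  /* address space in use, expected non-NULL */
};

/*
 * Models the swap sequence quoted in this thread. Returns false when the
 * previous active_mm is NULL, which is the condition the added WARN_ON
 * reports; this sketch refuses to dereference it instead of crashing.
 */
static bool model_exec_mmap(struct model_task *tsk, struct model_mm *new_mm)
{
    struct model_mm *old_active;

    assert(tsk != NULL && new_mm != NULL);

    old_active = tsk->active_mm;
    tsk->active_mm = new_mm;
    tsk->mm = new_mm;

    if (old_active == NULL)
        return false;            /* nothing valid to drop */

    old_active->lazy_refs--;     /* stands in for mmdrop_lazy_tlb() */
    return true;
}
```

In this model, a task that borrowed an address space before exec drops
one reference and succeeds, while a task entering with active_mm == NULL
is rejected before the drop.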

Uros.

> Strange.
>
> Hmm. We do set
>
>         tsk->active_mm = NULL;
>
> in copy_mm(), and then we have that odd kernel thread case:
>
>         /*
>          * Are we cloning a kernel thread?
>          *
>          * We need to steal a active VM for that..
>          */
>         oldmm = current->mm;
>         if (!oldmm)
>                 return 0;
>
> but none of this should even matter, because by the time we actually
> *schedule* that thread, we'll set active_mm to the right thing.
>
> Can anybody see what's up?
>
>              Linus
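
The expectation described in the quoted text, that a kernel thread
created with a NULL active_mm gets a valid one by the time it is
scheduled, can be sketched in userspace as follows. This is a simplified
model under stated assumptions, not the scheduler's implementation:
sketch_mm, sketch_task, sketch_copy_mm_kthread() and sketch_switch_to()
are hypothetical names, and locking and reference counting are omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's mm_struct. */
struct sketch_mm {
    int id;
};

/* Hypothetical stand-in for the two task_struct fields involved. */
struct sketch_task {
    struct sketch_mm *mm;        /* NULL for kernel threads */
    struct sketch_mm *active_mm; /* borrowed address space for kernel threads */
};

/* Fork-time state of a new kernel thread: no mm and no active_mm yet. */
static void sketch_copy_mm_kthread(struct sketch_task *child)
{
    child->mm = NULL;
    child->active_mm = NULL;
}

/*
 * Models the switch-in step: a task without its own mm borrows the
 * address space the previous task was using.
 */
static void sketch_switch_to(const struct sketch_task *prev,
                             struct sketch_task *next)
{
    assert(prev->active_mm != NULL);

    if (next->mm == NULL)
        next->active_mm = prev->active_mm;
}
```

Under this model, any code running in the kernel thread after its first
switch-in sees a non-NULL active_mm, which is why a NULL value on entry
to exec_mmap() is unexpected.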