linux-kernel - Re: [RFC PATCH 2/3] Fix: sched/membarrier: READ_ONCE p->mm in membarrier_global

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wiY-Lh1SvzpG2OPh_DUNJMeTTSyfNmx=ALz1UuZ4EiC=g@mail.gmail.com>
Date:   Tue, 3 Sep 2019 09:23:47 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Russell King - ARM Linux admin <linux@...linux.org.uk>,
        Chris Metcalf <cmetcalf@...hip.com>,
        Christoph Lameter <cl@...ux.com>,
        Kirill Tkhai <tkhai@...dex.ru>, Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 2/3] Fix: sched/membarrier: READ_ONCE p->mm in membarrier_global_expedited

On Tue, Sep 3, 2019 at 9:00 AM Mathieu Desnoyers
<mathieu.desnoyers@...icios.com> wrote:
>
> Due to the lack of READ_ONCE() on p->mm, this code can in fact turn into
> a NULL deref when we hit do_exit() around exit_mm(). The first p->mm
> read is before and sees !NULL, the second is after and does observe
> NULL, which triggers a null pointer dereference.

This is horribly ugly, and I don't think it is necessary.

The way to fix the problem you are addressing in patches 2-3 is to
move the MEMBARRIER_STATE_GLOBAL_EXPEDITED flag from the mm struct to
the task struct, and avoiding the whole issue with "mm may be released
at any point" that way.

Now, your reaction will be "but lots of threads can share an 'mm', so
we can't do that", but that doesn't seem to be true. Looking at the
place that _sets_ this, you already handle the single-thread cases
specially, and the multiple threads has to do a "synchronize_rcu()".
You might as well either walk the current CPU's and set it in all
threads where the thread->mm matches the mm. And then you make the
scheduler set the bit on newly scheduled entities.

NOTE! When you walk all current cpu's in
membarrier_register_global_expedited(), you only look at the
'task->mm' _value_, you don't dereference it. And that's ok, because
'task' itself is stable, it's just mm that can go away.

Wouldn't that solve the issue much more cleanly?

             Linus