Message-ID: <1604807537.1565.1567610340030.JavaMail.zimbra@efficios.com>
Date:   Wed, 4 Sep 2019 11:19:00 -0400 (EDT)
From:   Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        paulmck <paulmck@...ux.ibm.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        "Russell King, ARM Linux" <linux@...linux.org.uk>,
        Chris Metcalf <cmetcalf@...hip.com>,
        Chris Lameter <cl@...ux.com>, Kirill Tkhai <tkhai@...dex.ru>,
        Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC PATCH 1/2] Fix: sched/membarrier: p->mm->membarrier_state
 racy load

----- On Sep 3, 2019, at 4:36 PM, Linus Torvalds torvalds@...ux-foundation.org wrote:

> On Tue, Sep 3, 2019 at 1:25 PM Peter Zijlstra <peterz@...radead.org> wrote:
>>
>> Why can't we frob this state into a line/word we already have to
>> unconditionally touch, like the thread_info::flags word for example.
> 
> I agree, but we don't have any easily used flags left, I think.
> 
> But yes, it would be better to not have membarrier always dirty
> another cacheline in the scheduler. So instead of
> 
>        atomic_set(&t->membarrier_state,
>                   atomic_read(&t->mm->membarrier_state));
> 
> it might be better to do something like
> 
>        if (mm->membarrier_state)
>                atomic_or(&t->membarrier_state, mm->membarrier_state);
> 
> or something along those lines - I think we've already brought in the
> 'mm' struct into the cache anyway, and we'd not do the write (and
> dirty the destination cacheline) for the common case of no membarrier
> usage.
> 
> But yes, it would be better still if we can re-use some already dirty
> cache state.

Considering the alternative proposed by PeterZ, which is to iterate over
all processes/threads from an unprivileged process, I would be tempted
to put some more thought into the mm->membarrier_state cache-line. Do
we expect it to be typically hot? Is there anything we can do to move
this field into a typically hot mm cacheline?

I agree with your approach of typically just loading that field
(no store in the common case).
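
To make the load-only common case concrete, here is a minimal user-space
sketch using C11 atomics rather than the kernel's atomic_t API. The names
struct mm_model, struct task_model and sync_membarrier_state() are
hypothetical stand-ins for illustration, not the actual patch:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space models of the kernel structures discussed above. */
struct mm_model {
	atomic_int membarrier_state;
};

struct task_model {
	atomic_int membarrier_state;
};

/*
 * Propagate the mm state into the task state.
 * Common case (no membarrier usage): a single load, no store, so the
 * task's cacheline is not dirtied. Returns true only when a write happened.
 */
bool sync_membarrier_state(struct task_model *t, struct mm_model *mm)
{
	int state = atomic_load_explicit(&mm->membarrier_state,
					 memory_order_relaxed);

	if (!state)
		return false;	/* nothing registered: skip the store */

	atomic_fetch_or_explicit(&t->membarrier_state, state,
				 memory_order_relaxed);
	return true;
}
```

The relaxed ordering here is an assumption of the sketch; it only models
the cacheline behaviour, not the ordering the scheduler has to provide.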

> 
> I wonder if the easiest model might be to just use a percpu variable
> instead for the membarrier stuff? It's not like it has to be in
> 'struct task_struct' at all, I think. We only care about the current
> runqueues, and those are percpu anyway.

One issue here is that membarrier iterates over all runqueues without
grabbing any runqueue lock. If we copy that state from mm to rq on
sched switch prepare, we would need to ensure we have the proper
memory barriers between:

prior user-space memory accesses  /  setting the runqueue membarrier state

and

setting the runqueue membarrier state / following user-space memory accesses

Copying the membarrier state into the task struct leverages the fact that
we have documented and guaranteed those barriers around the rq->curr update
in the scheduler.
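
For illustration only, the ordering a per-runqueue copy would need can be
modelled in user-space C11 as a store bracketed by two full fences. In the
kernel this would be smp_mb() or barriers implied by existing scheduler
code; struct rq_model and both function names below are hypothetical:

```c
#include <stdatomic.h>

/* Hypothetical user-space model of a runqueue carrying membarrier state. */
struct rq_model {
	atomic_int membarrier_state;
};

/* Publish the state with a full fence on each side of the store. */
void rq_publish_membarrier_state(struct rq_model *rq, int state)
{
	/* Order prior user-space memory accesses before the state update. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&rq->membarrier_state, state,
			      memory_order_relaxed);

	/* Order the state update before following user-space memory accesses. */
	atomic_thread_fence(memory_order_seq_cst);
}

/* Lockless read, as a membarrier-side scan of all runqueues would do. */
int rq_read_membarrier_state(struct rq_model *rq)
{
	return atomic_load_explicit(&rq->membarrier_state,
				    memory_order_relaxed);
}
```

The sketch covers only the writer's side; the reader would still need its
own barriers around the scan, which are not shown here.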

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
