Message-ID: <1263926259.4283.757.camel@laptop>
Date:	Tue, 19 Jan 2010 19:37:39 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Oleg Nesterov <oleg@...hat.com>, Ingo Molnar <mingo@...e.hu>,
	akpm@...ux-foundation.org, josh@...htriplett.org,
	tglx@...utronix.de, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	laijs@...fujitsu.com, dipankar@...ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory
 barrier (v5)

On Thu, 2010-01-14 at 14:33 -0500, Mathieu Desnoyers wrote:
> It's a case where CPU 1 switches from our mm to another mm:
> 
>        CPU 0 (membarrier)                  CPU 1 (another mm -our mm)
>        <user-space>                        <user-space>
>                                            <buffered access C.S. data>
>                                            urcu read unlock()
>                                              barrier()
>                                              store local gp
>                                            <kernel-space>

OK, so the question is how we end up here. If it's through interrupt
preemption, I think the interrupt delivery will imply an mb; if it's a
blocking syscall, the set_task_state() mb [*] should be there.

Then we also do:

					clear_tsk_need_resched()

which is an atomic bitop (although it does not imply a full barrier
per se).
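
To illustrate the distinction between "atomic" and "full barrier", here
is a minimal userspace sketch in C11. It is only an analogy, not kernel
code: the names clear_flag_relaxed(), clear_flag_fenced() and
NEED_RESCHED_SKETCH are hypothetical, and C11 fences are assumed to stand
in for the kernel's smp_mb(). The first helper clears a bit atomically
but orders nothing around it; the second brackets the same operation
with full fences.

```c
#include <stdatomic.h>

/* Hypothetical flag bit, chosen only for this illustration. */
#define NEED_RESCHED_SKETCH 0x4UL

/*
 * Atomic read-modify-write with relaxed ordering: the bit is cleared
 * atomically, but surrounding loads and stores may still be reordered
 * across it. Returns the previous flag word.
 */
static unsigned long clear_flag_relaxed(_Atomic unsigned long *flags,
                                        unsigned long bit)
{
        return atomic_fetch_and_explicit(flags, ~bit, memory_order_relaxed);
}

/*
 * Same operation bracketed by sequentially consistent fences, which is
 * the assumed userspace analogue of adding explicit full barriers
 * before and after the bitop. Returns the previous flag word.
 */
static unsigned long clear_flag_fenced(_Atomic unsigned long *flags,
                                       unsigned long bit)
{
        unsigned long old;

        atomic_thread_fence(memory_order_seq_cst);
        old = atomic_fetch_and_explicit(flags, ~bit, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
        return old;
}
```

Both helpers leave the flag word in the same final state; they differ
only in what ordering they guarantee for neighbouring memory accesses.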

>                                            rq->curr = next (1)
>        memory access before membarrier
>        <call sys_membarrier()>
>        smp_mb()
>        mm_cpumask includes CPU 1
>        rcu_read_lock()
>        if (cpu_curr(1)->mm != our mm)
>          skip CPU 1     -> here, rq->curr new version is already visible
>        rcu_read_unlock()
>        smp_mb()
>        <return to user-space>
>        memory access after membarrier
>        -> this is where we allow freeing
>           the old structure although the
>           buffered access C.S. data is
>           still in flight.
>                                            User-space access C.S. data (2)
>                                              (buffer flush)
>                                            switch_mm()
>                                              smp_mb()
>                                              clear_mm_cpumask()
>                                              set_mm_cpumask()
>                                              smp_mb() (by load_cr3() on x86)
>                                            switch_to()
>                                              <buffered current = next>
>                                            <switch back to user-space>
>                                              current = next (1) (buffer flush)
>                                            access critical section data (3)
> 
> As we can see, the reordering of (1) and (2) is problematic, as it lets
> the check skip over a CPU that have global side-effects not committed to
> memory yet. 

Right, this one I get, thanks!


So about that [*], Oleg, kernel/signal.c:SYSCALL_DEFINE0(pause) does:

SYSCALL_DEFINE0(pause)
{
        current->state = TASK_INTERRUPTIBLE;
        schedule();
        return -ERESTARTNOHAND;
}

Isn't that ->state assignment buggy? If so, there appear to be quite a
few such sites, which worries me.
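
For reference, the difference being asked about can be sketched in
userspace C11 as follows. This is an analogy under stated assumptions,
not the kernel implementation: struct task_sketch, set_state_plain() and
set_state_mb() are hypothetical names, the state values are made up, and
a sequentially consistent fence is assumed to play the role of the mb
that set_task_state() provides.

```c
#include <stdatomic.h>

/* Hypothetical state values, for illustration only. */
enum {
        TASK_RUNNING_SKETCH = 0,
        TASK_INTERRUPTIBLE_SKETCH = 1
};

/* Hypothetical stand-in for the task structure's state field. */
struct task_sketch {
        _Atomic long state;
};

/*
 * Analogue of a bare "current->state = x": the store itself happens,
 * but nothing orders it against later loads or stores.
 */
static void set_state_plain(struct task_sketch *t, long s)
{
        atomic_store_explicit(&t->state, s, memory_order_relaxed);
}

/*
 * Analogue of a store followed by a full barrier: later accesses
 * (for example a check of a wakeup condition) cannot be reordered
 * before the state store.
 */
static void set_state_mb(struct task_sketch *t, long s)
{
        atomic_store_explicit(&t->state, s, memory_order_relaxed);
        atomic_thread_fence(memory_order_seq_cst);
}
```

Whether the plain form is actually a bug at a given call site depends on
whether anything after the store needs to be ordered against it; the
sketch only shows what the two forms do and do not guarantee.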

