lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1265047203.29013.69.camel@gandalf.stny.rr.com>
Date:	Mon, 01 Feb 2010 13:00:03 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	akpm@...ux-foundation.org, Ingo Molnar <mingo@...e.hu>,
	linux-kernel@...r.kernel.org,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Nicholas Miell <nmiell@...cast.net>, laijs@...fujitsu.com,
	dipankar@...ibm.com, josh@...htriplett.org, dvhltc@...ibm.com,
	niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
	Valdis.Kletnieks@...edu, dhowells@...hat.com
Subject: Re: [patch 2/3] scheduler: add full memory barriers upon task
 switch at runqueue lock/unlock

On Mon, 2010-02-01 at 12:45 -0500, Mathieu Desnoyers wrote:
> * Linus Torvalds (torvalds@...ux-foundation.org) wrote:

> It's probably worthwhile to summarize here the mb-related discussions we
> had in the previous RFC rounds.

Actually it is probably worthwhile to include this explanation in the
change log. Can't expect everyone to have read a kernel thread that
contains hundreds of replies.

> 
> Here is the detailed execution scenario showing the race. The race with
> unlock showing that we need to order user-space loads before mm_cpumask
> store. This is showing an execution sequence involving the userspace RCU
> library and the Linux kernel with sys_membarrier(). As we see below, the
> lack of ordering between "cpumask_clear" and user-space execution
> creates a race with the scheduler where sys_membarrier() incorrectly
> considers CPU 1 as not needing a membarrier_ipi.
> 
> 
>         CPU 0                   CPU 1
>         --------------          --------------
>         <user space>            <user space> (already in read-side C.S.)
>                                 obj = rcu_dereference(list->next);
>                                   -> load list->next
>                                 copy = obj->foo;
>                                 rcu_read_unlock();
>                                   -> store to urcu_reader.ctr
>                                 <urcu_reader.ctr store is globally visible>
>         list_del(obj);
>                                 <preempt>
>                                 <kernel space>
> 
>                                 schedule();
>                                 cpumask_clear(mm_cpumask, cpu);
> 
>         sys_membarrier();
>           <kernel space>
>           smp_mb()
>           iterates on mm_cpumask
>           smp_mb()
>         <user space>
>         set global g.p. (urcu_gp_ctr) phase to 1
>         wait for all urcu_reader.ctr in phase 0
>         set global g.p. (urcu_gp_ctr) phase to 0
>         wait for all urcu_reader.ctr in phase 1
>         sys_membarrier();
>           <kernel space>
>           smp_mb()
>           iterates on mm_cpumask
>           smp_mb()
>         <user space>
>         free(obj);
>                                 <list->next load hits memory>
>                                 <obj->foo load hits memory> <- corruption
> 
> 
> There is a similar race scenario (symmetric to the one above) between
> cpumask_set and a following user-space rcu_read_lock(), which I could
> provide upon request.

If it would help others to understand, it would be worthwhile in
explaining that too. So, I'm requesting it ;-)

> 
> This results in the following comments around smp_mb() in
> sys_membarrier:
> 
> First sys_membarrier smp_mb():
> 
>         /*
>          * Memory barrier on the caller thread before sending first
>          * IPI. Orders all memory accesses prior to sys_membarrier() before
>          * mm_cpumask read and membarrier_ipi executions.
>          * Matches memory barriers before and after mm_cpumask updates.
>          */

The above comment does not really explain what it is protecting. It just
states that it serializes memory access. Well, duh, that's what smp_mb()
does ;-)  It's the equivalent to i++; /* increment i */

Basically, the comment should explain "why" the memory barrier is
needed.

> 
> Second sys_membarrier smp_mb():
> 
>         /*
>          * Memory barrier on the caller thread after we finished
>          * waiting for the last IPI. Orders mm_cpumask read and membarrier_ipi
>          * executions before memory accesses following sys_membarrier()
>          * execution.
>          * Matches memory barriers before and after mm_cpumask updates.
>          */

Ditto.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ