linux-kernel - Memory barrier needed with wake_up

Message-ID: <Pine.LNX.4.44L0.1609021341220.2027-100000@iolanthe.rowland.org>
Date:   Fri, 2 Sep 2016 14:10:13 -0400 (EDT)
From:   Alan Stern <stern@...land.harvard.edu>
To:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>
cc:     Felipe Balbi <felipe.balbi@...ux.intel.com>,
        USB list <linux-usb@...r.kernel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Memory barrier needed with wake_up_process()?

Paul, Peter, and Ingo:

This must have come up before, but I don't know what was decided.

Isn't it often true that a memory barrier is needed before a call to 
wake_up_process()?  A typical scenario might look like this:

	CPU 0
	-----
	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);
		if (signal_pending(current))
			break;
		if (wakeup_flag)
			break;
		schedule();
	}
	__set_current_state(TASK_RUNNING);
	wakeup_flag = 0;

	CPU 1
	-----
	wakeup_flag = 1;
	wake_up_process(my_task);

The underlying pattern is:

	CPU 0				CPU 1
	-----				-----
	write current->state		write wakeup_flag
	smp_mb();
	read wakeup_flag		read my_task->state

where set_current_state() does the write to current->state and 
automatically adds the smp_mb(), and wake_up_process() reads 
my_task->state to see whether the task needs to be woken up.

The kerneldoc for wake_up_process() says that it has no implied memory
barrier if it doesn't actually wake anything up.  And even when it
does, the implied barrier is only smp_wmb, not smp_mb.

This is the so-called SB (store buffering) pattern, which is well known to
require a full smp_mb on both sides.  Since wake_up_process() doesn't
include smp_mb(), isn't it correct that the caller must add it
explicitly?

In other words, shouldn't the code for CPU 1 really be:

	wakeup_flag = 1;
	smp_mb();
	wake_up_process(my_task);
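
To make the SB argument concrete, here is a small userspace sketch that
enumerates every interleaving of the two CPUs under a deliberately
simplified store-buffer model. Everything in it is an assumption made for
illustration: the function name lost_wakeup_possible(), the three-event
model (issue store, flush store to memory, load), and the reduction of
smp_mb() to "flush before the following load". It is not the kernel's
memory model and says nothing about any particular architecture; it only
shows that, in such a model, both loads can see the old values unless both
sides have a full barrier.

```c
#include <stdbool.h>

/* Per-CPU events in the toy model (hypothetical names). */
enum ev { EV_STORE, EV_FLUSH, EV_LOAD };

/* Count set bits; used to pick interleavings with three steps per CPU. */
static int count_bits(unsigned v)
{
	int n = 0;

	for (; v; v >>= 1)
		n += v & 1;
	return n;
}

/*
 * CPU 0 stores the task state and then loads wakeup_flag.
 * CPU 1 stores wakeup_flag and then loads the task state.
 * A store sits in the CPU's store buffer until EV_FLUSH makes it visible.
 * With a full barrier the flush must precede the load; without one the
 * load may also happen first.
 *
 * Returns true if some interleaving lets both CPUs read the old value,
 * i.e. the sleeper misses the flag and the waker sees no sleeper.
 */
bool lost_wakeup_possible(bool fence0, bool fence1)
{
	static const enum ev fenced[3]   = { EV_STORE, EV_FLUSH, EV_LOAD };
	static const enum ev unfenced[3] = { EV_STORE, EV_LOAD, EV_FLUSH };

	for (int o0 = 0; o0 <= (fence0 ? 0 : 1); o0++) {
		for (int o1 = 0; o1 <= (fence1 ? 0 : 1); o1++) {
			const enum ev *seq[2] = {
				o0 ? unfenced : fenced,
				o1 ? unfenced : fenced,
			};

			/* Bit i of mask says which CPU runs step i. */
			for (unsigned mask = 0; mask < 64; mask++) {
				int mem[2] = { 0, 0 };	/* values visible in memory */
				int seen[2] = { 0, 0 };	/* value each CPU loaded */
				int pos[2] = { 0, 0 };

				if (count_bits(mask) != 3)
					continue;

				for (int step = 0; step < 6; step++) {
					int cpu = (mask >> step) & 1;

					switch (seq[cpu][pos[cpu]++]) {
					case EV_STORE:	/* value enters the store buffer */
						break;
					case EV_FLUSH:	/* store becomes globally visible */
						mem[cpu] = 1;
						break;
					case EV_LOAD:	/* read the other CPU's variable */
						seen[cpu] = mem[1 - cpu];
						break;
					}
				}
				if (seen[0] == 0 && seen[1] == 0)
					return true;
			}
		}
	}
	return false;
}
```

In this model, lost_wakeup_possible(true, false) corresponds to the
situation described above (a barrier in set_current_state() but none
before wake_up_process()), and lost_wakeup_possible(true, true)
corresponds to the version with the explicit smp_mb() on CPU 1.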

If my reasoning is correct, then why doesn't wake_up_process() include 
this memory barrier automatically, the way set_current_state() does?  
There could be an alternate version (__wake_up_process()) which omits 
the barrier, just like __set_current_state().

Alan Stern