linux-kernel - Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20100107052851.GA12419@feather>
Date:	Wed, 6 Jan 2010 21:28:52 -0800
From:	Josh Triplett <josh@...htriplett.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Cc:	linux-kernel@...r.kernel.org,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>, akpm@...ux-foundation.org,
	tglx@...utronix.de, peterz@...radead.org, rostedt@...dmis.org,
	Valdis.Kletnieks@...edu, dhowells@...hat.com, laijs@...fujitsu.com,
	dipankar@...ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory
 barrier

On Wed, Jan 06, 2010 at 11:40:07PM -0500, Mathieu Desnoyers wrote:
> Here is an implementation of a new system call, sys_membarrier(), which
> executes a memory barrier on all threads of the current process.
> 
> It aims at greatly simplifying and enhancing the current signal-based
> liburcu userspace RCU synchronize_rcu() implementation.
> (found at http://lttng.org/urcu)
> 
> Both the signal-based and the sys_membarrier userspace RCU schemes
> permit us to remove the memory barrier from the userspace RCU
> rcu_read_lock() and rcu_read_unlock() primitives, thus significantly
> accelerating them. These memory barriers are replaced by compiler
> barriers on the read-side, and all matching memory barriers on the
> write-side are turned into an invokation of a memory barrier on all
> active threads in the process. By letting the kernel perform this
> synchronization rather than dumbly sending a signal to every process
> threads (as we currently do), we diminish the number of unnecessary wake
> ups and only issue the memory barriers on active threads. Non-running
> threads do not need to execute such barrier anyway, because these are
> implied by the scheduler context switches.
[...]
> The current implementation simply executes a memory barrier in an IPI
> handler on each active cpu. Going through the hassle of taking run queue
> locks and checking if the thread running on each online CPU belongs to
> the current thread seems more heavyweight than the cost of the IPI
> itself (not measured though).

> --- linux-2.6-lttng.orig/kernel/sched.c	2010-01-06 22:11:32.000000000 -0500
> +++ linux-2.6-lttng/kernel/sched.c	2010-01-06 23:20:42.000000000 -0500
> @@ -10822,6 +10822,36 @@ struct cgroup_subsys cpuacct_subsys = {
>  };
>  #endif	/* CONFIG_CGROUP_CPUACCT */
>  
> +/*
> + * Execute a memory barrier on all CPUs on SMP systems.
> + * Do not rely on implicit barriers in smp_call_function(), just in case they
> + * are ever relaxed in the future.
> + */
> +static void membarrier_ipi(void *unused)
> +{
> +	smp_mb();
> +}
> +
> +/*
> + * sys_membarrier - issue memory barrier on current process running threads
> + *
> + * Execute a memory barrier on all running threads of the current process.
> + * Upon completion, the caller thread is ensured that all process threads
> + * have passed through a state where memory accesses match program order.
> + * (non-running threads are de facto in such a state)
> + *
> + * The current implementation simply executes a memory barrier in an IPI handler
> + * on each active cpu. Going through the hassle of taking run queue locks and
> + * checking if the thread running on each online CPU belongs to the current
> + * thread seems more heavyweight than the cost of the IPI itself.
> + */
> +SYSCALL_DEFINE0(membarrier)
> +{
> +	on_each_cpu(membarrier_ipi, NULL, 1);
> +
> +	return 0;
> +}
> +

Nice idea.  A few things come immediately to mind:

- If !CONFIG_SMP, this syscall should become (more of) a no-op.  Ideally
  even if CONFIG_SMP but running with one CPU.  (If you really wanted to
  go nuts, you could make it a vsyscall that did nothing with 1 CPU, to
  avoid the syscall overhead, but that seems like entirely too much
  trouble.)

- Have you tested what happens if a process does "while(1)
  membarrier();"?  By running on every CPU, including those not owned by
  the current process, this has the potential to make DoS easier,
  particularly on systems with many CPUs.  That gets even worse if a
  process forks multiple threads running that same loop.  Also consider
  that executing an IPI will do work even on a CPU currently running a
  real-time task.

- Rather than groveling through runqueues, could you somehow remotely
  check the value of current?  In theory, a race in doing so wouldn't
  matter; finding something other than the current process should mean
  you don't need to do a barrier, and finding the current process means
  you might need to do a barrier.

- Part of me thinks this ought to become slightly more general, and just
  deliver a signal that the receiving thread could handle as it likes.
  However, that would certainly prove more expensive than this, and I
  don't know that the generality would buy anything.

- Could you somehow register reader threads with the kernel, in a way
  that makes them easy to detect remotely?


- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/