lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130924160359.GA2739@redhat.com>
Date:	Tue, 24 Sep 2013 18:03:59 +0200
From:	Oleg Nesterov <oleg@...hat.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH] hotplug: Optimize {get,put}_online_cpus()

On 09/24, Peter Zijlstra wrote:
>
> +static inline void get_online_cpus(void)
> +{
> +	might_sleep();
> +
> +	if (current->cpuhp_ref++) {
> +		barrier();
> +		return;

I don't undestand this barrier()... we are going to return if we already
hold the lock, do we really need it?

The same for put_online_cpus().

> +void __get_online_cpus(void)
>  {
> -	if (cpu_hotplug.active_writer == current)
> +	if (cpuhp_writer_task == current)
>  		return;

Probably it would be better to simply inc/dec ->cpuhp_ref in
cpu_hotplug_begin/end and remove this check here and in
__put_online_cpus().

This also means that the writer doing get/put_online_cpus() will
always use the fast path, and __cpuhp_writer can go away,
cpuhp_writer_task != NULL can be used instead.

> +     atomic_inc(&cpuhp_waitcount);
> +
> +     /*
> +      * We either call schedule() in the wait, or we'll fall through
> +      * and reschedule on the preempt_enable() in get_online_cpus().
> +      */
> +     preempt_enable_no_resched();
> +     wait_event(cpuhp_wq, !__cpuhp_writer);
> +     preempt_disable();
> +
> +     /*
> +      * It would be possible for cpu_hotplug_done() to complete before
> +      * the atomic_inc() above; in which case there is no writer waiting
> +      * and doing a wakeup would be BAD (tm).
> +      *
> +      * If however we still observe cpuhp_writer_task here we know
> +      * cpu_hotplug_done() is currently stuck waiting for cpuhp_waitcount.
> +      */
> +     if (atomic_dec_and_test(&cpuhp_waitcount) && cpuhp_writer_task)
> +             cpuhp_writer_wake();

cpuhp_writer_wake() here and in __put_online_cpus() looks racy...
Not only cpuhp_writer_wake() can hit cpuhp_writer_task == NULL (we need
something like ACCESS_ONCE()), its task_struct can be already freed/reused
if the writer exits.

And I don't really understand the logic... This slow path succeds without
incrementing any counter (except current->cpuhp_ref)? How the next writer
can notice the fact it should wait for this reader?

>  void cpu_hotplug_done(void)
>  {
> -	cpu_hotplug.active_writer = NULL;
> -	mutex_unlock(&cpu_hotplug.lock);
> +	/* Signal the writer is done */
> +	cpuhp_writer = 0;
> +	wake_up_all(&cpuhp_wq);
> +
> +	/* Wait for any pending readers to be running */
> +	cpuhp_writer_wait(!atomic_read(&cpuhp_waitcount));
> +	cpuhp_writer_task = NULL;

We also need to ensure that the next reader should see all changes
done by the writer, iow this lacks "realease" semantics.




But, Peter, the main question is, why this is better than
percpu_rw_semaphore performance-wise? (Assuming we add
task_struct->cpuhp_ref).

If the writer is pending, percpu_down_read() does

	down_read(&brw->rw_sem);
	atomic_inc(&brw->slow_read_ctr);
	__up_read(&brw->rw_sem);

is it really much worse than wait_event + atomic_dec_and_test?

And! please note that with your implementation the new readers will
be likely blocked while the writer sleeps in synchronize_sched().
This doesn't happen with percpu_rw_semaphore.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ