Date:	Mon, 17 Jun 2013 11:20:23 -0700
From:	Tejun Heo <tj@...nel.org>
To:	Rusty Russell <rusty@...tcorp.com.au>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Kent Overstreet <koverstreet@...gle.com>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: A question on RCU vs. preempt-RCU

Hello, Rusty.

On Sun, Jun 16, 2013 at 04:16:15PM +0930, Rusty Russell wrote:
> > For most use cases, the trade-off should be fine.  With any kind of
> > cross-cpu traffic, which there usually will be, it should be an easy
> > win for the percpu-refcount even when CONFIG_PREEMPT; however, I've
> > been looking to replace the module ref with the generic one and the
> > performance degradation there has low but existing possibility of
> > being noticeable in some edge use cases.
> 
> I'm confused: is it actually 10% slower than the existing module
> refcount code, or 10% slower than atomic inc?

Heh, sorry about the confusion.  I was comparing percpu_ref to
atomic_t and then worrying about the RCU flipping overhead as it
definitely seemed higher than flipping preemption.  As I wrote in a
reply to Paul, if I compare percpu_ref using normal RCU against
percpu_ref using RCU-sched, the performance difference is around 18%
in favor of RCU-sched.

> CONFIG_PREEMPT, now with more preempt!  Sure, that has a cost, but
> you're arguably fixing a bug.

It seems that RCU-sched is the right flavor for percpu_ref.  In
theory, we shouldn't see any performance degradation when converting
module ref to percpu_ref.
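
As a rough illustration of why the get/put fast path is so sensitive
to the read-side cost, here is a userspace model of a percpu
refcount.  It is only a sketch, not the actual percpu_ref code: the
struct, the function names and NR_CPUS are hypothetical, the "cpu"
argument stands in for the current CPU, and the comments mark where
the kernel would need an RCU-sched read-side section and a grace
period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4			/* hypothetical CPU count for the model */

struct pcpu_ref_model {
	long pcpu_count[NR_CPUS];	/* per-CPU deltas, may go negative */
	atomic_long count;		/* shared count, used after kill */
	bool percpu_mode;		/* true until ref_kill() */
};

static void ref_init(struct pcpu_ref_model *r)
{
	for (int i = 0; i < NR_CPUS; i++)
		r->pcpu_count[i] = 0;
	atomic_init(&r->count, 1);	/* the initial reference */
	r->percpu_mode = true;
}

static void ref_get(struct pcpu_ref_model *r, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	/* kernel: the read-side critical section would start here */
	if (r->percpu_mode)
		r->pcpu_count[cpu]++;	/* no atomic op, no shared cacheline */
	else
		atomic_fetch_add(&r->count, 1);
	/* kernel: ... and end here */
}

/* Returns true when the last reference was dropped. */
static bool ref_put(struct pcpu_ref_model *r, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	if (r->percpu_mode) {
		r->pcpu_count[cpu]--;	/* may differ from the CPU that did the get */
		return false;		/* cannot reach zero before kill */
	}
	return atomic_fetch_sub(&r->count, 1) == 1;
}

/* Switch to the shared counter and fold the per-CPU deltas into it. */
static void ref_kill(struct pcpu_ref_model *r)
{
	long sum = 0;

	r->percpu_mode = false;
	/* kernel: wait for a grace period so no fast path is in flight */
	for (int i = 0; i < NR_CPUS; i++) {
		sum += r->pcpu_count[i];
		r->pcpu_count[i] = 0;
	}
	atomic_fetch_add(&r->count, sum);
	/* the caller still has to drop the initial reference */
}
```

In percpu mode the only per-operation cost besides the increment is
entering and leaving the read-side section, which is why the choice
of RCU flavor shows up directly in the numbers.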

> If we want to improve CONFIG_PREEMPT performance, we can probably use a
> trick I wanted to try long ago:

So, this is a slight digression.

> 1) Use a per-cpu counter rather than a per-task counter for preempt.
> 2) Lay out preempt_counter so it covers NR_CPU pages, one per page.
> 3) When you want to preempt a CPU and counter isn't zero, make the page RO.
> 4) Handle preemption enable in the fault handler.
> 
> Then there's no branch in preempt_enable().

But yeah, interesting trick.  We'll be doing IPIs, flushing the TLB
and taking faults until it hits zero.  It'll all depend on the
frequency of preemption but given that branches don't tend to be too
expensive on modern processors, maybe it'd be a bit too hairy for
possibly marginal gain?
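
For reference, the four steps quoted above can be modeled in
userspace on Linux with mmap(), mprotect() and a SIGSEGV handler.
This is a simplified sketch under several assumptions: a single
counter instead of one page per CPU, hypothetical function names
throughout, and a handler that just makes the page writable again
and records the event, where the real thing would keep faulting until
the count hits zero and then reschedule.  Calling mprotect() from a
signal handler is not guaranteed by POSIX, though it is a plain
system call on Linux.

```c
#define _DEFAULT_SOURCE			/* for MAP_ANONYMOUS */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile int *preempt_count;	/* sits alone on its own page */
static size_t page_size;
static volatile sig_atomic_t resched_faults;	/* times the fault path ran */

/* Step 4: the write to a read-only counter page lands here. */
static void fault_handler(int sig, siginfo_t *si, void *ctx)
{
	char *addr = (char *)si->si_addr;
	char *base = (char *)(void *)preempt_count;

	(void)sig;
	(void)ctx;
	if (addr < base || addr >= base + page_size)
		_exit(1);		/* unrelated fault, give up */
	resched_faults++;		/* the kernel would reschedule at zero */
	mprotect(base, page_size, PROT_READ | PROT_WRITE);
	/* returning restarts the faulting store, which now succeeds */
}

static int preempt_model_init(void)
{
	struct sigaction sa;
	void *p;

	page_size = (size_t)sysconf(_SC_PAGESIZE);
	p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	preempt_count = p;		/* anonymous memory starts out zeroed */

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = fault_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}

static inline void preempt_disable_model(void)
{
	(*preempt_count)++;
}

/* No test for a pending reschedule: the page protection does that job. */
static inline void preempt_enable_model(void)
{
	(*preempt_count)--;
}

/* Step 3: returns 0 if preemption can happen now, 1 if it was deferred. */
static int request_preempt_model(void)
{
	if (*preempt_count == 0)
		return 0;
	if (mprotect((void *)preempt_count, page_size, PROT_READ) != 0)
		return -1;
	return 1;
}
```

The enable path has no conditional, but every deferred preemption
costs a page protection change plus at least one fault, which is the
trade-off in question.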

Thanks.

-- 
tejun