lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101022015845.GA6265@amd>
Date:	Fri, 22 Oct 2010 12:58:45 +1100
From:	Nick Piggin <npiggin@...nel.dk>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Christoph Lameter <cl@...ux.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Nick Piggin <npiggin@...nel.dk>,
	Dave Chinner <david@...morbit.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] percpu_counter : add percpu_counter_add_fast()

On Thu, Oct 21, 2010 at 06:55:16PM -0700, Andrew Morton wrote:
> On Thu, 21 Oct 2010 17:45:36 -0700 Andrew Morton <akpm@...ux-foundation.org> wrote:
> 
> > this_cpu_add_return() isn't really needed in this application.
> > 
> > {
> > 	this_cpu_add(*fbc->counters, amount);
> > 	if (unlikely(abs(this_cpu_read(*fbc->counters)) > fbc->batch))
> > 		out_of_line_stuff();
> > }	
> > 
> > will work just fine.
> 
> Did that.  Got alarmed at a few things.
> 
> The compiler cannot CSE the above code - it has to reload the percpu
> base each time.  Doing it by hand:
> 
> 
> {
> 	long *p;
> 
> 	p = this_cpu_ptr(fbc->counters);
> 	*p += amount;
> 	if (unlikely(abs(*p) > fbc->batch))
> 		out_of_line_stuff();
> }	
> 
> generates better code.
> 
> So this:
> 
> static __always_inline void percpu_counter_add_batch(struct percpu_counter *fbc,
> 						s64 amount, long batch)
> {
> 	long *pcounter;
> 
> 	preempt_disable();
> 	pcounter = this_cpu_ptr(fbc->counters);
> 	*pcounter += amount;
> 	if (unlikely(abs(*pcounter) >= batch))
> 		percpu_counter_handle_overflow(fbc);
> 	preempt_enable();
> }
> 
> when compiling this:
> 
> --- a/lib/proportions.c~b
> +++ a/lib/proportions.c
> @@ -263,6 +263,11 @@ void __prop_inc_percpu(struct prop_descr
>  	prop_put_global(pd, pg);
>  }
>  
> +void foo(struct prop_local_percpu *pl)
> +{
> +	percpu_counter_add(&pl->events, 1);
> +}
> +
>  /*
>   * identical to __prop_inc_percpu, except that it limits this pl's fraction to
>   * @frac/PROP_FRAC_BASE by ignoring events when this limit has been exceeded.
> 
> comes down to
> 
> .globl foo
> 	.type	foo, @function
> foo:
> 	pushq	%rbp	#
> 	movslq	percpu_counter_batch(%rip),%rcx	# percpu_counter_batch, batch
> 	movq	96(%rdi), %rdx	# <variable>.counters, tcp_ptr__
> 	movq	%rsp, %rbp	#,
> #APP
> 	add %gs:this_cpu_off, %rdx	# this_cpu_off, tcp_ptr__
> #NO_APP
> 	movq	(%rdx), %rax	#* tcp_ptr__, D.11817
> 	incq	%rax	# D.11817
> 	movq	%rax, (%rdx)	# D.11817,* tcp_ptr__
> 	cqto
> 	xorq	%rdx, %rax	# tmp67, D.11817
> 	subq	%rdx, %rax	# tmp67, D.11817
> 	cmpq	%rcx, %rax	# batch, D.11817
> 	jl	.L33	#,
> 	call	percpu_counter_handle_overflow	#
> .L33:
> 	leave
> 	ret
> 
> 
> But what's really alarming is that the compiler (4.0.2) is cheerily
> ignoring the inline directives and was generating out-of-line versions
> of most of the percpu_counter.h functions into lib/proportions.s. 
> That's rather a worry.

You you have the "ignore inlining" and "compile for size" turned
on? They often suck.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ