Message-Id: <20101021185516.be13a83f.akpm@linux-foundation.org>
Date: Thu, 21 Oct 2010 18:55:16 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Christoph Lameter <cl@...ux.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Nick Piggin <npiggin@...nel.dk>,
Dave Chinner <david@...morbit.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] percpu_counter : add percpu_counter_add_fast()
On Thu, 21 Oct 2010 17:45:36 -0700 Andrew Morton <akpm@...ux-foundation.org> wrote:
> this_cpu_add_return() isn't really needed in this application.
>
> {
> 	this_cpu_add(*fbc->counters, amount);
> 	if (unlikely(abs(this_cpu_read(*fbc->counters)) > fbc->batch))
> 		out_of_line_stuff();
> }
>
> will work just fine.
Did that. Got alarmed at a few things.
The compiler cannot CSE the above code - it has to reload the percpu
base each time. Doing it by hand:
{
	long *p;

	p = this_cpu_ptr(fbc->counters);
	*p += amount;
	if (unlikely(abs(*p) > fbc->batch))
		out_of_line_stuff();
}
generates better code.
So this:
static __always_inline void percpu_counter_add_batch(struct percpu_counter *fbc,
						     s64 amount, long batch)
{
	long *pcounter;

	preempt_disable();
	pcounter = this_cpu_ptr(fbc->counters);
	*pcounter += amount;
	if (unlikely(abs(*pcounter) >= batch))
		percpu_counter_handle_overflow(fbc);
	preempt_enable();
}
when compiling this:
--- a/lib/proportions.c~b
+++ a/lib/proportions.c
@@ -263,6 +263,11 @@ void __prop_inc_percpu(struct prop_descr
 	prop_put_global(pd, pg);
 }
 
+void foo(struct prop_local_percpu *pl)
+{
+	percpu_counter_add(&pl->events, 1);
+}
+
 /*
  * identical to __prop_inc_percpu, except that it limits this pl's fraction to
  * @frac/PROP_FRAC_BASE by ignoring events when this limit has been exceeded.
comes down to
	.globl foo
	.type	foo, @function
foo:
	pushq	%rbp			#
	movslq	percpu_counter_batch(%rip),%rcx	# percpu_counter_batch, batch
	movq	96(%rdi), %rdx		# <variable>.counters, tcp_ptr__
	movq	%rsp, %rbp		#,
#APP
	add	%gs:this_cpu_off, %rdx	# this_cpu_off, tcp_ptr__
#NO_APP
	movq	(%rdx), %rax		#* tcp_ptr__, D.11817
	incq	%rax			# D.11817
	movq	%rax, (%rdx)		# D.11817,* tcp_ptr__
	cqto
	xorq	%rdx, %rax		# tmp67, D.11817
	subq	%rdx, %rax		# tmp67, D.11817
	cmpq	%rcx, %rax		# batch, D.11817
	jl	.L33			#,
	call	percpu_counter_handle_overflow	#
.L33:
	leave
	ret
But what's really alarming is that the compiler (4.0.2) cheerily
ignores the inline directives and generates out-of-line versions
of most of the percpu_counter.h functions in lib/proportions.s.
That's rather a worry.
lib/proportions.o got rather larger as a result of inlining things and
it's not obvious that it's all a net benefit.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/