[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130912115155.GV31370@twins.programming.kicks-ass.net>
Date: Thu, 12 Sep 2013 13:51:55 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Ingo Molnar <mingo@...nel.org>, Andi Kleen <ak@...ux.intel.com>,
Peter Anvin <hpa@...or.com>,
Mike Galbraith <bitbucket@...ine.de>,
Thomas Gleixner <tglx@...utronix.de>,
Arjan van de Ven <arjan@...ux.intel.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: [PATCH 0/7] preempt_count rework -v2
On Wed, Sep 11, 2013 at 07:43:42PM -0700, Linus Torvalds wrote:
> On Wed, Sep 11, 2013 at 7:20 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > I split the thing up into two macros GEN_UNARY_RMWcc and
> > GEN_BINARY_RMWcc which ends up being more readable as well as smaller
> > code overall.
>
> Yes, that looks like the right abstraction level. Powerful without
> being complicated.
> That said, the very same memory clobber may be what makes this whole
> approach a loss, if it causes gcc to do tons of reloads or other
> random things.
Random things below, not sure why
> For other cases? Who knows.. A lot of the "change and test atomically"
> things have the same containment need, so it might not be a problem.
Right, all atomic ops already contain a memory clobber to go along with
their memory barrier semantics.
So I did a defconfig build without the patch and with the patch and got
a significant size increase:
text data bss filename
11173443 1423736 1183744 defconfig-build/vmlinux.before
11174516 1423736 1183744 defconfig-build/vmlinux.after
I then undid all the bitop conversions that added the extra memory
clobber and got:
11174204 1423736 1183744 defconfig-build/vmlinux.after
Which is still a significant increase, so I had a look at what GCC
generates, for mm/rmap.c, which uses quite a few atomic_*_and_test(),
functions I got:
text data bss dec hex filename
8675 1 16 8692 21f4 defconfig-build/mm/rmap.o
8739 1 16 8756 2234 defconfig-build/mm/rmap.o
So the increase is there too, doing a objdump -D on them the first
difference is:
0000000000000660 <do_page_add_anon_rmap>:
660: 55 push %rbp
661: 48 89 e5 mov %rsp,%rbp
664: 48 83 ec 20 sub $0x20,%rsp
668: 48 89 5d f0 mov %rbx,-0x10(%rbp)
66c: 4c 89 65 f8 mov %r12,-0x8(%rbp)
670: 48 89 fb mov %rdi,%rbx
673: f0 ff 47 18 lock incl 0x18(%rdi)
677: 0f 94 c0 sete %al
67a: 84 c0 test %al,%al
67c: 75 12 jne 690 <do_page_add_anon_rmap+0x30>
67e: 48 8b 5d f0 mov -0x10(%rbp),%rbx
682: 4c 8b 65 f8 mov -0x8(%rbp),%r12
686: c9 leaveq
vs.:
0000000000000660 <do_page_add_anon_rmap>:
660: 55 push %rbp
661: 48 89 e5 mov %rsp,%rbp
664: 48 83 ec 20 sub $0x20,%rsp
668: 48 89 5d e0 mov %rbx,-0x20(%rbp)
66c: 4c 89 65 e8 mov %r12,-0x18(%rbp)
670: 48 89 fb mov %rdi,%rbx
673: 4c 89 6d f0 mov %r13,-0x10(%rbp)
677: 4c 89 75 f8 mov %r14,-0x8(%rbp)
67b: f0 ff 47 18 lock incl 0x18(%rdi)
67f: 74 17 je 698 <do_page_add_anon_rmap+0x38>
681: 48 8b 5d e0 mov -0x20(%rbp),%rbx
685: 4c 8b 65 e8 mov -0x18(%rbp),%r12
689: 4c 8b 6d f0 mov -0x10(%rbp),%r13
68d: 4c 8b 75 f8 mov -0x8(%rbp),%r14
691: c9 leaveq
For some obscure (to me) reason the new fangled asm goto construct
generates a bunch of extra MOVs.
OTOH the good news is that a kernel with that patch applied does indeed
boot properly on real hardware.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists