Message-Id: <1444144919-25143-1-git-send-email-linux@rasmusvillemoes.dk>
Date: Tue, 6 Oct 2015 17:21:53 +0200
From: Rasmus Villemoes <linux@...musvillemoes.dk>
To: Rusty Russell <rusty@...tcorp.com.au>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Michael Ellerman <mpe@...erman.id.au>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: [PATCH v2 0/6] kernel/cpu.c: eliminate some indirection
v2: fix build failure on ppc, add acks.
The four cpumasks cpu_{possible,online,present,active}_bits are
exposed readonly via the corresponding const variables
cpu_xyz_mask. But they are also accessible for arbitrary writing via
the exposed functions set_cpu_xyz. There's quite a bit of code
throughout the kernel which iterates over or otherwise accesses these
bitmaps, and having the access go via the cpu_xyz_mask variables is
nowadays [1] simply a useless indirection.
It may be that any problem in CS can be solved by an extra level of
indirection, but that doesn't mean every extra indirection solves a
problem. In this case, it even necessitates some minor ugliness (see
4/6).
Patch 1/6 is new in v2, and fixes a build failure on ppc by renaming a
struct member, to avoid problems when the identifier cpu_online_mask
becomes a macro later in the series. The next four patches eliminate
the cpu_xyz_mask variables by simply exposing the actual bitmaps,
after renaming them to discourage direct access. Access still goes
through cpu_xyz_mask, which are now simply macros with the same type
and value as the old variables.
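
To make the before/after concrete, here is a minimal userspace sketch
for one of the four masks. It is an illustration under stated
assumptions, not the patch itself: NR_CPUS is fixed at 64, struct
cpumask is reduced to a bare array of longs, and the names
old_cpu_online_bits, old_cpu_online_mask and mask_test_cpu are
invented here. Only the __cpu_*_mask naming and the idea of a macro
with the same type and value come from the series.

```c
#include <limits.h>

/* Assumption: small fixed CPU count; the kernel takes this from its config. */
#define NR_CPUS 64
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Simplified stand-in for the kernel's struct cpumask. */
struct cpumask {
	unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
};

/*
 * Before (sketch): a private bitmap plus a const pointer variable.
 * Code in other translation units has to load the pointer from memory
 * before it can touch the bitmap.
 */
static struct cpumask old_cpu_online_bits;
const struct cpumask *const old_cpu_online_mask = &old_cpu_online_bits;

/*
 * After (sketch): the bitmap itself is visible under a name that
 * discourages direct use, and cpu_online_mask is a macro that yields
 * the same const pointer type. Its value is a link-time constant.
 */
struct cpumask __cpu_online_mask;
#define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask)

/* Hypothetical read-side helper: returns 1 if cpu is set in mask, else 0. */
static int mask_test_cpu(const struct cpumask *mask, unsigned int cpu)
{
	return (mask->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}
```

Callers written against cpu_online_mask need no source changes in
either variant, since the expression still has type
const struct cpumask *.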
After that, there's no longer any reason for the setter functions to
be out-of-line: the boolean parameter is almost always a literal true
or false, so as static inlines they will usually compile to one or
two instructions.
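
A rough sketch of what such an inline setter could look like, again
in simplified userspace C. Assumptions: NR_CPUS is 64, struct cpumask
is a bare array of longs, and the bit helpers use plain non-atomic
stores, whereas kernel code would use its own atomic bitops. The
helper bodies are illustrative and not copied from the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Assumption: small fixed CPU count for illustration. */
#define NR_CPUS 64
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct cpumask {
	unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
};

/* The bitmap is visible to callers, so the setter can live in a header. */
struct cpumask __cpu_online_mask;

/* Non-atomic stand-ins for the kernel's set/clear helpers. */
static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *mask)
{
	mask->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

static inline void cpumask_clear_cpu(unsigned int cpu, struct cpumask *mask)
{
	mask->bits[cpu / BITS_PER_LONG] &= ~(1UL << (cpu % BITS_PER_LONG));
}

/*
 * With a literal true or false as the second argument, the compiler
 * can drop the untaken branch at the call site and keep only the
 * single bit operation.
 */
static inline void set_cpu_online(unsigned int cpu, bool online)
{
	if (online)
		cpumask_set_cpu(cpu, &__cpu_online_mask);
	else
		cpumask_clear_cpu(cpu, &__cpu_online_mask);
}
```

A call such as set_cpu_online(cpu, true) then needs neither a
function call nor a runtime test of the boolean.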
For a defconfig build on x86_64, bloat-o-meter says we save ~3000
bytes. We also save a little stack (stackdelta says 127 functions have
a 16-byte smaller stack frame, while two grow by that amount). This is
mostly because, when iterating over the mask, gcc typically loads the
value of cpu_xyz_mask into a callee-saved register and from there into
%rdi before each find_next_bit call; now it can just load the
appropriate immediate address into %rdi before each call.
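
The kind of loop in question can be sketched as follows. This is a
userspace approximation: find_next_bit_sketch is a naive stand-in
written for this example, not the kernel's find_next_bit, and
count_online_cpus is a hypothetical caller. NR_CPUS and the struct
layout are assumptions as well.

```c
#include <limits.h>

/* Assumption: small fixed CPU count for illustration. */
#define NR_CPUS 64
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

struct cpumask {
	unsigned long bits[BITS_TO_LONGS(NR_CPUS)];
};

struct cpumask __cpu_online_mask;
#define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask)

/* Naive stand-in: index of the first set bit >= start, or size if none. */
static unsigned long find_next_bit_sketch(const unsigned long *addr,
					  unsigned long size,
					  unsigned long start)
{
	for (unsigned long i = start; i < size; i++)
		if (addr[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG)))
			return i;
	return size;
}

/*
 * Hypothetical caller: the first argument of every search call is the
 * address of a global object, so it is a constant and does not have
 * to be kept in a register across iterations.
 */
static unsigned int count_online_cpus(void)
{
	unsigned int n = 0;

	for (unsigned long cpu = find_next_bit_sketch(cpu_online_mask->bits, NR_CPUS, 0);
	     cpu < NR_CPUS;
	     cpu = find_next_bit_sketch(cpu_online_mask->bits, NR_CPUS, cpu + 1))
		n++;
	return n;
}
```

Whether a particular compiler produces exactly the register usage
described above depends on its version and flags; the sketch only
shows the shape of the source.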
[1] See Rusty's kind explanation
http://thread.gmane.org/gmane.linux.kernel/2047078/focus=2047722 for
some historic context.
Rasmus Villemoes (6):
  powerpc/fadump: rename cpu_online_mask member of struct
    fadump_crash_info_header
  kernel/cpu.c: change type of cpu_possible_bits and friends
  kernel/cpu.c: export __cpu_*_mask
  drivers/base/cpu.c: use __cpu_*_mask directly
  kernel/cpu.c: eliminate cpu_*_mask
  kernel/cpu.c: make set_cpu_* static inlines

 arch/powerpc/include/asm/fadump.h |  2 +-
 arch/powerpc/kernel/fadump.c      |  4 +--
 drivers/base/cpu.c                | 10 +++---
 include/linux/cpumask.h           | 55 ++++++++++++++++++++++++++++-----
 kernel/cpu.c                      | 64 ++++++++-------------------------------
 5 files changed, 68 insertions(+), 67 deletions(-)
--
2.1.3