[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140822072023.GA7218@gmail.com>
Date: Fri, 22 Aug 2014 09:20:23 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Dave Hansen <dave@...1.net>
Cc: linux-kernel@...r.kernel.org, dave.hansen@...ux.intel.com,
peterz@...radead.org, mingo@...hat.com, ak@...ux.intel.com,
tim.c.chen@...ux.intel.com, akpm@...ux-foundation.org,
cl@...ux.com, penberg@...nel.org, linux-mm@...ck.org,
kirill@...temov.name, lauraa@...eaurora.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] [v3] warn on performance-impacting configs aka.
TAINT_PERFORMANCE
* Dave Hansen <dave@...1.net> wrote:
> From: Dave Hansen <dave.hansen@...ux.intel.com>
>
> Changes from v2:
> * remove tainting and stack track
> * add debugfs file
> * added a little text to guide folks who want to add more
> options
Looks good to me conceptually.
A couple of minor details:
> Changes from v1:
> * remove schedstats
> * add DEBUG_PAGEALLOC and SLUB_DEBUG_ON
>
> --
>
> I have more than once myself been the victim of an accidentally-
> enabled kernel config option being mistaken for a true
> performance problem.
>
> I'm sure I've also taken profiles or performance measurements
> and assumed they were real-world when really I was measuing the
s/measuring
> performance with an option that nobody turns on in production.
>
> A warning like this late in boot will help remind folks when
> these kinds of things are enabled. We can also teach tooling to
> look for and capture /sys/kernel/debug/config_debug .
>
> As for the patch...
>
> I originally wanted this for CONFIG_DEBUG_VM, but I think it also
> applies to things like lockdep and slab debugging. See the patch
> for the list of offending config options. I'm open to adding
> more, but this seemed like a good list to start.
>
> The compiler is smart enough to really trim down the code when
> the array is empty. An objdump -d looks like this:
>
> lib/perf-configs.o: file format elf64-x86-64
>
> Disassembly of section .init.text:
>
> 0000000000000000 <performance_taint>:
> 0: 55 push %rbp
> 1: 31 c0 xor %eax,%eax
> 3: 48 89 e5 mov %rsp,%rbp
> 6: 5d pop %rbp
> 7: c3 retq
So I guess the _taint bit is obsolete now?
> This could be done with Kconfig and an #ifdef to save us 8 bytes
> of text and the entry in the late_initcall() section. Doing it
> this way lets us keep the list of these things in one spot, and
> also gives us a convenient way to dump out the name of the
> offending option.
>
> For anybody that *really* cares, I put the whole thing under
> CONFIG_DEBUG_KERNEL in the Makefile.
>
> The messages look like this:
>
> [ 3.865297] WARNING: Do not use this kernel for performance measurement.
I'd warn this way:
[ 3.865297] INFO: Be careful when using this kernel for performance measurement.
> [ 3.868776] WARNING: Potentially performance-altering options:
> [ 3.871558] CONFIG_LOCKDEP enabled
> [ 3.873326] CONFIG_SLUB_DEBUG_ON enabled
And here I'd print this the following way:
> [ 3.868776] INFO: Potentially performance-altering options:
> [ 3.871558] CONFIG_LOCKDEP=y
> [ 3.873326] CONFIG_SLUB_DEBUG_ON=y
The 'INFO:' prefix is less pushy, and the '=y' notation is what
people will look for in the .config anyway, so lets keep to
that?
(Btw., you probably want to check CONFIG_PROVE_LOCKING=y, not
CONFIG_LOCKDEP - the former is the user configurable one.)
>
> Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: ak@...ux.intel.com
> Cc: tim.c.chen@...ux.intel.com
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Christoph Lameter <cl@...ux.com>
> Cc: Pekka Enberg <penberg@...nel.org>
> Cc: linux-kernel@...r.kernel.org
> Cc: linux-mm@...ck.org
> Cc: kirill@...temov.name
> Cc: lauraa@...eaurora.org
> ---
>
> b/include/linux/kernel.h | 1
> b/kernel/panic.c | 1
> b/lib/Makefile | 1
> b/lib/perf-configs.c | 114 +++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 117 insertions(+)
>
> diff -puN include/linux/kernel.h~taint-performance include/linux/kernel.h
> --- a/include/linux/kernel.h~taint-performance 2014-08-19 11:38:07.424005355 -0700
> +++ b/include/linux/kernel.h 2014-08-19 11:38:20.960615904 -0700
> @@ -471,6 +471,7 @@ extern enum system_states {
> #define TAINT_OOT_MODULE 12
> #define TAINT_UNSIGNED_MODULE 13
> #define TAINT_SOFTLOCKUP 14
> +#define TAINT_PERFORMANCE 15
That's unnecessary now.
>
> extern const char hex_asc[];
> #define hex_asc_lo(x) hex_asc[((x) & 0x0f)]
> diff -puN kernel/panic.c~taint-performance kernel/panic.c
> --- a/kernel/panic.c~taint-performance 2014-08-19 11:38:28.928975233 -0700
> +++ b/kernel/panic.c 2014-08-20 09:56:29.528471033 -0700
> @@ -225,6 +225,7 @@ static const struct tnt tnts[] = {
> { TAINT_OOT_MODULE, 'O', ' ' },
> { TAINT_UNSIGNED_MODULE, 'E', ' ' },
> { TAINT_SOFTLOCKUP, 'L', ' ' },
> + { TAINT_PERFORMANCE, 'Q', ' ' },
Ditto.
> };
>
> /**
> diff -puN /dev/null lib/perf-configs.c
> --- /dev/null 2014-04-10 11:28:14.066815724 -0700
> +++ b/lib/perf-configs.c 2014-08-21 13:22:25.586598278 -0700
> @@ -0,0 +1,114 @@
> +#include <linux/bug.h>
> +#include <linux/debugfs.h>
> +#include <linux/gfp.h>
> +#include <linux/kernel.h>
> +#include <linux/slab.h>
> +
> +/*
> + * This should list any kernel options that can substantially
> + * affect performance. This is intended to give a loud
> + * warning during bootup so that folks have a fighting chance
> + * of noticing these things.
> + *
> + * This is fairly subjective, but a good rule of thumb for these
> + * is: if it is enabled widely in production, then it does not
> + * belong here. If a major enterprise kernel enables a feature
> + * for a non-debug kernel, it _really_ does not belong.
> + */
> +static const char * const perfomance_killing_configs[] = {
> +#ifdef CONFIG_LOCKDEP
> + "LOCKDEP",
> +#endif
> +#ifdef CONFIG_LOCK_STAT
> + "LOCK_STAT",
> +#endif
> +#ifdef CONFIG_DEBUG_VM
> + "DEBUG_VM",
> +#endif
> +#ifdef CONFIG_DEBUG_VM_VMACACHE
> + "DEBUG_VM_VMACACHE",
> +#endif
> +#ifdef CONFIG_DEBUG_VM_RB
> + "DEBUG_VM_RB",
> +#endif
> +#ifdef CONFIG_DEBUG_SLAB
> + "DEBUG_SLAB",
> +#endif
> +#ifdef CONFIG_SLUB_DEBUG_ON
> + "SLUB_DEBUG_ON",
> +#endif
> +#ifdef CONFIG_DEBUG_OBJECTS_FREE
> + "DEBUG_OBJECTS_FREE",
> +#endif
Essentially all DEBUG_OBJECTS_* options are expensive, assuming
they are enabled, i.e. DEBUG_OBJECTS_ENABLE_DEFAULT=y.
Otherwise they should only be warned about if the debugobjects
boot option got enabled.
I.e. you'll need a bit of a runtime check for this one.
> +#ifdef CONFIG_DEBUG_KMEMLEAK
> + "DEBUG_KMEMLEAK",
> +#endif
> +#ifdef CONFIG_DEBUG_PAGEALLOC
> + "DEBUG_PAGEALLOC",
> +#endif
I'd also add KMEMCHECK.
> +};
> +
> +static const char config_prefix[] = "CONFIG_";
> +/*
> + * Dump out the list of the offending config options to a file
> + * in debugfs so that tooling can look for and capture it.
> + */
> +static ssize_t performance_taint_read(struct file *file, char __user *user_buf,
> + size_t count, loff_t *ppos)
> +{
> + int i;
> + int ret;
> + char *buf;
> + size_t buf_written = 0;
> + size_t buf_left;
> + size_t buf_len;
> +
> + if (!ARRAY_SIZE(perfomance_killing_configs))
> + return 0;
> +
> + buf_len = 1;
> + for (i = 0; i < ARRAY_SIZE(perfomance_killing_configs); i++)
> + buf_len += strlen(config_prefix) +
> + strlen(perfomance_killing_configs[i]);
> + /* Add a byte for for each entry in the array for a \n */
> + buf_len += ARRAY_SIZE(perfomance_killing_configs);
> +
> + buf = kmalloc(buf_len, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
> + buf_left = buf_len;
> + for (i = 0; i < ARRAY_SIZE(perfomance_killing_configs); i++) {
> + buf_written += snprintf(buf + buf_written, buf_left,
> + "%s%s\n", config_prefix,
> + perfomance_killing_configs[i]);
> + buf_left = buf_len - buf_written;
So, ARRAY_SIZE(performance_killing_configs) is written out four
times, a temporary variable would be in order I suspect.
Also, do you want to check buf_left and break out early from
the loop if it goes non-positive?
> + }
> + ret = simple_read_from_buffer(user_buf, buf_written, ppos, buf, buf_len);
> + kfree(buf);
> + return ret;
> +}
> +
> +static const struct file_operations fops_perf_taint = {
> + .read = performance_taint_read,
> + .llseek = default_llseek,
> +};
> +
> +static int __init performance_taint(void)
I'd not name this 'taint' anymore, but check_configs() or so.
> +{
> + int i;
> +
> + if (!ARRAY_SIZE(perfomance_killing_configs))
> + return 0;
and: s/perfomance/performance across the whole file.
> +
> + pr_warn("WARNING: Do not use this kernel for performance measurement.\n");
> + pr_warn("WARNING: Potentially performance-altering options:\n");
> + for (i = 0; i < ARRAY_SIZE(perfomance_killing_configs); i++) {
> + pr_warn("\t%s%s enabled\n", config_prefix,
> + perfomance_killing_configs[i]);
> + }
> + debugfs_create_file("config_debug", S_IRUSR | S_IWUSR,
> + NULL, NULL, &fops_perf_taint);
> + return 0;
> +}
> +late_initcall(performance_taint);
> diff -puN lib/Makefile~taint-performance lib/Makefile
> --- a/lib/Makefile~taint-performance 2014-08-20 11:02:54.130548350 -0700
> +++ b/lib/Makefile 2014-08-20 11:06:18.231744868 -0700
> @@ -54,6 +54,7 @@ obj-$(CONFIG_GENERIC_HWEIGHT) += hweight
> obj-$(CONFIG_BTREE) += btree.o
> obj-$(CONFIG_INTERVAL_TREE) += interval_tree.o
> obj-$(CONFIG_ASSOCIATIVE_ARRAY) += assoc_array.o
> +obj-$(CONFIG_DEBUG_KERNEL) += perf-configs.o
Please don't name it perf-*.c, that confuses it with perf
events.
Maybe name it check-configs.c or so?
Thanks,
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists