linux-kernel - Re: [PATCH v6 3/4] printk: Userspace format indexing support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YLZPJpTC6YpJwAJk@chrisdown.name>
Date:   Tue, 1 Jun 2021 16:15:50 +0100
From:   Chris Down <chris@...isdown.name>
To:     Petr Mladek <pmladek@...e.com>
Cc:     linux-kernel@...r.kernel.org, Jessica Yu <jeyu@...nel.org>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        John Ogness <john.ogness@...utronix.de>,
        Steven Rostedt <rostedt@...dmis.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Kees Cook <keescook@...omium.org>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>, kernel-team@...com
Subject: Re: [PATCH v6 3/4] printk: Userspace format indexing support

Hey Petr,

Thanks for the feedback -- apologies for the delay in getting back.

Anything not explicitly replied to here is acked and will be fixed for v7.

Petr Mladek writes:
>Are you aware of any subsystem having some extra post_fmt, please?

Good thinking -- I just checked and I didn't find any, so I guess we can add 
that later if needed and do without it for now. :-)

>Well, the main question is whether we need to store this
>at all. Does it bring any useful information?
>
>Note that we do not store the value defined by pr_fmt() which
>is yet another automatically added prefix.

Hmm, we do actually store pr_fmt, since it's known at compile time (or we 
wouldn't store it at all if it was dynamic due to the __builtin_constant_p 
check). Or are you seeing something different?

>I would personally omit these prefixes. The most important
>information is:
>
>  + fmt: is the pattern that the system monitors would most likely
>	look for.
>
>  + level: says whether the string will appear on console that
>      shows only messages below a certain console_loglevel
>
>  + func, file, line: help to find the string in the kernel sources.
>      It is useful when investigating what has changed.
>
>For example, pre_fmt="%s %s:" used by dev_printk (4th patch)
>is not much useful.

In this case it's not super useful, but for cases where you can put something 
more useful (like "BTRFS: ...") it can make matching significantly less 
ambiguous.

Having this information means that system monitors know that they can anchor 
the matching at both ends when linking a printk format with some output, rather 
than having to look for a substring (in which case things become more difficult 
to determine).

I agree that in this particular case with dev_printk it's not incredibly 
useful, but there are certainly others we may want to have in future (like the 
btrfs case) where it would certainly help disambiguate. :-)

>
>> +} __packed;
>> +
>> +#define __printk_index_emit(_fmt, _level, _pre_fmt, _post_fmt)		       \
>> +	({								       \
>> +		if (__builtin_constant_p(_fmt) && __builtin_constant_p(_level)) { \
>> +			/*
>> +			 * The compiler may not be able to eliminate the
>> +			 * non-constant variants of _fmt and _level, so we need
>> +			 * to make sure that it doesn't see any hypothetical
>> +			 * assignment for non-constants even though this is
>> +			 * already inside the __builtin_constant_p guard.
>> +			 */						       \
>> +			static const struct pi_entry _entry		       \
>> +			__used = {		       \
>> +				.fmt = __builtin_constant_p(_fmt) ? (_fmt) : NULL, \
>> +				.func = __func__,			       \
>> +				.file = __FILE__,			       \
>> +				.line = __LINE__,			       \
>> +				.level = __builtin_constant_p(_level) ? (_level) : NULL, \
>> +				.pre_fmt = _pre_fmt,			       \
>
>Should this also be?
>
>	.pre_fmt = __builtin_constant_p(_pre_fmt) ? _pre_fmt : NULL
>

Hmm, if pre_fmt for a subsystem isn't known at compile time, it's an error on 
the developer's end -- otherwise things can't be stored anyway and there's no 
point.

>> + * pre and post must be known at compile time, or compilation will fail (since
>> + * this is a mistake). If fmt or level is not known at compile time, no index
>> + * entry will be made (since this can legitimately happen).
>
>How is this achieved, please?
>
>__printk_index_emit() creates the entry when the following check is true:
>
>       if (__builtin_constant_p(_fmt) && __builtin_constant_p(_level))
>
>It checks neither _pre_fmt nor _post_fmt.

Maybe my phrasing wasn't clear, but compilation would fail in that case because 
`.pre_fmt = _pre_fmt` will be a non-constant static initialiser, no?

>> +	if (!entry->fmt)
>> +		return 0;
>
>Is this just a paranoid check or is it a valid case?

It happens only for the following printk:

	arch/x86/mm/fault.c:534 show_fault_oops "BUG: kernel NULL pointer dereference, address: %px\n"

I must confess, I'm not entirely sure why, although it's ironic/interesting 
that it happens on _the_ NULL-related deref printk...

>Is is possible to update __printk_index_emit() to do not create
>entries with fmt = NULL ?
>
>We should either remove the above check or add a comment
>explaining why it is necessary.

The trick is unfortunately necessary for the reason described in the comment 
above the double check:

     /*
      * The compiler may not be able to eliminate the
      * non-constant variants of _fmt and _level, so we need
      * to make sure that it doesn't see any hypothetical
      * assignment for non-constants even though this is
      * already inside the __builtin_constant_p guard.
      */

Happy to add another comment if it helps.
>
>
>> +
>> +	if (entry->level)
>> +		printk_parse_prefix(entry->level, &level, &flags);
>> +	else
>> +		prefix_len = printk_parse_prefix(entry->fmt, &level, &flags);
>
>This is missing:
>
>	if (level == LOGLEVEL_DEFAULT)
>		level = default_message_loglevel;
>
>Without it, it produces lines with loglevel <-1>, for example:
>
><-1> init/do_mounts.c:457 mount_block_root "\n"
><-1> init/do_mounts.c:456 mount_block_root " %s"
><-1> init/do_mounts.c:454 mount_block_root "No filesystem could mount root, tried: "

Hmm, from the perspective of someone building this printk-checking 
infrastructure, I want -1 here instead of dynamically changing the output to 
default_message_loglevel, because default_message_loglevel may not be the same 
across systems. For example, having "-1" allows people to know these will come 
out at whatever the default message loglevel is, rather than necessarily being 
always at (say) level 4 or whatever it happens to be set at on the running 
system.

Without that, it's also not possible to compare entries across systems or 
across time, because the default loglevel may dynamically change at runtime, 
and it's not possible to disambiguate "will always be KERN_X" from "happens to 
be KERN_X" on this system.

And if someone really does want to know what value it will be right now, they 
can just look at /proc/sys/kernel/printk and change it out as desired.

>> +
>> +	seq_printf(s, "<%d%s> %s:%d %s \"",
>> +			level, flags & LOG_CONT ? ",c" : "", entry->file,
>> +			entry->line, entry->func);
>
>It produces the following for continuous lines:
>
><-1,c> arch/x86/events/core.c:2101 init_hw_perf_events "%s PMU driver.\n"
><-1,c> arch/x86/events/core.c:2091 init_hw_perf_events "no PMU driver, software events only.\n"
>
>But we should print the loglevel only when explicitly defined.
>So I would do it the following way:
>
>	if (flags & LOG_CONT) {
>		if (level == LOGLEVEL_DEFAULT)
>			seq_printf(s, "<c>");
>		else
>			seq_printf(s, "<%d,c>", level);
>	} else {
>		if (level == LOGLEVEL_DEFAULT)
>			level = default_message_loglevel;
>		seq_printf(s, "<%d>", level);
>	}
>
>	seq_printf(s, " %s:%d %s \"", entry->file, entry->line, entry->func);

Ah yes, that's a decent point, since we don't treat this as meaning 
LOGLEVEL_DEFAULT. I'll have a think about how to make it better for v7.

>I am really happy with the progress. The remaining problems are mostly
>with the support for the subsystem-specific printk() callers that was
>added in this revision.

That's great! I really appreciate your time and your diligent review, you've 
definitely helped improve this patchset for the better :-)

Thanks,

Chris