linux-kernel - Re: [RFC PATCH 00/11] printk: safe printing in NMI context

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LNX.2.00.1406101917410.1321@pobox.suse.cz>
Date:	Tue, 10 Jun 2014 19:32:51 +0200 (CEST)
From:	Jiri Kosina <jkosina@...e.cz>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Petr Mladek <pmladek@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dave Anderson <anderson@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Kay Sievers <kay@...y.org>, Michal Hocko <mhocko@...e.cz>,
	Jan Kara <jack@...e.cz>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 00/11] printk: safe printing in NMI context

On Tue, 10 Jun 2014, Linus Torvalds wrote:

> > Lets be crazy and Cc Linus on that.
> 
> Quite frankly, I hate seeing something like this:
> 
>  kernel/printk/printk.c              | 1218 +++++++++++++++++++++++++----------
> 
> for something that is stupid and broken. Printing from NMI context
> isn't really supposed to work, and we all *know* it's not supposed to
> work.

It's OTOH rather useful in a few scenarios -- particularly it's the only 
way to dump stacktraces from remote CPUs in order to obtain traces that 
actually make sense (in situations like RCU stall); using workqueue-based 
dumping is useless there.

> I'd much rather disallow it, and if there is one or two places that
> really want to print a warning and know that they are in NMI context,
> have a special workaround just for them, with something that does
> *not* try to make printk in general work any better.

Well, that'd mean that at least our stack dumping mechanism would need to 
know both ways of printing; but yes, it'll still probably be less than 880 
lines added.

> Dammit, NMI context is special. I absolutely refuse to buy into the
> broken concept that we should make more stuff work in NMI context.
> Hell no, we should *not* try to make more crap work in NMI. NMI people
> should be careful.

In parallel, I'd for the sake of argument propose to just drop the whole 
_CONT printing (and all the things that followed on top) as that made 
printk() a complete hell to maintain for a disputable gain IMO.

> Make a trivial "printk_nmi()" wrapper that tries to do a trylock on
> logbuf_lock, and *maybe* the existing sequence of
> 
>         if (console_trylock_for_printk())
>                 console_unlock();
> 
> then works for actually triggering the printout. But the wrapper
> should be 15 lines of code for "if possible, try to print things", and
> *not* a thousand lines of changes.

Well, we are carrying much simpler fix for this whole braindamage in our 
enterprise kernel that is from pre-7ff9554bb578 era, and it was rather 
simple fix in principle (the diffstat is much larger than it had to be due 
to code movements):

	http://kernel.suse.com/cgit/kernel/commit/?h=SLE11-SP3&id=8d62ae68ff61d77ae3c4899f05dbd9c9742b14c9

But after the scary 7ff9554bb578 and its successors, things got a lot more 
complicated.

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/