linux-kernel - Re: [RFC PATCH 00/11] printk: safe printing in NMI context

Message-ID: <20140529000909.GC6507@localhost.localdomain>
Date:	Thu, 29 May 2014 02:09:11 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Jiri Kosina <jkosina@...e.cz>
Cc:	Petr Mladek <pmladek@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Dave Anderson <anderson@...hat.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Kay Sievers <kay@...y.org>, Michal Hocko <mhocko@...e.cz>,
	Jan Kara <jack@...e.cz>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 00/11] printk: safe printing in NMI context

On Thu, May 29, 2014 at 12:02:30AM +0200, Jiri Kosina wrote:
> On Fri, 9 May 2014, Petr Mladek wrote:
> 
> > printk() cannot be used safely in NMI context because it uses internal locks
> > and thus could cause a deadlock. Unfortunately there are circumstances when
> > calling printk from NMI is very useful. For example, all WARN.*(in_nmi())
> > would be much more helpful if they didn't lockup the machine.
> > 
> > Another example would be arch_trigger_all_cpu_backtrace for x86 which uses NMI
> > to dump traces on all CPU (either triggered by sysrq+l or from RCU stall
> > detector).
> 
> I am rather surprised that this patchset hasn't received a single review 
> comment for 3 weeks.
> 
> Let me point out that the issues Petr is talking about in the cover letter 
> are real -- we've actually seen the lockups triggered by RCU stall 
> detector trying to dump stacks on all CPUs, and hard-locking machine up 
> while doing so.
> 
> So this really needs to be solved.

The lack of review may be partly due to an unappealing diffstat on an old
codebase that is already unpopular:

 Documentation/kernel-parameters.txt |   19 +-
 kernel/printk/printk.c              | 1218 +++++++++++++++++++++++++----------
 2 files changed, 878 insertions(+), 359 deletions(-)


Your patches actually look clean and pretty nice. They must be seriously
considered if we want to keep the current locked ring buffer design and extend
it to multiple per-context buffers. But I wonder whether it's worth continuing
down that path with printk's ancient design.

If it takes more than 1000 changed lines (including 500 added) to finally make
it work correctly with NMIs by working around its fundamental flaws, shouldn't
we rather redesign it to use a lockless ring buffer like the ftrace or perf ones?
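
To illustrate what "lockless" means here, below is a minimal user-space sketch,
not the ftrace or perf implementation and not a proposed patch. Every name in
it (struct ring, ring_reserve, ring_write, ring_read, RING_SIZE, MSG_MAX) is
hypothetical. A writer claims space with a single atomic add on the head
offset, so a writer that interrupts another writer (as an NMI would) never
waits on a lock. The sketch deliberately leaves out what a real design also
needs: a commit marker so readers skip half-written records, and handling of
old records being overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 4096u  /* hypothetical size; must be a power of two */
#define MSG_MAX   120u   /* hypothetical cap on a single message */

static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
              "RING_SIZE must be a power of two");

struct ring {
    atomic_uint head;        /* next free byte offset; only ever grows */
    char data[RING_SIZE];
};

/* Claim len bytes with one atomic add; no lock is taken, so this cannot
 * deadlock against a writer that was interrupted mid-message. */
static unsigned int ring_reserve(struct ring *r, unsigned int len)
{
    return atomic_fetch_add_explicit(&r->head, len, memory_order_relaxed);
}

/* Copy msg (truncated to MSG_MAX) plus a terminating NUL into the ring,
 * wrapping at the end. Returns the offset the record was written at. */
static unsigned int ring_write(struct ring *r, const char *msg)
{
    size_t n = strlen(msg);
    unsigned int len = (unsigned int)(n > MSG_MAX ? MSG_MAX : n);
    unsigned int off = ring_reserve(r, len + 1);

    for (unsigned int i = 0; i < len; i++)
        r->data[(off + i) & (RING_SIZE - 1)] = msg[i];
    r->data[(off + len) & (RING_SIZE - 1)] = '\0';
    return off;
}

/* Copy the NUL-terminated record at off into out (at most outlen bytes). */
static void ring_read(const struct ring *r, unsigned int off,
                      char *out, size_t outlen)
{
    size_t i = 0;

    if (outlen == 0)
        return;
    while (i + 1 < outlen) {
        char c = r->data[(off + i) & (RING_SIZE - 1)];
        if (c == '\0')
            break;
        out[i++] = c;
    }
    out[i] = '\0';
}
```

Because each writer gets a disjoint byte range from the atomic add, two writers
never scribble over each other's space as long as the buffer does not wrap onto
a record still being written; that remaining case is exactly what the omitted
commit and overwrite handling would have to cover.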