Message-ID: <20160301100131.GN3305@pathway.suse.cz>
Date:	Tue, 1 Mar 2016 11:01:31 +0100
From:	Petr Mladek <pmladek@...e.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Chris Metcalf <cmetcalf@...hip.com>,
	Russell King <linux@....linux.org.uk>,
	Thomas Gleixner <tglx@...utronix.de>,
	Aaron Tomlin <atomlin@...hat.com>,
	Ingo Molnar <mingo@...hat.com>,
	Daniel Thompson <daniel.thompson@...aro.org>, x86@...nel.org,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] improvements to the nmi_backtrace code

On Mon 2016-02-29 16:49:56, Andrew Morton wrote:
> On Mon, 29 Feb 2016 16:40:20 -0500 Chris Metcalf <cmetcalf@...hip.com> wrote:
> 
> > This patch series modifies the trigger_xxx_backtrace() NMI-based
> > remote backtracing code to make it more flexible, and makes a few
> > small improvements along the way.
> > 
> > The motivation comes from the task isolation code, where there are
> > scenarios where we want to be able to diagnose a case where some cpu
> > is about to interrupt a task-isolated cpu.  It can be helpful to
> > see both where the interrupting cpu is, and also an approximation
> > of where the cpu that is being interrupted is.  The nmi_backtrace
> > framework allows us to discover the stack of the interrupted cpu.
> > 
> > The first change adds support for trigger_single_cpu_backtrace(), and
> > as an "API side-effect", trigger_cpumask_backtrace().  The underlying
> > abstraction is changed to use cpumasks instead of a "bool except_self".
> > 
> > The second and third changes provide small improvements to the
> > behavior of the existing nmi_backtrace code: omitting full backtrace
> > dumps for idle cores, and doing local dump_stack backtraces when we
> > try to do a "remote" dump of the local core.  Some of this reflects
> > changes from integrating the arch/tile code into the generic code.
> > 
> > The fourth change hooks the arch/tile backtrace mechanism into
> > the nmi_backtrace code to share code and take advantage of other
> > improvements of nmi_backtrace not present in the original arch/tile
> > code, like co-opting printk to use local buffers instead of just
> > spewing to the console and hoping for the best.
> > 
> > The changes have been runtime tested on tile, and build-tested on
> > x86 and arm.
> 
> The patchset looks rather nice but unfortunately conflicts pretty
> significantly with Petr's "Cleaning printk stuff in NMI context"
> patchset:
> 
> http://ozlabs.org/~akpm/mmots/broken-out/printk-nmi-generic-solution-for-safe-printk-in-nmi.patch
> http://ozlabs.org/~akpm/mmots/broken-out/printk-nmi-use-irq-work-only-when-ready.patch
> http://ozlabs.org/~akpm/mmots/broken-out/printk-nmi-warn-when-some-message-has-been-lost-in-nmi-context.patch
> http://ozlabs.org/~akpm/mmots/broken-out/printk-nmi-increase-the-size-of-nmi-buffer-and-make-it-configurable.patch
> 
> Could we please have a think about what to do about this?
> 
> Petr's patchset does have a few outstanding issues (a bug reported by
> Sergey Senozhatsky and noncommittal review comments from Daniel
> Thompson) so one approach would be to merge this (Chris's) patchset
> (which looks rather more straightforward) and to ask Petr to rebase
> things on top once he gets back onto his work.

Sounds reasonable. Let's handle Chris's patchset first. I am
playing with the panic and can rebase my patchset on top when
resending it.

Best Regards,
Petr