linux-kernel - Re: [PATCH] x86: Avoid intermixing cpu dump_stack output on multi-processor systems

Message-ID: <20120529223923.GF32472@redhat.com>
Date:	Tue, 29 May 2012 18:39:23 -0400
From:	Don Zickus <dzickus@...hat.com>
To:	Russ Anderson <rja@....com>
Cc:	linux-kernel@...r.kernel.org, x86@...nel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	rja@...ricas.sgi.com
Subject: Re: [PATCH] x86: Avoid intermixing cpu dump_stack output on
 multi-processor systems

On Tue, May 29, 2012 at 02:19:35PM -0500, Russ Anderson wrote:
> On Tue, May 29, 2012 at 01:53:53PM -0400, Don Zickus wrote:
> > On Thu, May 24, 2012 at 09:42:29AM -0500, Russ Anderson wrote:
> > > When multiple cpus on a multi-processor system call dump_stack()
> > > at the same time, the backtrace lines get intermixed, making 
> > > the output worthless.  Add a lock so each cpu stack dump comes
> > > out as a coherent set.
> > > 
> > > For example, when a multi-processor system is NMIed, all of the
> > > cpus call dump_stack() at the same time, resulting in output for
> > > all of cpus getting intermixed, making it impossible to tell what
> > > any individual cpu was doing.  With this patch each cpu prints
> > > its stack lines as a coherent set, so one can see what each cpu
> > > was doing.
> > 
> > For this particular test case, it sounds like you are doing what
> > trigger_all_cpu_backtrace() is doing?  It doesn't solve the general
> > problem, but probably your particular usage?
> 
> In this case, I am just using the hardware NMI, which sends the NMI
> signal to each logical cpu.  Since each cpu receives the NMI at nearly
> the exact same time, they end up in dump_stack() at the same time.
> Without some form of locking, trace lines from different cpus end
> up intermixed, making it impossible to tell what any individual 
> cpu was doing.

I have forgotten the original reason for sending the NMI to each CPU
instead of just the boot CPU (commit 78c06176), but if you revert that
patch and have the NMI handler call trigger_all_cpu_backtrace() instead
(which takes a lock around each stack trace to keep the output readable),
that would solve your problem, no?  That locking is safe because it is
only taken in NMI context.

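The lock you are proposing, by contrast, can be taken from both NMI and
IRQ context, which I believe could cause deadlocks: if an NMI arrives on
a CPU that already holds the lock in IRQ context, the NMI handler spins
on a lock that the interrupted code can never release.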
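
The deadlock concern can be illustrated with a minimal userspace sketch.
It is not the patch under discussion and not kernel code; struct
dump_lock, dump_lock_try() and dump_lock_release() are hypothetical names
made up for this example.  The idea it shows is one possible way around
the problem, assuming the lock records the owning CPU rather than a bare
flag: a second acquisition from the same CPU (an NMI landing on top of an
IRQ-context holder) is reported as nested instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)

/* Hypothetical lock: stores the id of the owning CPU, or NO_OWNER. */
struct dump_lock {
    atomic_int owner;
};

enum dump_lock_result {
    DUMP_LOCK_ACQUIRED, /* lock was free; caller must release it later */
    DUMP_LOCK_NESTED,   /* this CPU already holds it; do not release */
    DUMP_LOCK_BUSY      /* another CPU holds it; caller should retry */
};

/* Single attempt to take the lock on behalf of `cpu`. */
static enum dump_lock_result dump_lock_try(struct dump_lock *l, int cpu)
{
    int expected = NO_OWNER;

    if (atomic_compare_exchange_strong(&l->owner, &expected, cpu))
        return DUMP_LOCK_ACQUIRED;

    /* On failure, `expected` now holds the current owner's id. */
    return expected == cpu ? DUMP_LOCK_NESTED : DUMP_LOCK_BUSY;
}

/* Release the lock; only the outermost holder on `cpu` may call this. */
static void dump_lock_release(struct dump_lock *l, int cpu)
{
    assert(atomic_load(&l->owner) == cpu);
    atomic_store(&l->owner, NO_OWNER);
}
```

With a plain spinlock, the second attempt from the same CPU would look
exactly like contention from another CPU, and the NMI handler would spin
on a lock held by the code it interrupted.  Distinguishing the two cases
lets the nested caller print without waiting, at the cost of possibly
interleaving with its own interrupted dump.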

Cheers,
Don