Message-ID: <20120529223923.GF32472@redhat.com>
Date: Tue, 29 May 2012 18:39:23 -0400
From: Don Zickus <dzickus@...hat.com>
To: Russ Anderson <rja@....com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
rja@...ricas.sgi.com
Subject: Re: [PATCH] x86: Avoid intermixing cpu dump_stack output on
multi-processor systems
On Tue, May 29, 2012 at 02:19:35PM -0500, Russ Anderson wrote:
> On Tue, May 29, 2012 at 01:53:53PM -0400, Don Zickus wrote:
> > On Thu, May 24, 2012 at 09:42:29AM -0500, Russ Anderson wrote:
> > > When multiple cpus on a multi-processor system call dump_stack()
> > > at the same time, the backtrace lines get intermixed, making
> > > the output worthless. Add a lock so each cpu stack dump comes
> > > out as a coherent set.
> > >
> > > For example, when a multi-processor system is NMIed, all of the
> > > cpus call dump_stack() at the same time, resulting in output for
> > > all of cpus getting intermixed, making it impossible to tell what
> > > any individual cpu was doing. With this patch each cpu prints
> > > its stack lines as a coherent set, so one can see what each cpu
> > > was doing.
> >
> > For this particular test case, it sounds like you are doing what
> > trigger_all_cpu_backtrace() is doing? It doesn't solve the general
> > problem, but probably your particular usage?
>
> In this case, I am just using the hardware NMI, which sends the NMI
> signal to each logical cpu. Since each cpu receives the NMI at nearly
> the exact same time, they end up in dump_stack() at the same time.
> Without some form of locking, trace lines from different cpus end
> up intermixed, making it impossible to tell what any individual
> cpu was doing.

I've forgotten the original reasons for having the NMI go to each CPU
instead of just the boot CPU (commit 78c06176), but it seems that if you
revert that patch and have the NMI handler just call
trigger_all_cpu_backtrace() instead (which does stack trace locking for
pretty output), that would solve your problem, no? That locking is safe
because the lock is only ever taken in NMI context.

The lock you are proposing, on the other hand, can be taken in a mixture
of NMI and IRQ contexts, which I believe could cause deadlocks: if an NMI
arrives on a CPU that already holds the lock, the NMI handler spins on a
lock that its own CPU cannot release until the handler returns.

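
To illustrate the concern, here is a minimal userspace sketch in C. It is
not kernel code and not the proposed patch. The names dump_lock,
dump_trylock(), dump_unlock(), nmi_handler_would_deadlock() and
irq_dump_interrupted_by_nmi() are all hypothetical, and a try-lock stands
in for the real spinlock so that the model can report the hang instead of
actually spinning forever.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lock taken inside dump_stack(). */
static atomic_flag dump_lock = ATOMIC_FLAG_INIT;

/*
 * Try to take the lock once. Returns true on success.
 * A real spinlock would loop here instead of returning false.
 */
static bool dump_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&dump_lock,
                                              memory_order_acquire);
}

static void dump_unlock(void)
{
    atomic_flag_clear_explicit(&dump_lock, memory_order_release);
}

/*
 * Models an NMI handler that wants to dump the stack on this CPU.
 * If the lock is already held, the holder is the context this NMI
 * interrupted, which cannot run again until the handler returns.
 * A spinning lock would therefore never be released.
 */
static bool nmi_handler_would_deadlock(void)
{
    if (dump_trylock()) {
        /* Lock was free: print the trace, then release. */
        dump_unlock();
        return false;
    }
    return true;
}

/*
 * Models dump_stack() running in IRQ (or process) context on a CPU
 * that is hit by an NMI while it still holds the lock.
 */
static bool irq_dump_interrupted_by_nmi(void)
{
    bool stuck;

    if (!dump_trylock())
        return false;   /* not reached in this single-threaded model */

    /* The NMI arrives here, in the middle of the dump. */
    stuck = nmi_handler_would_deadlock();

    dump_unlock();
    return stuck;
}
```

In this model irq_dump_interrupted_by_nmi() reports that the NMI handler
would hang, while nmi_handler_would_deadlock() called on its own (lock
free) does not. A lock that is only ever taken from NMI context avoids the
first case, assuming NMIs do not nest on the same CPU.
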
Cheers,
Don