linux-kernel - Re: [PATCH v6 2/2] Output stall data in debugfs

Message-ID: <20110811190402.GC17530@redhat.com>
Date:	Thu, 11 Aug 2011 15:04:02 -0400
From:	Don Zickus <dzickus@...hat.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Alex Neronskiy <zakmagnus@...omium.org>,
	linux-kernel@...r.kernel.org, peterz@...radead.org,
	Ingo Molnar <mingo@...e.hu>,
	Mandeep Singh Baines <msb@...omium.org>,
	Alex Neronskiy <zakmagnus@...omium.com>
Subject: Re: [PATCH v6 2/2] Output stall data in debugfs

On Thu, Aug 11, 2011 at 11:48:26AM -0700, Andi Kleen wrote:
> Alex Neronskiy <zakmagnus@...omium.org> writes:
> 
> > From: Alex Neronskiy <zakmagnus@...omium.com>
> >
> > Instead of using the log, use debugfs for output of both stall
> > lengths and stack traces. Printing to the log can result in
> > watchdog touches, 
> 
> Why? Because of printk being slow or something else?

No, because the serial console driver does a touch_nmi_watchdog().  So if
we are trying to output debug info _before_ the lockup detector goes off,
we effectively shoot ourselves in the foot by resetting the lockup detector
every time we print something.  Hence we are trying to capture the data and
output it using another interface.
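
To make the effect concrete, here is a rough userspace sketch (not kernel
code, and not taken from the patch).  Every sim_* name is made up for
illustration, and the 5-tick threshold and 3-tick reporting interval are
assumed values.  It models a stall where each early debug message, if sent
to the console, touches the watchdog as a side effect:

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sim_watchdog {
	unsigned long last_touch; /* tick of the most recent touch */
	unsigned long threshold;  /* ticks without a touch before a lockup fires */
};

/* Models the side effect of printing through the serial console. */
static void sim_touch(struct sim_watchdog *wd, unsigned long now)
{
	wd->last_touch = now;
}

static bool sim_lockup_detected(const struct sim_watchdog *wd,
				unsigned long now)
{
	return now - wd->last_touch >= wd->threshold;
}

/*
 * Simulate a stall lasting stall_ticks.  Once the stall has gone
 * report_after ticks without a touch, debug info is emitted.  If
 * log_to_console is true, emitting it touches the watchdog; otherwise
 * the data is assumed to go to a side buffer and nothing is touched.
 * Returns true if the lockup detector fires during the stall.
 */
static bool sim_run_stall(unsigned long stall_ticks,
			  unsigned long report_after,
			  bool log_to_console)
{
	struct sim_watchdog wd = { .last_touch = 0, .threshold = 5 };
	unsigned long now;

	for (now = 1; now <= stall_ticks; now++) {
		if (sim_lockup_detected(&wd, now))
			return true;
		if (log_to_console && now - wd.last_touch >= report_after)
			sim_touch(&wd, now);
	}
	return false;
}
```

With a 10-tick stall, sim_run_stall(10, 3, false) reports the lockup, while
sim_run_stall(10, 3, true) never does, because each console message pushes
the deadline out again.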

> 
> The first could be probably workarounded, especially if you
> already have "two buffers"
> 
> > distorting the very events being measured.
> > Additionally, the information will not distract from lockups
> > when users view the log.
> >
> > A two-buffer system is used to ensure that the trace information
> > can always be recorded without contention.
> 
> This implies that kernel bug reports will often not contain the 
> back trace, right? Seems like a bad thing to me because it will
> make bug reports worse.

These are debug traces.  The real lockup traces will still print to the
console as they do today.  Nothing will change from that perspective.

What these patches do is give you insight into what part of your system is
coming close to a lockup without actually causing one.  If we have the
hardlockup detector set to warn or panic after 5 seconds of no interrupts,
then these patches can give you backtraces after 3 or 4 seconds (the traced
code might re-enable interrupts after 4 seconds so no lockup occurs, but it
may still be worth noting).  It's just a way to gather heuristics on system
behaviour regarding lockups.
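
As a rough illustration of the two thresholds (a userspace sketch only; the
sim_* names and the struct layout are hypothetical, not what the patch
uses), a stall can be sorted into "ignore", "record a debug trace", or
"real lockup":

```c
/* Hypothetical names for illustration; not kernel APIs. */
enum sim_stall_class {
	SIM_STALL_NONE,      /* too short to be interesting */
	SIM_STALL_NEAR_MISS, /* record a debug trace, no warning */
	SIM_STALL_LOCKUP     /* the detector warns or panics as today */
};

struct sim_stall_policy {
	unsigned int report_secs; /* record a trace at or above this */
	unsigned int lockup_secs; /* warn or panic at or above this */
};

/* Classify a stall of stalled_secs seconds against the two thresholds. */
static enum sim_stall_class
sim_classify(const struct sim_stall_policy *p, unsigned int stalled_secs)
{
	if (stalled_secs >= p->lockup_secs)
		return SIM_STALL_LOCKUP;
	if (stalled_secs >= p->report_secs)
		return SIM_STALL_NEAR_MISS;
	return SIM_STALL_NONE;
}
```

With report_secs = 3 and lockup_secs = 5, a 4-second stall comes back as
SIM_STALL_NEAR_MISS: it gets a debug trace, but the console stays quiet.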

Cheers,
Don

> 
> -Andi
> 
> -- 
> ak@...ux.intel.com -- Speaking for myself only