Message-ID: <1307190166.23387.15.camel@t41.thuisdomein>
Date:	Sat, 04 Jun 2011 14:22:26 +0200
From:	Paul Bolle <pebolle@...cali.nl>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Jens Axboe <jaxboe@...ionio.com>,
	linux kernel mailing list <linux-kernel@...r.kernel.org>
Subject: Re: Mysterious CFQ crash and RCU

On Fri, 2011-06-03 at 09:45 -0400, Vivek Goyal wrote:
> PaulB mentioned that crash happened at May 26 10:47:07. I am wondering
> how are we able to sample the data after the crash. I am assuming
> that above data gives information only before crash and does not
> tell us anything about what happened just before crash. What am I missing.

Well, what you called a "CFQ crash" is an Oops (apparently generated by
arch/x86/mm/fault.c:show_fault_oops()). But the traces I posted to the
bugzilla.redhat.com issue for this always end with "Fixing recursive
fault but reboot is needed" (see kernel/exit.c:do_exit()). At that point
the system is still running, which is why data could still be sampled
afterwards.

Perhaps you run with panic_on_oops on by default (rumor has it that's an
RHEL default), which might make the result of this Oops surprising.
Anyhow, it turns out that my system is suspiciously happy after the
process or processes that caused this Oops have finished. See the big
friendly warning I put on top of the message in which I pasted the
output of Paul's script:

> 1) Big friendly warning: the "CFQ crash" that occurred while running
> your script didn't happen in a clean session. Not at all! It actually
> happened after (summarized a bit):
> - two "CFQ crashes" with the patch for Jens' first idea;
> - switching to deadline
> - removing cfq_iosched
> - recompiling cfq-iosched.ko (to revert Jens' patch)
> - installing cfq_iosched.ko
> - inserting cfq_iosched
> - switching back to cfq again

(Yes, putting "CFQ crash" in quotes there was a bit of legalese on my
part.)
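
For anyone who wants to check the two settings mentioned above on their
own machine, here is a minimal sketch in Python. It assumes the usual
locations /proc/sys/kernel/panic_on_oops and
/sys/block/<dev>/queue/scheduler; the device name "sda" and both helper
names are just placeholders I picked for illustration, not anything
from the bugzilla report.

```python
import re
from pathlib import Path


def panic_on_oops_enabled(text):
    """Interpret the contents of /proc/sys/kernel/panic_on_oops.

    The file holds a single integer; 0 means the kernel tries to
    continue after an Oops, non-zero means it panics.
    """
    return int(text.strip()) != 0


def active_scheduler(text):
    """Return the scheduler shown in [brackets] in the sysfs file.

    The scheduler file lists the available schedulers on one line and
    marks the active one with square brackets, e.g. "noop deadline [cfq]".
    Returns None if no bracketed entry is found.
    """
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Paths below are assumptions; "sda" is a hypothetical device name.
    oops_path = Path("/proc/sys/kernel/panic_on_oops")
    sched_path = Path("/sys/block/sda/queue/scheduler")

    if oops_path.exists():
        print("panic_on_oops:", panic_on_oops_enabled(oops_path.read_text()))
    if sched_path.exists():
        print("active scheduler:", active_scheduler(sched_path.read_text()))
```

With panic_on_oops reported as False, an Oops like the one discussed
here would leave the system running, which matches what I saw. The
scheduler check is only a convenient way to confirm which elevator is
active after switching back and forth as in the list above.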


Paul Bolle
