Message-ID: <2357788.X5UHX7WJZF@xorhgos3.pefnos>
Date:	Tue, 04 Nov 2014 11:36:15 +0200
From:	"P. Christeas" <xrg@...ux.gr>
To:	Vlastimil Babka <vbabka@...e.cz>
Cc:	linux-mm@...ck.org, Joonsoo Kim <iamjoonsoo.kim@....com>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

On Tuesday 04 November 2014, Vlastimil Babka wrote:
> Please do keep testing (and see below what we need), and don't try
> another tree - it's 3.18 we need to fix!
Let me apologize for, and warn you about, the poor quality of this report and
its debug data.
It comes from a system meant for everyday desktop usage, not kernel development.
Thus, it is tuned mostly for performance and is only "slightly" debuggable.

> I'm not sure what you mean by "race" here and your snippet is
> unfortunately just a small portion of the output ...

It is a shot in the dark. The system becomes non-responsive (narrowed down to
desktop apps waiting on each other, or X+kwin blocking); I can feel the CPU
heating up and /sometimes/ see disk I/O.

No BUG, Oops or any other kernel message. (Is printk level 4 adequate?)

Then, I try to drop to a console and collect as much data as possible with 
SysRq.

The snippet I sent you is from the all-cpus-backtrace (l), taken to see which
traces appear consistently during the lockup. There are also the huge traces
from "task-states" (t), but I reckon they are too noisy.
That trace also matches the usage profile, because AFAICG[uess] the issue
appears when allocating during I/O load.
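
To make "which traces appear consistently" less of a guess, something along
these lines could be run over a saved kernel log that contains several SysRq-l
dumps. This is only a rough sketch: the dump marker string, the frame format
matched by the regex, and the sample log (with placeholder symbol names) are all
assumptions for illustration, not data from this report.

```python
import re
from collections import Counter

# Assumed marker that starts each SysRq "l" dump in the saved log.
DUMP_MARKER = "SysRq : Show backtrace of all active CPUs"

# Assumed frame format: " [<ffffffff81000002>] ? symbol_name+0x20/0x90"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def symbols_per_dump(log_text):
    """Return one set of symbol names for each SysRq-l dump in the log."""
    dumps = log_text.split(DUMP_MARKER)[1:]  # drop text before the first dump
    return [set(FRAME_RE.findall(dump)) for dump in dumps]


def consistent_symbols(log_text, min_fraction=1.0):
    """List symbols seen in at least min_fraction of the dumps."""
    dumps = symbols_per_dump(log_text)
    if not dumps:
        return []
    counts = Counter(sym for dump in dumps for sym in dump)
    needed = min_fraction * len(dumps)
    return sorted(sym for sym, n in counts.items() if n >= needed)


# Synthetic sample with placeholder symbol names (hypothetical, for illustration).
SAMPLE = """
[  100.000000] SysRq : Show backtrace of all active CPUs
[  100.000100]  [<ffffffff81000001>] ? placeholder_alloc+0x10/0x80
[  100.000200]  [<ffffffff81000002>] placeholder_scan+0x20/0x90
[  130.000000] SysRq : Show backtrace of all active CPUs
[  130.000100]  [<ffffffff81000002>] placeholder_scan+0x24/0x90
[  130.000200]  [<ffffffff81000003>] placeholder_idle+0x08/0x30
"""

if __name__ == "__main__":
    print(consistent_symbols(SAMPLE))
```

Lowering min_fraction (e.g. to 0.5) would also list functions that show up in
only some of the dumps, which may help when the lockup moves between CPUs.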

After turning on full preemption, I have been able to terminate/kill all tasks
and continue with the same kernel but a new userspace.

> OK so the process is not dead due to the problem? That probably rules
> out some kinds of errors but we still need the full output. Thanks in
> advance. 
> I'm not aware of this, CCing lkml for wider coverage.

Thank you. As I said in the first mail, this is an early report of a possible
3.18 regression. I'm trying to narrow down the case and either make it
reproducible or get a good trace.

Attached is my current .config



Download attachment "config-3.18.gz" of type "application/gzip" (35515 bytes)
