Date:	Fri, 5 Feb 2016 09:11:46 +0100
From:	Daniel Bilik <daniel.bilik@...system.cz>
To:	Mike Galbraith <umgwanakikbuti@...il.com>
Cc:	Jan Kara <jack@...e.cz>, Thomas Gleixner <tglx@...utronix.de>,
	Tejun Heo <tj@...nel.org>, Michal Hocko <mhocko@...nel.org>,
	Jiri Slaby <jslaby@...e.cz>, Petr Mladek <pmladek@...e.com>,
	Sasha Levin <sasha.levin@...cle.com>, Shaohua Li <shli@...com>,
	LKML <linux-kernel@...r.kernel.org>, stable@...r.kernel.org
Subject: Re: Crashes with 874bbfe600a6 in 3.18.25

On Fri, 05 Feb 2016 03:40:46 +0100
Mike Galbraith <umgwanakikbuti@...il.com> wrote:

> On Thu, 2016-02-04 at 17:39 +0100, Daniel Bilik wrote:
> > On Thu, 4 Feb 2016 12:20:44 +0100
> > Jan Kara <jack@...e.cz> wrote:
> > 
> > > Thanks for the backport, Thomas, and to Mike for persistence :). I've
> > > asked my friend seeing crashes with 3.18.25 to try whether this
> > > patch fixes the issues. It may take some time so stay tuned...
> > 
> > Patch tested and it really fixes the crash we were experiencing on
> > 3.18.25 with commit 874bbfe+. But it seems to introduce a (rather scary)
> > regression. The tested host shows abnormal CPU usage in both kernel and
> > userland under the same load and traffic pattern. One picture is worth
> > a thousand words, so I've taken snapshots of our graphs, see here:
> > http://neosystem.cz/test/linux-3.18.25/
> > The host was running 3.18.25 with commit 874bbfe+ (1e7af29+ on
> > 3.18-stable) reverted. With this commit included, it crashed within
> > minutes. Around 13:30 we booted 3.18.25 with commit 874bbfe+ included
> > and with the patch from Thomas. And around 15:40 we booted the host
> > with the previous kernel, just to ensure this abnormal behaviour was
> > really caused by the test kernel.
> > Also interesting: in addition to high CPU usage, there is an abnormally
> > high number of zombie processes reported by the system.
> 
> IMHO you should restore the CC list and re-post.  (If I were the
> maintainer of either the workqueue code or 3.18-stable, I'd be highly
> interested in this finding).

Sorry, I hadn't realized the patch proposed by Thomas is already on its
way to stable. CC restored and re-posting.

--
						Daniel Bilik
						neosystem.cz
