Date:	Fri, 29 Apr 2011 10:02:36 -0500
From:	James Bottomley <James.Bottomley@...e.de>
To:	Mel Gorman <mgorman@...e.de>
Cc:	Jan Kara <jack@...e.cz>, colin.king@...onical.com,
	Chris Mason <chris.mason@...cle.com>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-ext4 <linux-ext4@...r.kernel.org>, mgorman@...ell.com
Subject: Re: [BUG] fatal hang untarring 90GB file, possibly writeback
 related.

On Thu, 2011-04-28 at 21:27 +0100, Mel Gorman wrote:
> On Thu, Apr 28, 2011 at 02:59:27PM -0500, James Bottomley wrote:
> > On Thu, 2011-04-28 at 20:21 +0100, Mel Gorman wrote:
> > > On Thu, Apr 28, 2011 at 01:30:36PM -0500, James Bottomley wrote:
> > > > > Way hey, cgroups are also in the mix. How jolly.
> > > > > 
> > > > > Is systemd a common element of the machines hitting this bug by any
> > > > > chance?
> > > > 
> > > > Well, yes, the bug report is against FC15, which needs cgroups for
> > > > systemd.
> > > > 
> > > 
> > > Ok although we do not have direct evidence that it's the problem yet. A
> > > broken shrinker could just mean we are also trying to aggressively
> > > reclaim in cgroups.
> > > 
> > > > > The remaining traces seem to be follow-on damage related to the three
> > > > > issues of "shrinkers are bust in some manner" causing "we are not
> > > > > getting over the min watermark" and as a side-show "we are spending lots
> > > > > of time doing something unspecified but unhelpful in cgroups".
> > > > 
> > > > Heh, well find a way for me to verify this: I can't turn off cgroups
> > > > because systemd then won't work and the machine won't boot ...
> > > > 
> > > 
> > > Same testcase, same kernel but a distro that is not using systemd to
> > > verify if cgroups are the problem. Not ideal I know. When I'm back
> > > online Tuesday, I'll try reproducing this on a !Fedora distribution. In
> > > the meantime, the following untested hatchet job might spit out
> > > which shrinker we are getting stuck in. It is also breaking out of
> > > the shrink_slab loop so it'd even be interesting to see if the bug
> > > is mitigated in any way.
> > 
> > Actually, talking to Chris, I think I can get the system up using
> > init=/bin/bash without systemd, so I can try the no cgroup config.
> > 
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index c74a501..ed99104 100644
> > 
> > In the meantime, this patch produces:
> > 
> > (that's nothing ... apparently the trace doesn't activate when kswapd
> > goes mad).
> > 
> 
> Or is looping there for shorter than we expect. HZ/10?

Still doesn't print anything, even with HZ/10.

James


