lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20070921012248.3d2e9cd9.akpm@linux-foundation.org>
Date:	Fri, 21 Sep 2007 01:22:48 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Matthias Hensler <matthias@...se.de>
Cc:	Chuck Ebbert <cebbert@...hat.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	richard kennedy <richard@....demon.co.uk>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: Processes spinning forever, apparently in lock_timer_base()?

On Fri, 21 Sep 2007 10:08:08 +0200 Matthias Hensler <matthias@...se.de> wrote:

> On Thu, Sep 20, 2007 at 03:36:54PM -0700, Andrew Morton wrote:
> > That's all a bit crappy if the wrong races happen and some other task
> > is somehow exceeding the dirty limits each time this task polls them.
> > Seems unlikely that such a condition would persist forever.
> 
> How exactly do you define forever? It looks to me, that this condition
> never resolves on its own, at least not in a window of several hours
> (the system used to get stuck around 3am and was normally rebooted
> between 7am and 8am, so at least hours are not enough to have the problem
> resolve on its own).

That's forever.

> > So the question is, why do we have large amounts of dirty pages for
> > one disk which appear to be sitting there not getting written?
> 
> Unfortunately I have no idea. The full stacktrace for all processes was
> attached to the original bugreport, maybe that can give a clue to that.
> 
> > Do we know if there's any writeout at all happening when the system is
> > in this state?
> 
> Not sure about that. The system is responsible on a still open SSH
> session, allowing several tasks still to be executed. However, that SSH
> session gets stuck very fast if the wrong commands are executed. New
> logins are not possible (Connection is akzepted but resulting in a
> timeout 60 seconds later, so most likely /bin/login is not able to log
> into wtmp).
> 
> From all that I suspect that there is no more write activity on that
> system, but cannot say for sure.

Easiest would be to run `vmstat 1' in a separate ssh session then just
leave it running.

> > Did anyone try running /bin/sync when the system is in this state?
> 
> I did not, no.

Would be interesting if poss, please.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ