Date:	Tue, 15 Jun 2010 17:45:16 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Christoph Hellwig <hch@...radead.org>
Cc:	Mel Gorman <mel@....ul.ie>, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	Nick Piggin <npiggin@...e.de>, Rik van Riel <riel@...hat.com>
Subject: Re: [RFC PATCH 0/6] Do not call ->writepage[s] from direct reclaim
 and use a_ops->writepages() where possible

On Tue, Jun 15, 2010 at 11:25:26AM -0400, Christoph Hellwig wrote:
> hand can happen from context that already is say 4 or 6 kilobytes
> into stack usage.  And the callchain from kmalloc() into ->writepage

Mel's 5k stack trace was still not realistic, as it doesn't call
writepage there. I was just asking about the 6k example vs. msync.

Plus shrinking the dcache/inodes may also invoke I/O and end up with
all those stack hogs.

> I've never seen the stack overflow detector trigger on this, but I've
> seen lots of real life stack overflows on the mailing lists.  End
> users don't run with it enabled normally, and most testing workloads
> don't seem to hit direct reclaim enough to actually trigger this
> reproducibly.

How do you know it's a stack overflow if the stack overflow detector
didn't fire before the fact? It could usually be bad RAM too.

> Which is a lot more complicated than loading off the page cleaning
> from direct reclaim to dedicated threads - be that the flusher threads
> or kswapd.

More complicated for sure. But I certainly like that more than vetoing
->writepage from VM context, especially if the veto is a fs decision.
The fs shouldn't decide that.

> It allows the system to survive in case direct reclaim is called instead
> of crashing with a stack overflow.  And at least in my testing the
> VM seems to cope rather well with not being able to write out
> filesystem pages from direct reclaim.  That doesn't mean that this
> behaviour can't be further improved on.

Agreed. It seems to work ok for me too, but it may hide VM issues and
it makes the VM less reliable against potential false positive OOMs.
It's better if we just teach the VM to switch stacks before invoking
the freeing methods, so that it automatically also solves the case of
dcache/icache collection ending up writing data, etc.

Then, if we don't want to call ->writepage, it will be for other
reasons, but we can solve the stack problem in a generic and reliable
way that covers not just ->writepage but all sources of I/O, including
swapout over iscsi, vfs, etc.