linux-kernel - Re: [patch 0/7] cpuset writeback throttling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.0811051415360.31450@quilx.com>
Date:	Wed, 5 Nov 2008 14:21:47 -0600 (CST)
From:	Christoph Lameter <cl@...ux-foundation.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	peterz@...radead.org, rientjes@...gle.com, npiggin@...e.de,
	menage@...gle.com, dfults@....com, linux-kernel@...r.kernel.org,
	containers@...ts.osdl.org
Subject: Re: [patch 0/7] cpuset writeback throttling

On Wed, 5 Nov 2008, Andrew Morton wrote:

> > > Doable.  lru->page->mapping->host is a good start.
> >
> > The block layer has a list of inodes that are dirty. From that we need to
> > select ones that will improve the situation from the cpuset/memcg. How
> > does the LRU come into this?
>
> In the simplest case, dirty-memory throttling can just walk the LRU
> writing back pages in the same way that kswapd does.

That means running reclaim. But we are only interested in getting rid of
dirty pages. Plus the filesystem guys have repeatedly pointed out that
page sized I/O to random places in a file is not a good thing to do. There
was actually talk of stopping kswapd from writing out pages!

> There would probably be performance benefits in doing
> address_space-ordered writeback, so the dirty-memory throttling could
> pick a dirty page off the LRU, go find its inode and then feed that
> into __sync_single_inode().

We cannot call into the writeback functions for an inode from a reclaim
context. We can write back single pages but not a range of pages from an
inode due to various locking issues (see discussion on slab defrag
patchset).

> > How do I get to the LRU from the dirtied list of inodes?
>
> Don't need it.
>
> It'll be approximate and has obvious scenarios of great inaccuraracy
> but it'll suffice for the workloads which this patchset addresses.

Sounds like a wild hack that runs against known limitations in terms
of locking etc.

> It sounds like any memcg-based approach just won't be suitable for the
> people who are hitting this problem.

Why not? If you can determine which memcgs an inode has dirty pages on
then the same scheme as proposed here will work.

> But _are_ people hitting this problem?  I haven't seen any real-looking
> reports in ages.  Is there some workaround?  If so, what is it?  How
> serious is this problem now?

Are there people who are actually having memcg based solutions deployed?
No enterprise release includes it yet so I guess that there is not much of
a use yet.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/