Message-ID: <20110317171219.GD32392@redhat.com>
Date:	Thu, 17 Mar 2011 13:12:19 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Jan Kara <jack@...e.cz>
Cc:	Greg Thelen <gthelen@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	containers@...ts.osdl.org, linux-fsdevel@...r.kernel.org,
	Andrea Righi <arighi@...eler.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Daisuke Nishimura <nishimura@....nes.nec.co.jp>,
	Minchan Kim <minchan.kim@...il.com>,
	Ciju Rajan K <ciju@...ux.vnet.ibm.com>,
	David Rientjes <rientjes@...gle.com>,
	Wu Fengguang <fengguang.wu@...el.com>,
	Chad Talbott <ctalbott@...gle.com>,
	Justin TerAvest <teravest@...gle.com>,
	Curt Wohlgemuth <curtw@...gle.com>
Subject: Re: [PATCH v6 0/9] memcg: per cgroup dirty page accounting

On Thu, Mar 17, 2011 at 03:46:41PM +0100, Jan Kara wrote:

[..]
> > - bdi writeback: will revert some of the mmotm memcg dirty limit changes to
> >   fs-writeback.c so that wb_do_writeback() will return to checking
> >   wb_check_background_flush() to check background limits and being
> >   interruptible if sync flush occurs.  wb_check_background_flush() will
> >   check the global memcg_over_bg_limit list for memcg that are over their
> >   dirty limit.
> >   wb_writeback() will either (I am not sure):
> >   a) scan memcg's bdi_memcg list of inodes (only some of them are dirty)
> >   b) scan bdi dirty inode list (only some of them in memcg) using
> >      inode_in_memcg() to identify inodes to write.  inode_in_memcg(inode,memcg),
> >      would walk memcg -> memcg_bdi -> memcg_mapping to determine if the memcg
> >      is caching pages from the inode.
> Hmm, both has its problems. With a) we could queue all the dirty inodes
> from the memcg for writeback but then we'd essentially write all dirty data
> for a memcg, not only enough data to get below bg limit. And if we started
> skipping inodes when memcg(s) inode belongs to get below bg limit, we'd
> risk copying inodes there and back without reason, cases where some inodes
> never get written because they always end up skipped etc. Also the question
> whether some of the memcgs inode belongs to is still over limit is the
> hardest part of solution b) so we wouldn't help ourselves much.

Maybe I am missing something, but can't we just traverse the
memcg_over_bg_limit list and, for each memcg on it, take option a):
walk that memcg's list of inodes and write them until the group is
below its limit? We would of course skip inodes which are not dirty.

This assumes that the root group is also on that list, so that
inodes in the root group are not starved of writeback.

We still continue to have all the inodes on the bdi wb structure, and
the memcg just gives us pointers to those inodes. So for background
writeback, instead of going serially through the dirty inode list, we
first pick the cgroup to write and then the inode to write. Since we
do round robin over the cgroup list, neither the cgroups (including
root) nor their inodes are starved.
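
To make the idea concrete, here is a rough user-space sketch in Python.
It is not kernel code and not an existing patch. The names Inode, Memcg,
bg_limit, chunk and background_writeback() are all made up for
illustration, and the sketch assumes each inode belongs to exactly one
memcg and that limits are counted in pages.

```python
from collections import deque


class Inode:
    """Hypothetical stand-in for an inode that stays on the bdi wb lists."""

    def __init__(self, name, dirty_pages):
        self.name = name
        self.dirty_pages = dirty_pages


class Memcg:
    """Hypothetical memcg that only holds pointers to its inodes."""

    def __init__(self, name, bg_limit, inodes):
        self.name = name
        self.bg_limit = bg_limit      # background dirty limit, in pages
        self.inodes = deque(inodes)   # references only, not copies

    def dirty(self):
        return sum(inode.dirty_pages for inode in self.inodes)

    def over_bg_limit(self):
        return self.dirty() > self.bg_limit


def background_writeback(over_bg_list, chunk=4):
    """Round robin over the cgroups that are over their background limit.

    Each turn picks one cgroup, skips its clean inodes, writes at most
    `chunk` pages from the next dirty inode, and requeues the cgroup if
    it is still over its limit. Returns (memcg, inode, pages) tuples in
    the order the writes were issued.
    """
    queue = deque(m for m in over_bg_list if m.over_bg_limit())
    log = []
    while queue:
        memcg = queue.popleft()
        wrote = False
        # Visit each inode at most once per turn, rotating so the next
        # turn starts after the inode we just serviced.
        for _ in range(len(memcg.inodes)):
            inode = memcg.inodes[0]
            memcg.inodes.rotate(-1)
            if inode.dirty_pages > 0:
                written = min(chunk, inode.dirty_pages)
                inode.dirty_pages -= written
                log.append((memcg.name, inode.name, written))
                wrote = True
                break
        # Requeue only if progress was made and the group is still over.
        if wrote and memcg.over_bg_limit():
            queue.append(memcg)
    return log
```

The root group is treated as just another entry on the list, so it gets
the same turns as every other cgroup. Writing a bounded chunk per turn is
an assumption of the sketch; it is what keeps one large inode from
monopolizing the flusher.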

What am I missing?

Thanks
Vivek 