Message-Id: <20120214152220.4f621975.akpm@linux-foundation.org>
Date:	Tue, 14 Feb 2012 15:22:20 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Andrea Righi <andrea@...terlinux.com>
Cc:	Minchan Kim <minchan.kim@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Johannes Weiner <jweiner@...hat.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Shaohua Li <shaohua.li@...el.com>,
	Pádraig Brady <P@...igBrady.com>,
	John Stultz <john.stultz@...aro.org>,
	Jerry James <jamesjer@...terlinux.com>,
	Julius Plenz <julius@...nz.com>, linux-mm <linux-mm@...ck.org>,
	linux-fsdevel@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RFC] [PATCH v5 0/3] fadvise: support POSIX_FADV_NOREUSE

On Tue, 14 Feb 2012 23:59:22 +0100
Andrea Righi <andrea@...terlinux.com> wrote:

> On Tue, Feb 14, 2012 at 01:33:37PM -0800, Andrew Morton wrote:
> > On Sun, 12 Feb 2012 01:21:35 +0100
> > Andrea Righi <andrea@...terlinux.com> wrote:
> > 
> > > The new proposal is to implement POSIX_FADV_NOREUSE as a way to perform a real
> > > drop-behind policy where applications can mark certain intervals of a file as
> > > FADV_NOREUSE before accessing the data.
> > 
> > I think you and John need to talk to each other, please.  The amount of
> > duplication here is extraordinary.
> 
> Yes, definitely. I'm currently reviewing and testing John's patch
> set. I was even considering applying my patch set on top of John's
> patch, or at least proposing my tree-based approach to manage the list of
> the POSIX_FADV_VOLATILE ranges.

Cool.
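
For illustration, the userspace side of the proposal quoted above (marking a
file range as POSIX_FADV_NOREUSE before reading it) might look like the
following minimal sketch. It uses Python's os.posix_fadvise wrapper; the
helper name stream_noreuse and the 1 MiB chunk size are assumptions made for
this example, not anything from either patch set. The call is only a hint:
whether the kernel actually applies a drop-behind policy is exactly what the
patches under discussion would change.

```python
import os

CHUNK = 1 << 20  # 1 MiB read size; an arbitrary choice for this sketch


def stream_noreuse(path, chunk=CHUNK):
    """Read a file once, front to back, after hinting that the data
    will not be reused. Returns the number of bytes read.

    Hypothetical helper for illustration only.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0, len=0 covers the range from the start to the end of file.
        # This is advice only; the kernel is free to ignore it.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_NOREUSE)
        while True:
            buf = os.read(fd, chunk)
            if not buf:
                break
            total += len(buf)
    finally:
        os.close(fd)
    return total
```

Note that this is the kind of source change the reply below is wary of: every
streaming application would have to add such a call itself.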

> > 
> > Both patchsets add fields to the address_space (and hence inode), which
> > is significant - we should convince ourselves that we're getting really
> > good returns from a feature which does this.
> > 
> > 
> > 
> > Regarding the use of fadvise(): I suppose it's a reasonable thing to do
> > in the long term - if the feature works well, popular data streaming
> > applications will eventually switch over.  But I do think we should
> > explore interfaces which don't require modification of userspace source
> > code.  Because there will always be unconverted applications, and with
> > such an interface the feature becomes available to them immediately.
> > 
> > One such interface would be to toss the offending application into a
> > container which has a modified drop-behind policy.  And here we need to
> > drag out the crystal ball: what *is* the best way of tuning application
> > pagecache behaviour?  Will we gravitate towards containerization, or
> > will we gravitate towards finer-tuned fadvise/sync_page_range/etc
> > behaviour?  Thus far it has been the latter, and I don't think that has
> > been a great success.
> > 
> > Finally, are the problems which prompted these patchsets already
> > solved?  What happens if you take the offending streaming application
> > and toss it into a 16MB memcg?  That *should* avoid perturbing other
> > things running on that machine.
> 
> Moving the streaming application into a 16MB memcg can be dangerous in
> some cases... the application might start to do "bad" things, like
> swapping (if the memcg can swap) or just fail due to OOMs.

Well OK, maybe there are problems with the current implementation.  But
are they unfixable problems?  Is the right approach to give up on ever
making containers useful for this application and to instead go off and
implement a new and separate feature?
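
The "16MB memcg" experiment mentioned above could be set up roughly as in the
sketch below. It assumes the cgroup v1 memory controller layout, where a group
directory contains memory.limit_in_bytes and tasks files; the helper name
confine_to_memcg and any group path passed to it are hypothetical. On a real
system this needs a mounted memory cgroup hierarchy and sufficient privileges,
and, as noted in the quoted reply, the confined task may swap or hit OOM.

```python
import os


def confine_to_memcg(cgroup_dir, pid, limit_bytes=16 * 1024 * 1024):
    """Cap a memory cgroup (v1 layout assumed) and attach one task to it.

    Hypothetical helper for illustration only. Returns the limit written.
    """
    # On a real cgroup filesystem, creating the directory creates the group.
    os.makedirs(cgroup_dir, exist_ok=True)

    # cgroup v1 memory controller: hard limit for the group, in bytes.
    with open(os.path.join(cgroup_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))

    # cgroup v1: writing a PID to "tasks" moves that task into the group.
    with open(os.path.join(cgroup_dir, "tasks"), "w") as f:
        f.write(str(pid))

    return limit_bytes
```

For example, confine_to_memcg("/sys/fs/cgroup/memory/streamer", pid) would
target a group named "streamer"; that mount point and group name are
assumptions, and the application itself needs no source changes.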

> > And yes, a container-based approach is pretty crude, and one can
> > envision applications which only want modified reclaim policy for one
> > particular file.  But I suspect an application-wide reclaim policy
> > solves 90% of the problems.
> 
> I really like the container-based approach. But for this we need a
> better file cache control in the memory cgroup; now we have the
> accounting of file pages, but there's no way to limit them.

Again, if/when memcg becomes sufficiently useful for this application
we're left maintaining the obsolete POSIX_FADV_NOREUSE forever.

