Message-ID: <20110803190623.GA5873@redhat.com>
Date:	Wed, 3 Aug 2011 21:06:23 +0200
From:	Johannes Weiner <jweiner@...hat.com>
To:	Minchan Kim <minchan.kim@...il.com>
Cc:	Andi Kleen <ak@...ux.intel.com>, linux-mm@...ck.org,
	Dave Chinner <david@...morbit.com>,
	Christoph Hellwig <hch@...radead.org>,
	Mel Gorman <mgorman@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Wu Fengguang <fengguang.wu@...el.com>,
	Rik van Riel <riel@...hat.com>, Jan Kara <jack@...e.cz>,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 4/5] mm: writeback: throttle __GFP_WRITE on per-zone
 dirty limits

On Tue, Jul 26, 2011 at 08:40:59AM +0900, Minchan Kim wrote:
> Hi Andi,
> 
> On Tue, Jul 26, 2011 at 5:37 AM, Andi Kleen <ak@...ux.intel.com> wrote:
> >> The global dirty limits are put in proportion to the respective zone's
> >> amount of dirtyable memory and the allocation denied when the limit of
> >> that zone is reached.
> >>
> >> Before the allocation fails, the allocator slowpath has a stage before
> >> compaction and reclaim, where the flusher threads are kicked and the
> >> allocator ultimately has to wait for writeback if still none of the
> >> zones has become eligible for allocation again in the meantime.
> >>
> >
> > I don't really like this. It seems wrong to make memory
> > placement depend on dirtyness.
> >
> > Just try to explain it to some system administrator or tuner: her
> > head will explode and for good reasons.
> >
> > On the other hand I like doing round-robin in filemap by default
> > (I think that is what your patch essentially does)
> > We should have made  this default long ago. It avoids most of the
> > "IO fills up local node" problems people run into all the time.
> >
> > So I would rather just change the default in filemap allocation.

The problem exists not only at the node level but also at the zone
level.  Round-robin over the nodes does not fix the problem that a
small zone can fill up with dirty pages before the global dirty limit
kicks in.
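
To illustrate the zone-level point, here is a minimal userspace sketch
of the proportional limit described in the changelog quoted above.  It
is not the code from the patch: struct zone_sketch, the *_sketch
function names and the page counts in the usage note are assumptions
made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a zone; not the kernel's struct zone. */
struct zone_sketch {
	unsigned long dirtyable_pages;	/* pages that may hold dirty page cache */
	unsigned long dirty_pages;	/* pages that are currently dirty */
};

/*
 * Put the global dirty limit in proportion to this zone's share of
 * all dirtyable memory.
 */
static unsigned long zone_dirty_limit_sketch(const struct zone_sketch *z,
					     unsigned long global_dirty_limit,
					     unsigned long total_dirtyable)
{
	assert(total_dirtyable > 0);
	/* 64-bit intermediate so the product cannot overflow a 32-bit long */
	return (unsigned long)((unsigned long long)global_dirty_limit *
			       z->dirtyable_pages / total_dirtyable);
}

/* An allocation from the zone is denied once its limit is reached. */
static bool zone_dirty_ok_sketch(const struct zone_sketch *z,
				 unsigned long global_dirty_limit,
				 unsigned long total_dirtyable)
{
	return z->dirty_pages <
	       zone_dirty_limit_sketch(z, global_dirty_limit, total_dirtyable);
}
```

With an assumed global limit of 200000 pages over 1000000 dirtyable
pages, a zone with 4000 dirtyable pages gets a limit of 800 pages.  It
can reach that limit while the system as a whole is still far below
the global limit, regardless of how allocations are spread over nodes.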

> Just out of curiosity.
> Why do you want to consider only filemap allocation, not IO(ie,
> filemap + sys_[read/write]) allocation?

I guess Andi was referring to the page cache (mapping file offsets to
pages), rather than mmaps (mapping virtual addresses to pages).

mm/filemap.c::__page_cache_alloc()