Date:	Tue, 30 Aug 2011 14:19:15 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Linux-MM <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>,
	XFS <xfs@....sgi.com>, Dave Chinner <david@...morbit.com>,
	Christoph Hellwig <hch@...radead.org>,
	Johannes Weiner <jweiner@...hat.com>,
	Wu Fengguang <fengguang.wu@...el.com>, Jan Kara <jack@...e.cz>,
	Rik van Riel <riel@...hat.com>,
	Minchan Kim <minchan.kim@...il.com>
Subject: Re: [PATCH 0/7] Reduce filesystem writeback from page reclaim v3

On Thu, Aug 18, 2011 at 04:54:20PM -0700, Andrew Morton wrote:
> On Wed, 10 Aug 2011 11:47:13 +0100
> Mel Gorman <mgorman@...e.de> wrote:
> 
> > The new problem is that
> > reclaim has very little control over how long before a page in a
> > particular zone or container is cleaned which is discussed later.
> 
> Confused - where was this discussed?  Please tell us more about
> this problem and how it was addressed.
> 

This text really referred to V2 of the series where kswapd was not
writing back pages. This led to problems on NUMA as described in
https://lkml.org/lkml/2011/7/21/242 . I should have updated the text to
read

"There is a potential new problem as reclaim has less control over
how long before a page in a particular zone or container is cleaned
and direct reclaimers depend on kswapd or flusher threads to do
the necessary work. However, as filesystems sometimes ignore direct
reclaim requests already, it is not expected to be a serious issue"

> Another (and somewhat interrelated) potential problem I see with this
> work is that it throws a big dependency onto kswapd.  If kswapd gets
> stuck somewhere for extended periods, there's nothing there to perform
> direct writeback. 

In theory, this is true. In practice, btrfs and ext4 are already
ignoring requests from direct reclaim and have been for some
time. btrfs is particularly bad in that it also ignores requests
from kswapd leading me to believe that we are eventually going to
see stall-related bug reports on large NUMA machines with btrfs.
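
The reason ignoring kswapd as well is worse is that the reclaim context
is visible to the filesystem and cheap to test, so honouring kswapd
while refusing direct reclaim is straightforward. Something along these
lines (again just a sketch, not taken from either filesystem):

#include <linux/swap.h>		/* current_is_kswapd() */
#include <linux/writeback.h>

/*
 * Sketch: decide whether to honour a reclaim-initiated writeback
 * request. A filesystem that refuses direct reclaim but cooperates
 * with kswapd returns current_is_kswapd() here; one that refuses both
 * leaves dirty pages in a zone waiting on the flusher threads alone.
 */
static bool example_honour_reclaim_writeback(struct writeback_control *wbc)
{
	if (!wbc->for_reclaim)
		return true;		/* flusher/sync writeback, always honour */

	return current_is_kswapd();	/* honour kswapd, refuse direct reclaim */
}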

> This has happened in the past in weird situations
> such as kswapd getting blocked on ext3 journal commits which are
> themselves stuck for ages behind lots of writeout which itself is stuck
> behind lots of reads.  That's an advantage of direct reclaim: more
> threads available.

I do not know the details of those situations, but is it possible they
were due to too many direct reclaimers starving kswapd of access to the
journal?

> How forcefully has this stuff been tested with multiple disks per
> kswapd? 

As heavily as I could on the machine I had available. This was 4 disks
for one kswapd instance. I did not spot major problems.

> Where one disk is overloaded-ext3-on-usb-stick?
> 

I tested with ext4 on a USB stick, not ext3. It completed faster and the
interactive performance felt roughly the same.

-- 
Mel Gorman
SUSE Labs