Message-Id: <20101018145859.eee1ae33.akpm@linux-foundation.org>
Date: Mon, 18 Oct 2010 14:58:59 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Wu Fengguang <fengguang.wu@...el.com>
Cc: Neil Brown <neilb@...e.de>, Rik van Riel <riel@...hat.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"Li, Shaohua" <shaohua.li@...el.com>
Subject: Re: Deadlock possibly caused by too_many_isolated.
On Tue, 19 Oct 2010 00:15:04 +0800
Wu Fengguang <fengguang.wu@...el.com> wrote:
> Neil found that if too_many_isolated() returns true while performing
> direct reclaim, we can end up waiting for other threads to complete
> their direct reclaim. If those threads are allowed to enter the FS or
> IO layers to free memory, but this thread is not, then it is possible
> that those threads will be waiting on this thread and so we get a
> circular deadlock.
>
> some task enters direct reclaim with GFP_KERNEL
> => too_many_isolated() false
> => vmscan and run into dirty pages
> => pageout()
> => take some FS lock
> => fs/block code does GFP_NOIO allocation
> => enter direct reclaim again
> => too_many_isolated() true
> => waiting for others to progress, however the other
> tasks may be circularly waiting for the FS lock..
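For context, the waiting in that last step happens in shrink_inactive_list(),
which throttles direct reclaimers roughly as in the sketch below (paraphrased
from the mm/vmscan.c of this period; details may differ from the exact source):

  /*
   * Sketch of the throttle loop: a direct reclaimer that sees "too many"
   * pages already isolated off the LRU sleeps and retries, waiting for
   * *other* reclaimers to put their isolated pages back.  If those other
   * reclaimers are in turn blocked on an FS lock held by this task,
   * nobody makes progress.
   */
  while (unlikely(too_many_isolated(zone, file, sc))) {
          congestion_wait(BLK_RW_ASYNC, HZ/10);

          /* We are about to die and free our memory. Return now. */
          if (fatal_signal_pending(current))
                  return SWAP_CLUSTER_MAX;
  }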
>
> The fix is to give !__GFP_IO and !__GFP_FS direct reclaims higher
> priority than normal ones, by granting them a higher throttle threshold.
>
> Now !GFP_IOFS reclaims won't wait for GFP_IOFS reclaims to progress.
> They will be blocked only when there are too many concurrent !GFP_IOFS
> reclaims, which is very unlikely because IO-less direct reclaims are
> able to progress much faster, and they won't deadlock each other. The
> threshold is raised high enough for them, so that there can be
> sufficient parallel progress of !GFP_IOFS reclaims.
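To make the proposal concrete, below is a minimal sketch of a GFP-aware
too_many_isolated(). The helper names are those used in mm/vmscan.c around
this time, but the exact shape of the check and the ">> 3" factor applied
for GFP_IOFS callers are assumptions for illustration, not the literal patch:

  static int too_many_isolated(struct zone *zone, int file,
                               struct scan_control *sc)
  {
          unsigned long inactive, isolated;

          /* kswapd is never throttled here */
          if (current_is_kswapd())
                  return 0;

          if (file) {
                  inactive = zone_page_state(zone, NR_INACTIVE_FILE);
                  isolated = zone_page_state(zone, NR_ISOLATED_FILE);
          } else {
                  inactive = zone_page_state(zone, NR_INACTIVE_ANON);
                  isolated = zone_page_state(zone, NR_ISOLATED_ANON);
          }

          /*
           * Normal (GFP_IOFS-capable) reclaimers throttle at the usual
           * threshold; !__GFP_IO/!__GFP_FS reclaimers entered from the
           * FS/block layer only throttle once far more pages are
           * isolated, so they never wait on reclaimers that may in turn
           * be waiting on their locks.  The shift here is illustrative.
           */
          if ((sc->gfp_mask & GFP_IOFS) == GFP_IOFS)
                  inactive >>= 3;

          return isolated > inactive;
  }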
I'm not sure that this is really a full fix. Torsten's analysis does
appear to point at the real bug: raid1 has code paths which allocate
more than a single element from a mempool without starting IO against
previous elements.
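As a schematic of that pattern (hypothetical names, not the actual raid1 code):

  /*
   * mempool_alloc() with a blocking gfp_mask only guarantees forward
   * progress if previously allocated elements are on their way back to
   * the pool, i.e. their IO has already been submitted.  Taking a second
   * element while still holding an un-issued first one can sleep forever
   * once the pool and the page allocator are both exhausted, if every
   * other writer is stuck at the same point.
   */
  first  = mempool_alloc(pool, GFP_NOIO);  /* will eventually succeed */
  second = mempool_alloc(pool, GFP_NOIO);  /* may wait for 'first' to be
                                              returned, which cannot happen:
                                              its IO has not been started */
  submit_io(first);                        /* hypothetical submit step */
  submit_io(second);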
Giving these allocations the ability to dip further into reserves will
make the bug less likely to occur, but if enough threads all do this at
the same time, that reserve will be exhausted and we're back to square
one?