[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150701111355.GA22423@thunk.org>
Date: Wed, 1 Jul 2015 07:13:55 -0400
From: Theodore Ts'o <tytso@....edu>
To: Michal Hocko <mhocko@...e.cz>
Cc: Dave Chinner <david@...morbit.com>,
Nikolay Borisov <kernel@...p.com>, linux-ext4@...r.kernel.org,
Marian Marinov <mm@...com>
Subject: Re: Lockup in wait_transaction_locked under memory pressure
On Wed, Jul 01, 2015 at 08:10:14AM +0200, Michal Hocko wrote:
> On Wed 01-07-15 08:58:51, Dave Chinner wrote:
> [...]
> > *blink*
> >
> > /me re-reads again
> >
> > That assumption is fundamentally broken. Filesystems use GFP_NOFS
> > because the filesystem holds resources that can prevent memory
> > reclaim making forwards progress if it re-enters the filesystem or
> > blocks on anything filesystem related. memcg does not change that,
> > and I'm kinda scared to learn that memcg plays fast and loose like
> > this.
> >
> > For example: IO completion might require unwritten extent conversion
> > which executes filesystem transactions and GFP_NOFS allocations. The
> > writeback flag on the pages can not be cleared until unwritten
> > extent conversion completes. Hence memory reclaim cannot wait on
> > page writeback to complete in GFP_NOFS context because it is not
> > safe to do so, memcg reclaim or otherwise.
>
> Thanks for the clarification.
Perhaps we need to make the documentation a bit more explicit?
All which is stated in include/slab.h:
* %GFP_NOIO - Do not do any I/O at all while trying to get memory.
*
* %GFP_NOFS - Do not make any fs calls while trying to get memory.
I thought this was obvious, but these flags are used by code which in
the I/O or FS paths, and so it's always possible that they are trying
to write back the page which you decide to blocking on when trying to
do the memory allocation, at which point, *boom*, deadlock.
So it's just not "do not make any FS or I/O calls", but also "the mm
layer must not not wait for any FS or I/O operations from completing,
since the operation you block on might be the one they were in the
middle of trying to complete --- or they may be holding a lock at the
time when they were trying to do a memory allocation which blocks the
I/O or FS operation from completing".
- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists