Message-ID: <20081029032557.GA17624@wotan.suse.de>
Date: Wed, 29 Oct 2008 04:25:57 +0100
From: Nick Piggin <npiggin@...e.de>
To: Theodore Tso <tytso@....edu>, Mike Snitzer <snitzer@...il.com>,
linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org, Kirill Korotaev <dev@...nvz.org>
Subject: Re: potential regression in ext[34] call to __page_symlink()?

On Tue, Oct 28, 2008 at 10:40:48PM -0400, Theodore Tso wrote:
> On Tue, Oct 28, 2008 at 08:11:48PM -0400, Mike Snitzer wrote:
> > The gfp_mask that is passed to __page_symlink() is being completely
> > dropped on the floor. Historically this mask was at least used by
> > ext3 and ext4 to avoid recursing back into the FS from within a
> > journal transaction; Kirill fixed that issue with this commit:
> > 0adb25d2e71ab047423d6fc63d5d184590d0a66f
> >
> > I'm quite naive when it comes to Nick's relatively new (>= 2.6.24) AOP
> > pagecache_write_{begin,end} code that motivated __page_symlink to
> > change with this commit:
> > afddba49d18f346e5cc2938b6ed7c512db18ca68
> >
> > Nick's change clearly did away with using the explicitly passed
> > gfp_mask in __page_symlink().
> > So at a minimum it would seem __page_symlink() now has an unused
> > parameter that should be removed.
> >
> > But a more serious concern would be: have ext[34]_symlink() regressed
> > to being susceptible to the bug that Kirill fixed some time ago?
>
> Yeah, I think this would be a potential problem for ext3/4. Looks
> like pagemap_write_begin() should take a gfp_mask argument, and then
> pass it down through to __grab_cache_page(), which should then call
> __page_cache_alloc() instead of _page_cache_alloc(). Then
> __page_symlink() can actually pass in its gfp_mask to
> pagemap_write_begin().
>
> Nick, do you agree?

I agree it is a problem. It's a bit hard to pass down a gfp_mask,
though, because the caller would normally expect _all_ operations in
the called code to obey the mask. That is basically impossible to do
for GFP_NOFS, because by definition we're calling into ->write_begin.

I was leaning towards adding a new AOP_FLAG_ there, usable just by
filesystem code, and just to tell any helper code to clear __GFP_FS.
That way callers won't get confused into thinking they can do
GFP_ATOMIC writes from interrupt context or something ;) (which,
trust me, somebody will attempt to do if it looks remotely feasible!)
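
To make the flag idea concrete, here is a minimal userspace sketch, not
kernel code. Every name in it (the SK_ prefixed bits, SK_AOP_FLAG_NOFS,
sk_write_begin_gfp) is hypothetical and the bit values are made up for
illustration. It only shows the shape of the approach: the caller passes
a yes/no flag rather than a whole gfp_mask, and the helper clears the FS
bit from the mask it would otherwise have used.

```c
#include <assert.h>

/* Illustrative stand-ins only; the real kernel definitions differ. */
typedef unsigned int sk_gfp_t;

#define SK_GFP_WAIT   0x1u  /* allocation may sleep */
#define SK_GFP_IO     0x2u  /* allocation may start I/O */
#define SK_GFP_FS     0x4u  /* allocation may call back into the FS */
#define SK_GFP_KERNEL (SK_GFP_WAIT | SK_GFP_IO | SK_GFP_FS)

/* Hypothetical flag: caller is a filesystem that must not be re-entered. */
#define SK_AOP_FLAG_NOFS 0x1u

/*
 * Helper-side logic: start from the mask the mapping would normally use
 * and clear only the FS bit when the caller asks for it. The caller never
 * supplies an arbitrary mask, so it cannot request something like an
 * atomic allocation through this interface.
 */
sk_gfp_t sk_write_begin_gfp(sk_gfp_t mapping_mask, unsigned int aop_flags)
{
    sk_gfp_t mask = mapping_mask;

    if (aop_flags & SK_AOP_FLAG_NOFS)
        mask &= ~SK_GFP_FS;

    return mask;
}

/* Small self-check showing the intended behaviour. */
void sk_self_check(void)
{
    /* No flag: the mask is passed through untouched. */
    assert(sk_write_begin_gfp(SK_GFP_KERNEL, 0) == SK_GFP_KERNEL);

    /* With the flag: only the FS bit is removed. */
    assert(sk_write_begin_gfp(SK_GFP_KERNEL, SK_AOP_FLAG_NOFS) ==
           (SK_GFP_WAIT | SK_GFP_IO));
}
```

Under these assumptions a symlink path would pass the flag when it is
inside a transaction, and everyone else would pass 0 and see no change
in behaviour.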