Message-ID: <20150319142749.GE12466@dhcp22.suse.cz>
Date: Thu, 19 Mar 2015 15:27:49 +0100
From: Michal Hocko <mhocko@...e.cz>
To: NeilBrown <neilb@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
Johannes Weiner <hannes@...xchg.org>,
Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Sage Weil <sage@...tank.com>, Mark Fasheh <mfasheh@...e.com>,
linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: Use GFP_KERNEL allocation for the page cache in page_cache_read
On Thu 19-03-15 14:55:58, Michal Hocko wrote:
> On Thu 19-03-15 08:38:35, Neil Brown wrote:
> [...]
> > Nearly half the places in the kernel which call mapping_gfp_mask() remove the
> > __GFP_FS bit.
> >
> > That suggests to me that it might make sense to have
> > mapping_gfp_mask_fs()
> > and
> > mapping_gfp_mask_nofs()
> >
> > and let the presence of __GFP_FS (and __GFP_IO) be determined by the
> > call-site rather than the filesystem.
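For illustration, a minimal userspace sketch of what such call-site helpers
could look like. Everything in it is hypothetical: the sketch_ names, the bit
values, and the assumption that the nofs variant clears only the FS bit while
the fs variant sets both FS and IO. The thread only proposes the helper names.

```c
/* Userspace sketch only; flag values are illustrative, not the kernel's. */
typedef unsigned int sketch_gfp_t;

#define SKETCH_GFP_WAIT 0x10u
#define SKETCH_GFP_IO   0x40u
#define SKETCH_GFP_FS   0x80u

/* Stand-in for struct address_space: holds the filesystem's chosen mask. */
struct sketch_mapping {
	sketch_gfp_t gfp_mask;
};

/* Hypothetical mapping_gfp_mask_fs(): the caller knows FS recursion is safe. */
static sketch_gfp_t sketch_mapping_gfp_mask_fs(const struct sketch_mapping *m)
{
	return m->gfp_mask | SKETCH_GFP_FS | SKETCH_GFP_IO;
}

/* Hypothetical mapping_gfp_mask_nofs(): the caller may hold FS locks. */
static sketch_gfp_t sketch_mapping_gfp_mask_nofs(const struct sketch_mapping *m)
{
	return m->gfp_mask & ~SKETCH_GFP_FS;
}
```

Both helpers only compute a mask for the current call and never write to the
mapping, so the stored mask stays whatever the filesystem set.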
>
> Sounds reasonable to me but filesystems tend to use this in very
> different ways.
> - xfs drops __GFP_FS in xfs_setup_inode so all page cache allocations are
> NOFS.
> - reiserfs drops __GFP_FS only before calling read_mapping_page in
> reiserfs_get_page and never restores the original mask.
> - btrfs doesn't seem to rely on mapping_gfp_mask for anything other than
> btree_inode (unless it gets inherited in a way I haven't noticed).
> - ext* doesn't seem to rely on the mapping gfp mask at all.
>
> So it is not clear to me how we should move that into the call sites. But I
> guess we can change at least the page fault path like the following. I
> like it much more than the previous way, which was too hackish.
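But this is racy instead: the gfp mask is stored in the mapping, which is
shared by everybody using the file, so a concurrent fault or any other user
of the mapping can see the temporarily widened mask or restore a stale one.
I do not think we can make it race-free, so scratch this and get back to the
original approach.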
[...]
> +	/*
> +	 * Some filesystems always drop __GFP_FS to prevent reclaim
> +	 * recursion back into FS code. This is not the case here because
> +	 * we are at the top of the call chain. Add __GFP_FS and __GFP_IO
> +	 * to avoid a premature OOM killer invocation.
> +	 */
> +	mapping_gfp = mapping_gfp_mask(mapping);
> +	mapping_set_gfp_mask(mapping, mapping_gfp | __GFP_FS | __GFP_IO);
>  	ret = vma->vm_ops->fault(vma, &vmf);
> +	mapping_set_gfp_mask(mapping, mapping_gfp);
>  	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY)))
>  		return ret;
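
To make the race concrete, here is a small userspace sketch that replays one
bad interleaving of two faults, A and B, on the same mapping. It assumes, as in
the quoted hunk, that the mask is saved in a local variable, widened in the
shared mapping, and written back after ->fault returns. All demo_ names and bit
values are made up for the illustration.

```c
/* Userspace sketch only; flag values are illustrative, not the kernel's. */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_IO 0x40u
#define DEMO_GFP_FS 0x80u

/* Shared by every task that uses the file. */
struct demo_mapping {
	demo_gfp_t gfp_mask;
};

/* Per-fault local state, like the mapping_gfp variable in the hunk. */
struct demo_fault {
	demo_gfp_t saved;
};

/* Save the current mask and widen the shared one. */
static void demo_fault_begin(struct demo_fault *f, struct demo_mapping *m)
{
	f->saved = m->gfp_mask;
	m->gfp_mask = f->saved | DEMO_GFP_FS | DEMO_GFP_IO;
}

/* Write back whatever this fault saved, stale or not. */
static void demo_fault_end(const struct demo_fault *f, struct demo_mapping *m)
{
	m->gfp_mask = f->saved;
}

/* Replay the interleaving A-begin, B-begin, A-end, B-end. */
static demo_gfp_t demo_replay_race(demo_gfp_t initial)
{
	struct demo_mapping m = { initial };
	struct demo_fault a, b;

	demo_fault_begin(&a, &m);	/* A saves the original mask */
	demo_fault_begin(&b, &m);	/* B saves the already widened mask */
	demo_fault_end(&a, &m);		/* B's ->fault now runs with the original mask */
	demo_fault_end(&b, &m);		/* B writes back the widened mask for good */
	return m.gfp_mask;
}
```

Starting from a mask without either bit, demo_replay_race() returns the mask
with both FS and IO set, so the mapping is left widened after both faults have
finished, and B's handler ran without the widening it asked for.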
--
Michal Hocko
SUSE Labs