Message-ID: <alpine.DEB.2.00.1008251406260.20253@chino.kir.corp.google.com>
Date: Wed, 25 Aug 2010 14:11:38 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
cc: Theodore Tso <tytso@....edu>, Jens Axboe <jaxboe@...ionio.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Neil Brown <neilb@...e.de>, Alasdair G Kergon <agk@...hat.com>,
Chris Mason <chris.mason@...cle.com>,
Steven Whitehouse <swhiteho@...hat.com>,
Jan Kara <jack@...e.cz>,
Frederic Weisbecker <fweisbec@...il.com>,
"linux-raid@...r.kernel.org" <linux-raid@...r.kernel.org>,
"linux-btrfs@...r.kernel.org" <linux-btrfs@...r.kernel.org>,
"cluster-devel@...hat.com" <cluster-devel@...hat.com>,
"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
"reiserfs-devel@...r.kernel.org" <reiserfs-devel@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [patch 1/5] mm: add nofail variants of kmalloc kcalloc and
kzalloc
On Wed, 25 Aug 2010, Peter Zijlstra wrote:
> > The cpusets case is actually the easiest to fix: use GFP_ATOMIC.
>
> I don't think that's a valid usage of GFP_ATOMIC, I think we should
> fallback to outside the cpuset for kernel allocations by default.
Cpusets doesn't enforce isolation for only user memory; it has always bound
_all_ allocations that aren't atomic or in irq context (or made by oom
killed tasks). Allowing slab, for example, to be allocated in other cpusets
could cause those cpusets to oom, since they are bound by the same memory
isolation policy as every other cpuset. We'd get random oom conditions in
cpusets depending only on where the slab happened to be allocated, through
no fault of the applications themselves, and that's certainly not a
situation we want. The memory controller cgroup also has slab accounting
on its TODO list.
If you think GFP_ATOMIC is inappropriate in these contexts, then they are
by definition blockable. So this seems like a good candidate for using
memory compaction since we're talking only about PAGE_ALLOC_COSTLY_ORDER
and higher allocs, even though it's only currently configurable for
hugepages.
There's still no hard guarantee that the memory will be allocatable
(GFP_KERNEL, then compaction, then GFP_ATOMIC could all still fail), but I
don't see how continuously looping the page allocator is possibly supposed
to help in these situations.
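
For illustration only, the bounded sequence described above (a blocking
attempt, then compaction, then an atomic attempt, then give up) can be
modelled in userspace roughly as follows. This is a minimal sketch, not
kernel code: alloc_bounded(), alloc_attempt_fn, the alloc_step enum and the
two example policies are all hypothetical names made up for this example,
and malloc() merely stands in for a successful allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: none of these names are kernel APIs. */
enum alloc_step { STEP_BLOCKING, STEP_COMPACT, STEP_ATOMIC, STEP_COUNT };

/* Hypothetical hook: returns memory or NULL for the given step. */
typedef void *(*alloc_attempt_fn)(enum alloc_step step, size_t size);

/*
 * Try each step exactly once, in order, and return NULL if all of them
 * fail, instead of looping until something succeeds. The number of
 * attempts made is stored in *attempts when it is non-NULL.
 */
static void *alloc_bounded(alloc_attempt_fn attempt, size_t size,
                           int *attempts)
{
    int tries = 0;
    void *p = NULL;

    assert(attempt != NULL);

    for (int step = STEP_BLOCKING; step < STEP_COUNT; step++) {
        tries++;
        p = attempt((enum alloc_step)step, size);
        if (p)
            break;
    }

    if (attempts)
        *attempts = tries;
    return p;
}

/* Example policy: every step fails, as in a truly exhausted cpuset. */
static void *always_fail(enum alloc_step step, size_t size)
{
    (void)step;
    (void)size;
    return NULL;
}

/* Example policy: only the last (atomic) step finds memory. */
static void *only_atomic_succeeds(enum alloc_step step, size_t size)
{
    return step == STEP_ATOMIC ? malloc(size) : NULL;
}
```

With always_fail the caller gets NULL back after three attempts and has to
handle the failure itself; with only_atomic_succeeds the third attempt
returns memory. Either way the number of attempts is bounded.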