Message-ID: <CA+55aFxxfbCuTjnK_TpxrTftQOXeTi4PBawbv27P_Xqz4Y5ibw@mail.gmail.com>
Date: Tue, 6 Oct 2015 09:49:19 +0100
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Michal Hocko <mhocko@...nel.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
David Rientjes <rientjes@...gle.com>,
Oleg Nesterov <oleg@...hat.com>,
Kyle Walker <kwalker@...hat.com>,
Christoph Lameter <cl@...ux.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov@...allels.com>,
linux-mm <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Stanislav Kozina <skozina@...hat.com>
Subject: Re: can't oom-kill zap the victim's memory?
On Tue, Oct 6, 2015 at 8:55 AM, Eric W. Biederman <ebiederm@...ssion.com> wrote:
>
> Not to take away from your point about very small allocations. However,
> assuming allocations larger than a page will always succeed is
> downright dangerous.
We've required retrying for *at least* order-1 allocations, exactly
because things like fork() etc. have wanted them, and:
- as you say, you can be unlucky even with reasonable amounts of free memory
- the page-out code is approximate and doesn't guarantee that you get
buddy coalescing
- just failing after a couple of loops has been known to result in
fork() and similar friends returning -EAGAIN and breaking user space.
Really. Stop this idiocy. We have gone through this before. It's a disaster.
The basic fact remains: kernel allocations are so important that
rather than fail, you should kill user space. Only kernel allocations
that *explicitly* know that they have fallback code should fail, and
they should just use __GFP_NORETRY.
So the rule ends up being that we retry the memory freeing loop for
small allocations (where "small" is something like "order 2 or less").
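
A minimal sketch of that rule as a predicate, written as a userspace C model
rather than the kernel's actual page-allocator logic. The name
should_retry_alloc, the noretry parameter (standing in for __GFP_NORETRY), and
the SMALL_ORDER_MAX cutoff of 2 are assumptions for illustration; the message
only says "something like order 2 or less".

```c
#include <stdbool.h>

/* Hypothetical cutoff: the largest order still treated as "small". */
#define SMALL_ORDER_MAX 2u

/*
 * Model of the retry rule: an allocation of 2^order pages keeps going
 * through the memory freeing loop unless the caller opted out.
 *
 * order   - allocation order (0 = one page, 1 = two pages, ...)
 * noretry - true if the caller has fallback code and accepts failure
 */
static bool should_retry_alloc(unsigned int order, bool noretry)
{
    if (noretry)
        return false;               /* caller handles failure itself */

    return order <= SMALL_ORDER_MAX; /* small orders retry by default */
}
```

Under these assumptions, should_retry_alloc(1, false) is true (the fork()-style
case keeps retrying), while should_retry_alloc(1, true) and
should_retry_alloc(3, false) are both false.
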
So really. If you find some particular case that is painful because it
wants an order-1 or order-2 allocation, then you do this:
- do the allocation with __GFP_NORETRY
- have a fallback that uses vmalloc or is just able to make the
buffer even smaller.
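
The two steps above can be sketched roughly as follows. This is a userspace C
model, not kernel code: alloc_noretry is a hypothetical stand-in for an
allocation made with __GFP_NORETRY (it may return NULL instead of retrying),
and its limit parameter is an invented knob that simulates fragmented memory.
alloc_shrinking illustrates the "make the buffer even smaller" fallback; a
vmalloc-style fallback would replace the shrinking loop with a second
allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a no-retry allocation: requests larger than
 * `limit` bytes fail immediately, everything else goes to malloc().
 */
static void *alloc_noretry(size_t size, size_t limit)
{
    return size > limit ? NULL : malloc(size);
}

/*
 * Try the preferred size first. On failure, halve the request until it
 * succeeds or would drop below min_size. The size actually obtained is
 * stored in *got (0 on failure).
 */
static void *alloc_shrinking(size_t want, size_t min_size, size_t limit,
                             size_t *got)
{
    assert(min_size > 0 && min_size <= want);

    for (size_t size = want; size >= min_size; size /= 2) {
        void *buf = alloc_noretry(size, limit);
        if (buf) {
            *got = size;
            return buf;
        }
    }

    *got = 0;
    return NULL; /* even the minimum failed; the caller must cope */
}
```

The point of the pattern is that only the caller knows whether a smaller
buffer is acceptable, so the failure handling lives with the caller and not in
the allocator's default path.
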
But by default we will continue to make small orders retry. As
mentioned, we have tried the alternatives. It doesn't work.
Linus