lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFxxfbCuTjnK_TpxrTftQOXeTi4PBawbv27P_Xqz4Y5ibw@mail.gmail.com>
Date:	Tue, 6 Oct 2015 09:49:19 +0100
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Michal Hocko <mhocko@...nel.org>,
	Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
	David Rientjes <rientjes@...gle.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Kyle Walker <kwalker@...hat.com>,
	Christoph Lameter <cl@...ux.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Vladimir Davydov <vdavydov@...allels.com>,
	linux-mm <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Stanislav Kozina <skozina@...hat.com>
Subject: Re: can't oom-kill zap the victim's memory?

On Tue, Oct 6, 2015 at 8:55 AM, Eric W. Biederman <ebiederm@...ssion.com> wrote:
>
> Not to take away from your point about very small allocations.  However
> assuming allocations larger than a page will always succeed is down
> right dangerous.

We've required retrying for *at least* order-1 allocations. Exactly
because things like fork() etc have wanted them, and:

 - as you say, you can be unlucky even with reasonable amounts of free memory

 - the page-out code is approximate and doesn't guarantee that you get
buddy coalescing

 - just failing after a couple of loops has been known to result in
fork() and similar friends returning -EAGAIN and breaking user space.

Really. Stop this idiocy. We have gone through this before. It's a disaster.

The basic fact remains: kernel allocations are so important that
rather than fail, you should kill user space. Only kernel allocations
that *explicitly* know that they have fallback code should fail, and
they should just do the __GFP_NORETRY.

So the rule ends up being that we retry the memory freeing loop for
small allocations (where "small" is something like "order 2 or less")

So really. If you find some particular case that is painful because it
wants an order-1 or order-2 allocation, then you do this:

 - do the allocation with GFP_NORETRY

 - have a fallback that uses vmalloc or just is able to make the
buffer even smaller.

But by default we will continue to make small orders retry. As
mentioned, we have tried the alternatives. It doesn't work.

            Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ