[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrXDbkotCZ1WEbhNeYGt0zyKT42agzsFxT2SZJ4wadEnQA@mail.gmail.com>
Date: Wed, 11 Jan 2017 11:20:38 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Dave Hansen <dave.hansen@...el.com>
Cc: "Kirill A. Shutemov" <kirill@...temov.name>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
X86 ML <x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Arnd Bergmann <arnd@...db.de>,
"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <ak@...ux.intel.com>,
linux-arch <linux-arch@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>
Subject: Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR
On Wed, Jan 11, 2017 at 10:49 AM, Dave Hansen <dave.hansen@...el.com> wrote:
> On 01/11/2017 10:37 AM, Kirill A. Shutemov wrote:
>>> How about preventing the max addr from being changed to too high a
>>> value while MPX is on instead of overriding the set value? This would
>>> have the added benefit that it would prevent silent failures where you
>>> think you've enabled large addresses but MPX is also on and mmap
>>> refuses to return large addresses.
>> Setting rlimit high doesn't mean that you necessary will get access to
>> full address space, even without MPX in picture. TASK_SIZE limits the
>> available address space too.
>
> OK, sure... If you want to take another mechanism into account with
> respect to MPX, we can do that. We'd just need to change every
> mechanism we want to support to ensure that it can't transition in ways
> that break MPX.
>
> What are you arguing here, though? Since we *might* be limited by
> something else that we should not care about controlling the rlimit?
>
>> I think it's consistent with other resources in rlimit: setting RLIMIT_RSS
>> to unlimited doesn't really means you are not subject to other resource
>> management.
>
> The farther we get into this, the more and more I think using an rlimit
> is a horrible idea. Its semantics aren't a great match, and you seem to
> be resistant to making *this* rlimit differ from the others when there's
> an entirely need to do so. We're already being bitten by "legacy"
> rlimit. IOW, being consistent with *other* rlimit behavior buys us
> nothing, only complexity.
Taking a step back, I think it would be fantastic if we could find a
way to make this work without any inheritable settings at all.
Perhaps we could have a per-mm value that is initialized to 2^47-1 on
execve() and can be raised by ELF note or by prctl()? Getting it
right for 32-bit would require a bit of thought. The ELF note would
make a high stack possible and, without the ELF note, we'd get a low
stack but high mmap(). Then the messy bits can be glibc's problem and
a toolchain problem as it should be, given that the only reason we
need a limit at all is because of messy userspace code.
Sure, the low stack prevents the *whole* address space from being used
in one big block for databases, but 2^57 - 2^47 ought to be good
enough.
I'm not 100% sure this is workable but, if it is, it makes everyone's
life easier. There's no need to muck around with setarch(1) or
similar hacks.
Powered by blists - more mailing lists