Message-ID: <CALCETrV_qejd-Ozqo4vTqz=LuukMUPeQ7EVUQbfTxs_xNbO3oQ@mail.gmail.com>
Date:   Mon, 2 Jan 2017 22:08:28 -0800
From:   Andy Lutomirski <luto@...capital.net>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        X86 ML <x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, Andi Kleen <ak@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>,
        linux-arch <linux-arch@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will.deacon@....com>
Subject: Re: [RFC, PATCHv2 29/29] mm, x86: introduce RLIMIT_VADDR

On Mon, Jan 2, 2017 at 12:44 AM, Arnd Bergmann <arnd@...db.de> wrote:
> On Tuesday, December 27, 2016 4:54:13 AM CET Kirill A. Shutemov wrote:
>> As with other resources you can set the limit lower than current usage.
>> It would affect only future virtual address space allocations.

I still don't buy all these use cases:

>>
>> Use-cases for new rlimit:
>>
>>   - Bumping the soft limit to RLIM_INFINITY, allows current process all
>>     its children to use addresses above 47-bits.

OK, I get this, but only as a workaround for programs that make
assumptions about the address space and don't use some mechanism (to
be designed?) to work correctly in spite of a larger address space.

>>
>>   - Bumping the soft limit to RLIM_INFINITY after fork(2), but before
>>     exec(2) allows the child to use addresses above 47-bits.

Ditto.

>>
>>   - Lowering the hard limit to 47-bits would prevent current process all
>>     its children to use addresses above 47-bits, unless a process has
>>     CAP_SYS_RESOURCE.

I've tried and I can't imagine any reason to do this.

>>
>>   - It’s also can be handy to lower hard or soft limit to arbitrary
>>     address. User-mode emulation in QEMU may lower the limit to 32-bit
>>     to emulate 32-bit machine on 64-bit host.

I don't understand.  QEMU user-mode emulation intercepts all syscalls.
What QEMU would *actually* want is a way to say "allocate me some
memory with the high N bits clear".  mmap-via-int80 on x86 should be
fixed to do this, but a new syscall with an explicit parameter would
work, as would a prctl changing the current limit.
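
To make "allocate me some memory with the high N bits clear" concrete,
here is a minimal userspace sketch of what an emulator has to do today
without kernel help. mmap_below() is a hypothetical helper name, not an
existing kernel or libc interface. It assumes 64-bit Linux and relies
only on the documented behaviour that a non-MAP_FIXED address argument
is a hint, so the result must be checked afterwards.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a real API): map `len` bytes of anonymous
 * memory and succeed only if the whole mapping ends at or below `limit`.
 * The kernel is free to ignore the hint, so the returned address is
 * verified and the mapping is dropped if it landed too high.
 */
static void *mmap_below(uintptr_t limit, size_t len)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);

    assert(len > 0 && len <= limit);

    /* Highest page-aligned start address that still fits below the limit. */
    uintptr_t hint = (limit - len) & ~(page - 1);

    void *p = mmap((void *)hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* p + len <= limit, written so that it cannot overflow. */
    if ((uintptr_t)p > limit - len) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

The check-and-discard step is the weak point: nothing forces the kernel
to honour the hint, which is why an explicit limit parameter would be a
cleaner interface than this kind of retry logic.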

>>
>> TODO:
>>   - port to non-x86;
>>
>> Not-yet-signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> Cc: linux-api@...r.kernel.org
>
> This seems to nicely address the same problem on arm64, which has
> run into the same issue due to the various page table formats
> that can currently be chosen at compile time.

On further reflection, I think this has very little to do with paging
formats except insofar as paging formats make us notice the problem.
The issue is that user code wants to be able to assume an upper limit
on an address, and it gets an upper limit right now that depends on
architecture due to paging formats.  But someone really might want to
write a *portable* 64-bit program that allocates memory with the high
16 bits clear.  So let's add such a mechanism directly.
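
As an illustration of why user code cares about the high 16 bits, here
is a minimal pointer-tagging sketch. It is not taken from any particular
program; the names tag_pointer(), untag_pointer() and pointer_tag() are
made up for this example. It assumes a 64-bit target where every
user-space address fits in the low 48 bits, which is exactly the
assumption that larger address spaces would break.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT 48
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Nonzero if the pointer has its high 16 bits clear. */
static int high16_clear(const void *p)
{
    return ((uint64_t)(uintptr_t)p >> TAG_SHIFT) == 0;
}

/* Pack a 16-bit tag into the unused high bits of a pointer. */
static uint64_t tag_pointer(const void *p, uint16_t tag)
{
    /* This is the assumption that fails once addresses can exceed 48 bits. */
    assert(high16_clear(p));
    return ((uint64_t)tag << TAG_SHIFT) | (uint64_t)(uintptr_t)p;
}

/* Recover the original pointer by masking off the tag. */
static void *untag_pointer(uint64_t v)
{
    return (void *)(uintptr_t)(v & ADDR_MASK);
}

/* Recover the tag stored in the high 16 bits. */
static uint16_t pointer_tag(uint64_t v)
{
    return (uint16_t)(v >> TAG_SHIFT);
}
```

A program built this way works only as long as every allocation it tags
comes back with the high bits clear, so it needs a way to ask for that
guarantee instead of inferring it from the architecture.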

As a thought experiment, what if x86_64 simply never allocated "high"
(above 2^47-1) addresses unless a new mmap-with-explicit-limit syscall
were used?  Old glibc would continue working.  Old VMs would work.
New programs that want to use ginormous mappings would have to use the
new syscall.  This would be totally stateless and would have no issues
with CRIU.

If necessary, we could also have a prctl that changes a
"personality-like" limit that is in effect whenever the old mmap is used.
I say "personality-like" because it would reset under exactly the same
conditions that personality resets itself.
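
The two ideas above can be put together in a toy model of how the
address search ceiling might be chosen. This is only a sketch of the
proposal, not kernel code: struct task_model, mmap_ceiling() and
task_model_reset() are hypothetical names, an explicit limit of 0 stands
in for "the old mmap was used", and the conditions under which the reset
happens are deliberately not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Default ceiling for old mmap: the 47-bit x86_64 user address limit. */
#define LEGACY_CEILING (UINT64_C(1) << 47)

/* Toy model of the per-task state (hypothetical, for illustration only). */
struct task_model {
    uint64_t task_size;       /* exclusive upper bound the paging format allows */
    uint64_t default_ceiling; /* "personality-like" limit used by old mmap */
};

/* Would be called whenever personality resets; when that is is not modelled. */
static void task_model_reset(struct task_model *t)
{
    t->default_ceiling = LEGACY_CEILING;
}

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/*
 * Ceiling for one mapping request. explicit_limit == 0 models the old
 * mmap, which falls back to the per-task default. A nonzero value models
 * the new call with an explicit limit, which ignores the default and is
 * capped only by what the hardware can address.
 */
static uint64_t mmap_ceiling(const struct task_model *t, uint64_t explicit_limit)
{
    uint64_t want = explicit_limit ? explicit_limit : t->default_ceiling;

    return min_u64(want, t->task_size);
}
```

In this model a request with an explicit limit depends only on its own
arguments and the hardware bound, which is the stateless property
claimed for the new syscall. Only the optional prctl default adds
per-task state.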

Thoughts?
