Date:	Tue, 30 Sep 2014 05:36:47 -0400
From:	Daniel Micay <danielmicay@...il.com>
To:	Andy Lutomirski <luto@...capital.net>
CC:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Linux API <linux-api@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, jasone@...onware.com
Subject: Re: [PATCH v3] mm: add mremap flag for preserving the old mapping

On 30/09/14 01:53 AM, Andy Lutomirski wrote:
> On Mon, Sep 29, 2014 at 9:55 PM, Daniel Micay <danielmicay@...il.com> wrote:
>> This introduces the MREMAP_RETAIN flag for preserving the source mapping
>> when MREMAP_MAYMOVE moves the pages to a new destination. Accesses to
>> the source location will fault and cause fresh pages to be mapped in.
>>
>> For consistency, the old_len >= new_len case could decommit the pages
>> instead of unmapping. However, userspace can accomplish the same thing
>> via madvise and a coherent definition of the flag is possible without
>> the extra complexity.
> 
> IMO this needs very clear documentation of exactly what it does.

Agreed, and thanks for the review. I'll post a slightly modified version
of the patch soon (mostly more commit message changes).

> Does it preserve the contents of the source pages?  (If so, why?
> Aren't you wasting a bunch of time on page faults and possibly
> unnecessary COWs?)

The source will act as if it was just created. For an anonymous memory
mapping, it will fault on any accesses and bring in new zeroed pages.

In jemalloc, it replaces an enormous memcpy(dst, src, size) followed by
madvise(src, size, MADV_DONTNEED) with mremap. Using mremap also ends up
eliding page faults from writes at the destination.
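
For reference, the copy-then-purge sequence that the flag is meant to
replace can be sketched roughly as below. This is only an illustration
under stated assumptions, not jemalloc's actual code: move_by_copy is a
hypothetical helper name, and the region is assumed to be a private
anonymous mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper, not jemalloc's code: grow an anonymous region by
 * copying it into a fresh mapping and then purging the source pages. */
static void *move_by_copy(void *src, size_t old_size, size_t new_size) {
  assert(new_size >= old_size);
  void *dst = mmap(NULL, new_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (dst == MAP_FAILED)
    return NULL;
  /* Reads every source page and faults in every destination page. */
  memcpy(dst, src, old_size);
  /* Drop the source pages; the mapping stays and reads back as zeros. */
  if (madvise(src, old_size, MADV_DONTNEED) != 0) {
    munmap(dst, new_size);
    return NULL;
  }
  return dst;
}
```

With the proposed flag, the mmap, memcpy and madvise calls would collapse
into a single mremap call that moves the page table entries instead of
copying the data.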

TCMalloc has nearly the same page allocation design, although it tries
to throttle the purging so it won't always gain as much.

> Does it work on file mappings?  Can it extend file mappings while it moves them?

It works on file mappings. If a move occurs, there will be the usual
extended destination mapping but with the source mapping left intact.

It wouldn't be useful with existing allocators, but in theory a general
purpose allocator could expose an MMIO API in order to reuse the same
address space via MAP_FIXED/MREMAP_FIXED to reduce VM fragmentation.

> If you MREMAP_RETAIN a partially COWed private mapping, what happens?

The original mapping is zeroed in the following test, as it would be
without fork:

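#define _GNU_SOURCE

#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Flag value this test uses for MREMAP_RETAIN; not in released headers. */
#define MREMAP_RETAIN 4

int main(void) {
  size_t size = 1024 * 1024;
  char *orig = mmap(NULL, size, PROT_READ|PROT_WRITE,
                    MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
  if (orig == MAP_FAILED)
    return 1;
  memset(orig, 5, size);
  pid_t pid = fork();
  if (pid == -1)
    return 1;
  if (pid == 0) {
    /* Child: write to the start only, so the mapping is partially COWed. */
    memset(orig, 5, 1024);
    char *new = mremap(orig, size, size * 128, MREMAP_MAYMOVE|MREMAP_RETAIN);
    /* The test requires a successful move to a new address. */
    if (new == MAP_FAILED || new == orig)
      return 1;
    /* The destination must hold the original contents... */
    for (size_t i = 0; i < size; i++)
      if (new[i] != 5)
        return 1;
    /* ...and the retained source must read back as zeros. */
    for (size_t i = 0; i < size; i++)
      if (orig[i] != 0)
        return 1;
    return 0;
  }
  /* Parent: propagate the child's exit status. */
  int status;
  if (wait(&status) == -1)
    return 1;
  if (WIFEXITED(status))
    return WEXITSTATUS(status);
  return 1;
}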

Hopefully this is the case you're referring to. :)

> Does it work on special mappings?  If so, please prevent it from doing
> so.  mremapping x86's vdso is a thing, and duplicating x86's vdso
> should not become a thing, because x86_32 in particular will become
> extremely confused.

I'll add a check for arch_vma_name(vma) == NULL.

There's an existing check for VM_DONTEXPAND | VM_PFNMAP when expanding
allocations (the only case this flag impacts). Are there other kinds of
special mappings that you're referring to?