Message-ID: <03127895-3c5a-5182-82de-3baa3116749e@oracle.com>
Date:   Tue, 2 May 2017 16:34:18 -0700
From:   Prakash Sangappa <prakash.sangappa@...cle.com>
To:     Dave Hansen <dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH RFC] hugetlbfs 'noautofill' mount option



On 5/2/17 2:32 PM, Dave Hansen wrote:
> On 05/01/2017 11:00 AM, Prakash Sangappa wrote:
>> This patch adds a new hugetlbfs mount option 'noautofill', to indicate that
>> pages should not be allocated at page fault time when accessed thru mmapped
>> address.
> I think the main argument against doing something like this is further
> specializing hugetlbfs.  I was really hoping that userfaultfd would be
> usable for your needs here.
>
> Could you elaborate on other options that you considered?  Did you look
> at userfaultfd?  What about an madvise() option that disallows backing
> allocations?


Yes, we did consider userfaultfd and madvise(). The use case we have in
mind is a database.

A database involves a large number of single-threaded processes, each of
which maps a hugetlbfs file and uses it for shared memory. The concern
with userfaultfd is the overhead of setup and of having an additional
thread per process to monitor the userfaultfd. Even if the additional
thread can be avoided by using an external monitor process, with each
process sending its userfaultfd to that monitor, the setup overhead
still exists.

Similarly, an madvise() option also requires an additional system call
by every process mapping the file, which is considered an overhead for
the database.

If we do consider a new madvise() option, would it be acceptable, given
that it would apply specifically to hugetlbfs file mappings? If so,
would a new flag to the mmap() call itself, defining the proposed
behavior, be acceptable? That way no additional system calls would need
to be made. Again, this mmap() flag would apply specifically to
hugetlbfs file mappings.

The proposed mount option would enforce one consistent behavior, and
applications using this filesystem would not have to take additional
steps, as they would with userfaultfd or madvise().
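
For reference, the fill-on-fault behavior that the option would turn off
can be observed from user space with a small sketch like the one below.
It is an illustration under stated assumptions: it uses a memfd (shmem)
file as a stand-in for a hugetlbfs file so that it runs without a
hugetlbfs mount, and the function name blocks_added_by_touch() is
hypothetical. It touches one page inside a hole through a shared mapping
and reports how st_blocks changed.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo: returns the change in st_blocks caused by one
 * write fault inside a hole of a sparse file, or -1 on any error.
 * A memfd stands in for a hugetlbfs file (assumption for portability).
 */
long blocks_added_by_touch(void)
{
    long pg = sysconf(_SC_PAGESIZE);
    size_t len = 16 * (size_t)pg;
    struct stat before, after;
    long added = -1;

    int fd = (int)syscall(SYS_memfd_create, "autofill-demo", 0u);
    if (fd < 0)
        return -1;

    /* Extend the file without writing: every page is a hole. */
    if (ftruncate(fd, (off_t)len) == 0 && fstat(fd, &before) == 0) {
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            /* Write fault in a hole: the kernel fills in a page. */
            *(volatile char *)(p + 4 * pg) = 1;
            if (fstat(fd, &after) == 0)
                added = (long)(after.st_blocks - before.st_blocks);
            munmap(p, len);
        }
    }
    close(fd);
    return added;
}
```

A positive result means the access silently allocated backing storage
for the hole. With 'noautofill' as described in the patch, the
equivalent access to a hugetlbfs mapping would not allocate a page.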
