Message-Id: <1714699270-7360-1-git-send-email-prakash.sangappa@oracle.com>
Date: Thu,  2 May 2024 18:21:09 -0700
From: Prakash Sangappa <prakash.sangappa@...cle.com>
To: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc: muchun.song@...ux.dev, akpm@...ux-foundation.org, willy@...radead.org,
        prakash.sangappa@...cle.com
Subject: [RFC PATCH 0/1] Address hugetlbfs mmap behavior

This patch proposes to fix hugetlbfs mmap behavior so that the 
file size does not get updated in the mmap call. 

The current behavior is that a PROT_WRITE mmap(2) call extends the
hugetlbfs file size if the mmap size is greater than the file size. This
is not normal filesystem behavior.
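
To make the difference easy to observe, here is a minimal sketch of a
helper that maps a file with PROT_WRITE and reports the size fstat(2)
returns afterwards. The name size_after_mmap() is hypothetical and not
part of the patch. On hugetlbfs, len is assumed to be a multiple of the
huge page size.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): map `len` bytes of `path`
 * with PROT_WRITE and return the file size that fstat(2) reports after
 * the mmap(2) call, or -1 on any error.
 */
off_t size_after_mmap(const char *path, size_t len)
{
    struct stat st;
    off_t size = -1;
    void *p;
    int fd;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED) {
        /* Only the size is inspected; the mapping is never touched. */
        if (fstat(fd, &st) == 0)
            size = st.st_size;
        munmap(p, len);
    }
    close(fd);
    return size;
}
```

For an empty file on a regular filesystem the helper is expected to
return 0. For an empty hugetlbfs file, the behavior described above
would make it return len instead.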

There seems to have been very little discussion about this. A patch
discussion[1] a while back implied that the hugetlbfs file size needs
extending because of the hugetlb page reservations. It looks like this
was not merged.

It appears there is no correlation between file size and hugetlb page
reservations. Take the case of PROT_READ mmap, where the file size is
not extended even though hugetlb pages are reserved. 

On the other hand, using ftruncate(2) to increase the file size does not
reserve hugetlb pages. Also, mmap with the MAP_NORESERVE flag extends the
file size even though hugetlb pages are not reserved.
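
One way to check the cases above is to compare the HugePages_Rsvd
counter in /proc/meminfo before and after each call. The following is
only a sketch: the helper names meminfo_field() and hugepages_rsvd() are
hypothetical, and the 8 KiB buffer is assumed to be large enough for
/proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the numeric value of "key:" in
 * /proc/meminfo style text, or -1 if the key is missing or malformed.
 */
long meminfo_field(const char *text, const char *key)
{
    const char *line = text;
    size_t klen;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long val;

            if (sscanf(line + klen + 1, "%ld", &val) == 1)
                return val;
            return -1;
        }
        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Current number of reserved hugetlb pages, or -1 if unavailable. */
long hugepages_rsvd(void)
{
    char buf[8192]; /* assumed large enough for /proc/meminfo */
    size_t n;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return meminfo_field(buf, "HugePages_Rsvd");
}
```

Calling hugepages_rsvd() before and after a PROT_READ mmap, an
ftruncate(2) and a MAP_NORESERVE mmap on a hugetlbfs file should show
whether the reservation count moves together with the file size.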

Hugetlb pages get reserved (if MAP_NORESERVE is not specified) when the
hugetlbfs file is mmapped, and the reservation covers only the file's
offset,length range specified in the mmap call.

Issue:

Some applications would prefer to manage hugetlb page allocations explicitly
with fallocate(2). The hugetlbfs file would be PROT_WRITE mapped with the
MAP_NORESERVE flag and accessed only after allocating the necessary pages
using fallocate(2); the pages would be released by truncating the file. Any
stray access beyond the file size is expected to generate a signal. This does
not work properly due to the current behavior, which extends the file size in
the mmap call.
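
The usage pattern described above could look roughly like the sketch
below. All helper names are hypothetical and error handling is left to
the caller. On hugetlbfs, offsets and lengths are assumed to be
multiples of the huge page size.

```c
#define _GNU_SOURCE /* for fallocate(2) */
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Map the file writable without reserving hugetlb pages up front. */
void *map_noreserve(int fd, size_t len)
{
    assert(len > 0);
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_NORESERVE, fd, 0);
}

/* Explicitly allocate backing pages for [off, off + len). */
int alloc_range(int fd, off_t off, off_t len)
{
    return fallocate(fd, 0, off, len);
}

/* Release the pages beyond new_size by truncating the file. */
int release_to(int fd, off_t new_size)
{
    return ftruncate(fd, new_size);
}

/* File size as reported by fstat(2), or -1 on error. */
off_t current_size(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : (off_t)-1;
}
```

The application would call map_noreserve() once, then alloc_range()
before touching a range and release_to() when done. It relies on
current_size() staying at the fallocate'd size, which is what the mmap
call currently breaks on hugetlbfs.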

To address this issue, hugetlbfs behavior needs to be fixed to not extend
file size in mmap(2) call. 

However, changing the current hugetlbfs behavior could potentially break
some applications. Therefore this patch proposes a hugetlbfs mount option
to choose the mmap behavior of not extending the file size.
Use of a mount option was suggested by Matthew Wilcox.

This patch adds a 'nommapfilesz' mount option to hugetlbfs. The mount
option name can be changed if a better name is suggested.

Submitting this patch as an RFC to get feedback on the approach and on
whether there is any reason that requires the file size to be extended by
mmap in the hugetlbfs case.

[1] https://lore.kernel.org/lkml/200603081828.k28ISgg10244@unix-os.sc.intel.com/


Prakash Sangappa (1):
  hugetlbfs: Add mount option to choose normal mmap behavior

 fs/hugetlbfs/inode.c    | 19 ++++++++++++++++++-
 include/linux/hugetlb.h |  1 +
 2 files changed, 19 insertions(+), 1 deletion(-)

-- 
2.7.4

