lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.10.1503261201440.8238@chino.kir.corp.google.com>
Date:	Thu, 26 Mar 2015 12:15:24 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Davide Libenzi <davidel@...ilserver.org>
cc:	Hugh Dickins <hughd@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Joern Engel <joern@...fs.org>,
	Jianguo Wu <wujianguo@...wei.com>,
	Eric B Munson <emunson@...mai.com>, linux-mm@...ck.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [patch][resend] MAP_HUGETLB munmap fails with size not 2MB
 aligned

On Thu, 26 Mar 2015, Davide Libenzi wrote:

> > I looked at this thread at http://marc.info/?t=141392508800001 since I 
> > didn't have it in my mailbox, and I didn't get a chance to actually run 
> > your test code.
> > 
> > In short, I think what you're saying is that
> > 
> > 	ptr = mmap(..., 4KB, ..., MAP_HUGETLB | ..., ...)
> > 	munmap(ptr, 4KB) == EINVAL
> 
> I am not sure you have read the email correctly:
> 
> munmap(mmap(size, HUGETLB), size) = EFAIL
> 
> For every size not multiple of the huge page size.
> Whereas:
> 
> munmap(mmap(size, HUGETLB), ALIGN(size, HUGEPAGE_SIZE)) = OK
> 

Yes, I read it correctly, and wrote how your test case should have failed 
above.  It fails when you do the 4KB mmap() with MAP_HUGETLB and pass 4KB 
to munmap(), correct?

I have no idea what EFAIL is, though.

> > The question you pose is whether munmap(ptr, 4KB) should succeed for a 
> > hugetlb vma and in your patch you align this to the hugepage size of the 
> > vma in the same manner that munmap(ptr, 2KB) would be aligned to PAGE_SIZE 
> > for a non-hugetlb vma.
> > 
> > The munmap() spec says the whole pages that include any part of the passed 
> > length should be unmapped.  In spirit, I would agree with you that the 
> > page size for the vma is the hugepage size so that would be what would be 
> > unmapped.
> > 
> > But that's going by a spec that doesn't address hugepages and is worded in 
> > a way that {PAGE_SIZE} is the base unit that both mmap() and munmap() is 
> > done.  It carries no notion of variable page sizes and how hugepages 
> > should be handled with respect to pages of {PAGE_SIZE} length.  So I think 
> > this is beyond the scope of the spec: any length is aligned to PAGE_SIZE, 
> > but the munmap() behavior for hugetlb vmas is not restricted.
> > 
> > It would seem too dangerous at this point to change the behavior of 
> > munmap(ptr, 4KB) on a hugetlb vma and that userspace bugs could actually 
> > arise from aligning to the hugepage size.
> 
> You mean, there is an harder failure than the current failure? :)
> 

Yes, this munmap() behavior of lengths <= hugepage_size - PAGE_SIZE for a 
hugetlb vma is long standing and there may be applications that break as a 
result of changing the behavior: a database that reserves all allocated 
hugetlb memory with mmap() so that it always has exclusive access to those 
hugepages, whether they are faulted or not, and maintains its own hugepage 
pool (which is common), may test the return value of munmap() and depend 
on it returning -EINVAL to determine if it is freeing memory that was 
either dynamically allocated or mapped from the hugetlb reserved pool.

> > Some applications purposefully reserve hugetlb pages by mmap() and never 
> > munmap() them so they have exclusive access to hugepages that were 
> > allocated either at boot or runtime by the sysadmin.  If they depend on 
> > the return value of munmap() to determine if memory to free is memory 
> > dynamically allocated by the application or reserved as hugetlb memory, 
> > then this would cause them to break.  I can't say for certain that no such 
> > application exists.
> 
> The fact that certain applications will seldomly call an API, should be no 
> reason for API to have bugs, or at the very least, a bahviour which not 
> only in not documented in the man pages, but also totally unrespectful of 
> the normal mmap/munmap semantics.

If we were to go back in time and decide this when the munmap() behavior 
for hugetlb vmas was originally introduced, that would be valid.  The 
problem is that it could lead to userspace breakage and that's a 
non-starter.

What we can do is improve the documentation and man-page to clearly 
specify the long-standing behavior so that nobody encounters unexpected 
results in the future.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ