[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230717185222.GB4741@monkey>
Date: Mon, 17 Jul 2023 11:52:22 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, David Hildenbrand <david@...hat.com>,
Rik van Riel <riel@...riel.com>,
Michal Hocko <mhocko@...e.com>, Peter Xu <peterx@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: mprotect and hugetlb mappings
On 07/17/23 18:19, Vlastimil Babka wrote:
> On 7/6/23 01:53, Mike Kravetz wrote:
> > On 07/06/23 00:22, Matthew Wilcox wrote:
> >> On Wed, Jul 05, 2023 at 04:08:08PM -0700, Mike Kravetz wrote:
> >> > I was recently asked about the behavior of mprotect on a hugetlb
> >> > mapping where addr or addr+len is not hugetlb page size aligned. As
> >> > one might expect, EINVAL is returned in such cases. However, the man
> >> > page makes no mention of alignment requirements for hugetlb mappings.
> >> >
> >> > I am happy to submit man page updates if people agree this is the correct
> >> > behavior. We might even want to check alignment earlier in the code
> >> > path as we fail when trying to split the vma today.
> >> >
> >> > An alternative behavior would be to operate on whole hugetlb pages within
> >> > the range addr - addr+len.
> >>
> >> After a careful re-reading of the mprotect() man page, I suggest the
> >> following behaviour ...
> >>
> >> addr must be a multiple of the hpage size. Otherwise -EINVAL.
> >> len should be rounded up to hpage size.
> >>
> >> I wonder how likely this change would be to break userspace code.
> >> Maybe some test cases.
> >
> > My concern is that this is the approach I took with huegtlb MADV_DONTNEED,
> > and this caused problems discussed and eventually modified here:
> > https://lore.kernel.org/linux-mm/20221021154546.57df96db@imladris.surriel.com/
> >
> > In the MADV_DONTNEED case we were throwing away data. With mprotect we are
> > only modifying access to data.
>
> That can still confuse some userspace, no? I think realistically we can only
> document the current implementation better, maybe improve it without
> changing observed behavior as you suggested wrt the split vma fail. But
> changing it would be dangerous.
Thanks for the comments Vlastimil.
That would be my thought/preferred path forward. Simply document the
current behavior, and MAYBE update code to be more explicit.
Any other thoughts?
--
Mike Kravetz
Powered by blists - more mailing lists