[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8986d97-3933-8fa7-abba-aabd67924bc2@google.com>
Date: Sat, 26 Feb 2022 19:00:37 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: Mike Kravetz <mike.kravetz@...cle.com>,
Matthew Wilcox <willy@...radead.org>
cc: Hugh Dickins <hughd@...gle.com>, cgel.zte@...il.com,
kirill@...temov.name, songliubraving@...com,
akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, yang.yang29@....com.cn,
wang.yong12@....com.cn, Zeal Robot <zealci@....com.cn>
Subject: Re: [PATCH linux-next] Fix shmem huge page failed to set F_SEAL_WRITE
attribute problem
On Thu, 17 Feb 2022, Matthew Wilcox wrote:
> On Wed, Feb 16, 2022 at 05:25:17PM -0800, Hugh Dickins wrote:
> > On Wed, 16 Feb 2022, Mike Kravetz wrote:
> > > On 2/14/22 23:37, cgel.zte@...il.com wrote:
...
> > > > @@ -39,8 +40,12 @@ static void memfd_tag_pins(struct xa_state *xas)
> > > > xas_for_each(xas, page, ULONG_MAX) {
> > > > if (xa_is_value(page))
> > > > continue;
> > > > +
> > > > page = find_subpage(page, xas->xa_index);
> > > > - if (page_count(page) - page_mapcount(page) > 1)
> > > > + count = page_count(page);
> > > > + if (PageTransCompound(page))
> > >
> > > PageTransCompound() is true for hugetlb pages as well as THP. And, hugetlb
> > > pages will not have a ref per subpage as THP does. So, I believe this will
> > > break hugetlb seal usage.
> >
> > Yes, I think so too; and that is not the only issue with the patch
> > (I don't think page_mapcount is enough, I had to use total_mapcount).
Mike, we had the same instinctive reaction to seeing a PageTransCompound
check in code also exposed to PageHuge pages; but in fact that seems to
have worked correctly - those hugetlbfs pages are hard to predict!
But it was not working on pte maps of THPs.
> >
> > It's a good find, and thank you WangYong for the report.
> > I found the same issue when testing my MFD_HUGEPAGE patch last year,
> > and devised a patch to fix it (and keep MFD_HUGETLB working) then; but
> > never sent that in because there wasn't time to re-present MFD_HUGEPAGE.
> >
> > I'm currently retesting my patch: just found something failing which
> > I thought should pass; but maybe I'm confused, or maybe the xarray is
> > working differently now. I'm rushing to reply now because I don't want
> > others to waste their own time on it.
>
> I did change how the XArray works for THP recently.
>
> Kirill's original patch stored:
>
> 512: p
> 513: p+1
> 514: p+2
> ...
> 1023: p+511
>
> A couple of years ago, I changed it to store:
>
> 512: p
> 513: p
> 514: p
> ...
> 1023: p
>
> And in January, Linus merged the commit which changes it to:
>
> 512-575: p
> 576-639: (sibling of 512)
> 640-703: (sibling of 512)
> ...
> 960-1023: (sibling of 512)
>
> That is, I removed a level of the tree and store sibling entries
> rather than duplicate entries. That wasn't for fun; I needed to do
> that in order to make msync() work with large folios. Commit
> 6b24ca4a1a8d has more detail and hopefully can inspire whatever
> changes you need to make to your patch.
Matthew, thanks for the very detailed info, you shouldn't have taken
so much trouble over it: I knew you had done something of that kind,
and yes, that's where my suspicion lay at the time of writing. But
you'll be relieved to know that the patch I wrote before your changes
turned out to be unaffected, and just as valid after your changes.
"just found something failing which I thought should pass" was me
forgetting, again and again, just how limited are the allowed
possibilities for F_SEAL_WRITE when mmaps are outstanding.
One thinks that a PROT_READ, MAP_SHARED mapping would be allowed;
but of course all the memfds are automatically O_RDWR, so mprotect
(no sealing hook) allows it to be changed to PROT_READ|PROT_WRITE,
so F_SEAL_WRITE is forbidden on any MAP_SHARED mapping: only allowed
on MAP_PRIVATEs.
I'll now re-read the commit message I wrote before, update if necessary,
and then send to Andrew, asking him to replace the one in this thread.
Hugh
Powered by blists - more mailing lists