Message-ID: <ZiwC4snk03ptUQij@casper.infradead.org>
Date: Fri, 26 Apr 2024 20:39:14 +0100
From: Matthew Wilcox <willy@...radead.org>
To: David Hildenbrand <david@...hat.com>
Cc: Pasha Tatashin <pasha.tatashin@...een.com>, akpm@...ux-foundation.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
rientjes@...gle.com, dwmw2@...radead.org, baolu.lu@...ux.intel.com,
joro@...tes.org, will@...nel.org, robin.murphy@....com,
iommu@...ts.linux.dev
Subject: Re: [RFC v2 0/3] iommu/intel: Free empty page tables on unmaps

On Fri, Apr 26, 2024 at 04:39:05PM +0200, David Hildenbrand wrote:
> On 26.04.24 15:49, Pasha Tatashin wrote:
> > On Fri, Apr 26, 2024 at 2:42 AM David Hildenbrand <david@...hat.com> wrote:
> > >
> > > On 26.04.24 05:43, Pasha Tatashin wrote:
> > > > Changelog
> > > > ================================================================
> > > > v2: Use mapcount instead of refcount
> > > > Synchronized with IOMMU Observability changes.
> > > > ================================================================
> > > >
> > > > This series frees empty page tables on unmaps. It intends to be a
> > > > low overhead feature.
> > > >
> > > > The read-writer lock is used to synchronize page table, but most of
> > > > time the lock is held is reader. It is held as a writer for short
> > > > period of time when unmapping a page that is bigger than the current
> > > > iova request. For all other cases this lock is read-only.
> > > >
> > > > page->mapcount is used in order to track number of entries at each page
> > > > table.
> > >
> > > I'm wondering if this will conflict with page_type at some point? We're
> > > already converting other page table users to ptdesc. CCing Willy.
> >
> > Hi David,
>
> Hi!
>
> >
> > This contradicts with the following comment in mm_types.h:
> > * If your page will not be mapped to userspace, you can also use the four
> > * bytes in the mapcount union, but you must call page_mapcount_reset()
> > * before freeing it.
>
> I think the documentation is a bit outdated, because we now have page types
> that are: "For pages that are never mapped to userspace"
>
> which includes
>
> #define PG_table
>
> (we should update that comment, because we're now also using it for hugetlb
> that can be mapped to user space, which is fine.)
>
> Right now, using page->_mapcount would likely still be fine, as long as you
> cannot end up creating a value that would resemble a type (e.g., PG_offline
> could be bad).
>
> But staring at users of _mapcount and page_mapcount_reset() ... you'd be
> pretty much the only user of that.
>
> mm/zsmalloc.c calls page_mapcount_reset(), and I am not completely sure why
> ... I can see it touch page->index but not page->_mapcount.
>
>
> Hopefully Willy can comment.

I feel like I have to say "no" to Pasha far too often ;-(

Agreed the documentation is out of date.

I think there's a lot of space in the struct page that can be used.
These are iommu page tables, not cpu page tables, so things are a bit
different for them. But should they be converted to use ptdesc? Maybe!

I'd suggest putting this into the union with pt_mm and pt_frag_refcount.
I think it could even go in the union with pt_list, but I think I'd
rather see it in the pt_mm union.
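
For illustration only, the suggestion above could be sketched in userspace
roughly as follows. This is a mock, not the kernel's struct ptdesc: most
fields are omitted, pt_frag_refcount is a plain int for simplicity, and
pt_nr_entries plus the mock_* helpers are hypothetical names that do not
appear in the series or in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

struct mm_struct;               /* opaque; only used as a pointer type */

/* Mock layout: shows only the union the entry counter would share. */
struct mock_ptdesc {
	unsigned long pt_flags;
	/* pt_list and the remaining ptdesc fields are omitted here */
	union {
		struct mm_struct *pt_mm;   /* used by some CPU page tables */
		int pt_frag_refcount;      /* plain int in this mock */
		int pt_nr_entries;         /* hypothetical: live entries in an
		                            * IOMMU page table page */
	};
};

/* Start tracking a freshly allocated, empty table page. */
static void mock_pt_init(struct mock_ptdesc *pt)
{
	pt->pt_flags = 0;
	pt->pt_nr_entries = 0;
}

/* Account for one entry being installed in the table. */
static void mock_pt_entry_set(struct mock_ptdesc *pt)
{
	pt->pt_nr_entries++;
}

/*
 * Account for one entry being cleared.  Returns true when the table
 * has become empty, meaning the caller may free the table page.
 */
static bool mock_pt_entry_clear(struct mock_ptdesc *pt)
{
	assert(pt->pt_nr_entries > 0);
	return --pt->pt_nr_entries == 0;
}
```

Because the counter sits in a field reserved for page-table descriptors, it
never overlaps the page_type/_mapcount word, which is the collision discussed
earlier in the thread. Real code would additionally need the locking described
in the cover letter.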