Message-ID: <20210209211738.GA834106@redhat.com>
Date: Tue, 9 Feb 2021 16:17:38 -0500
From: Jerome Glisse <jglisse@...hat.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Alistair Popple <apopple@...dia.com>,
Daniel Vetter <daniel@...ll.ch>, Linux MM <linux-mm@...ck.org>,
Nouveau Dev <nouveau@...ts.freedesktop.org>,
Ben Skeggs <bskeggs@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Doc Mailing List <linux-doc@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
kvm-ppc@...r.kernel.org,
dri-devel <dri-devel@...ts.freedesktop.org>,
John Hubbard <jhubbard@...dia.com>,
Ralph Campbell <rcampbell@...dia.com>
Subject: Re: [PATCH 0/9] Add support for SVM atomics in Nouveau
On Tue, Feb 09, 2021 at 09:35:20AM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 09, 2021 at 11:57:28PM +1100, Alistair Popple wrote:
> > On Tuesday, 9 February 2021 9:27:05 PM AEDT Daniel Vetter wrote:
> > > >
> > > > Recent changes to pin_user_pages() prevent the creation of pinned pages in
> > > > ZONE_MOVABLE. This series allows pinned pages to be created in ZONE_MOVABLE
> > > > as attempts to migrate may fail which would be fatal to userspace.
> > > >
> > > > In this case migration of the pinned page is unnecessary as the page can be
> > > > unpinned at anytime by having the driver revoke atomic permission as it
> > > > does for the migrate_to_ram() callback. However a method of calling this
> > > > when memory needs to be moved has yet to be resolved so any discussion is
> > > > welcome.
> > >
> > > Why do we need to pin for gpu atomics? You still have the callback for
> > > cpu faults, so you can move the page as needed, and hence a long-term pin
> > > sounds like the wrong approach.
> >
> > Technically a real long term unmoveable pin isn't required, because as you say
> > the page can be moved as needed at any time. However I needed some way of
> > stopping the CPU page from being freed once the userspace mappings for it had
> > been removed.
>
> The issue is you took the page out of the PTE it belongs to, which
> makes it orphaned and unlocatable by the rest of the mm?
>
> Ideally this would leave the PTE in place so everything continues to
> work, just disable CPU access to it.
>
> Maybe some kind of special swap entry?
>
> I also don't much like the use of ZONE_DEVICE here, that should only
> be used for actual device memory, not as a temporary proxy for CPU
> pages. Having two struct pages refer to the same physical memory is
> pretty ugly.
>
> > The normal solution of registering an MMU notifier to unpin the page when it
> > needs to be moved also doesn't work as the CPU page tables now point to the
> > device-private page and hence the migration code won't call any invalidate
> > notifiers for the CPU page.
>
> The fact the page is lost from the MM seems to be the main issue here.
>
> > Yes, I would like to avoid the long term pin constraints as well if possible I
> > just haven't found a solution yet. Are you suggesting it might be possible to
> > add a callback in the page migration logic to specially deal with moving these
> > pages?
>
> How would migration even find the page?

Migration can scan memory by physical address (see
isolate_migratepages_range()), so the CPU mapping is not the only path to
a page.

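To make the point concrete, here is a toy userspace model, not kernel
code. The names struct toy_page, toy_memmap and toy_scan_pfn_range() are
made up for illustration, and the real isolate_migratepages_range() does
far more (locking, LRU isolation, skipping unsuitable pages). The sketch
only shows that a walk over page frame numbers reaches a page whether or
not any CPU PTE still points at it:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: one entry per physical frame. */
struct toy_page {
    bool in_use;      /* frame currently holds live data */
    bool cpu_mapped;  /* some CPU PTE still points at this frame */
};

#define TOY_NR_FRAMES 8

/* Toy memory map, indexed by page frame number (pfn). */
static struct toy_page toy_memmap[TOY_NR_FRAMES];

/*
 * Walk frames [start_pfn, end_pfn) by physical address and record every
 * in-use frame. The cpu_mapped flag is deliberately never consulted: the
 * scan does not depend on a CPU mapping existing. Returns the number of
 * pfns written to out (at most out_len).
 */
static size_t toy_scan_pfn_range(size_t start_pfn, size_t end_pfn,
                                 size_t *out, size_t out_len)
{
    size_t found = 0;

    for (size_t pfn = start_pfn; pfn < end_pfn && pfn < TOY_NR_FRAMES; pfn++) {
        if (!toy_memmap[pfn].in_use)
            continue;
        if (found < out_len)
            out[found++] = pfn;
    }
    return found;
}
```

In this model a frame with in_use set and cpu_mapped clear, which is the
situation of the CPU page once its PTE has been replaced, is still
returned by the scan.
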
Cheers,
Jérôme