Message-ID: <912f9f23-fa2a-1dd7-3f91-f7175094c2e2@nvidia.com>
Date: Mon, 18 Nov 2019 10:32:18 -0800
From: Ralph Campbell <rcampbell@...dia.com>
To: Jason Gunthorpe <jgg@...lanox.com>
CC: Christoph Hellwig <hch@....de>,
Andrew Morton <akpm@...ux-foundation.org>,
Jerome Glisse <jglisse@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
Shuah Khan <shuah@...nel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH v4 2/2] mm/hmm/test: add self tests for HMM
On 11/15/19 6:06 AM, Jason Gunthorpe wrote:
> On Thu, Nov 14, 2019 at 03:06:05PM -0800, Ralph Campbell wrote:
>>
>> On 11/13/19 5:51 AM, Christoph Hellwig wrote:
>>> On Tue, Nov 12, 2019 at 11:45:52PM +0000, Jason Gunthorpe wrote:
>>>>> Well, it would mean registering for the whole process address space.
>>>>> I'll give it a try.
>>>>
>>>> I'm not sure it makes much sense that this testing is essentially
>>>> modeled after nouveau's usage which is very strange compared to the
>>>> other drivers.
>>>
>>> Which means we really should make the test cases fit the proper usage.
>>> Maybe defer the tests for 5.5 and just merge the first patch for now?
>>>
>>
>> I think this is a good point to discuss.
>> Some devices will want to register for all changes to the process address
>> space because there is no requirement to preregister regions that the
>> device can access, versus devices like InfiniBand where a range of addresses
>> has to be registered before the device can access those addresses.
>
> But this is a very bad idea to register and do HW actions for ranges
> that can't possibly have any pages registered. It slows down the
> entire application
>
> I think the ODP approach might be saner, when it mirrors the entire
> address space it chops it up into VA chunks, and once a page is
> registered on the HW the VA chunk goes into the interval tree.
>
> Presumably the GPU also has some kind of page table tree and you could
> set one of the levels as the VA interval when there are populated children
>
> Jason
I wasn't suggesting that HW invalidates happen in two places.
I'm suggesting the two styles of invalidates can work together.
For example, what if a driver calls mmu_notifier_register(mn, mm)
to register for address-space-wide invalidations, and then some time
later there is a device page table fault and the driver calls
mmu_range_notifier_insert() with a NULL ops.invalidate?
The fault handler follows the nouveau/test_hmm pattern and calls:
    mmu_range_read_begin()
    hmm_range_fault()
    device lock
    mmu_range_read_retry()
    update device page tables
    device unlock
    mmu_range_notifier_remove()
The global invalidate() callback would take the device lock, call into
mm to update the sequence number of any affected ranges (in place of a
per-range invalidate callback), and then do the HW invalidations.
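
To make the ordering concrete, here is a rough userspace model of that flow.
It is only a sketch under stated assumptions: every model_* name, dev_lock(),
dev_unlock() and racing_snapshot() is hypothetical, nothing here calls the
kernel, and the "call into mm to update the sequence number" step is modeled
as a plain counter increment, since no such mm entry point exists in the
posted series.

```c
/* Userspace model only. All names below are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_RANGES 8

/* Stands in for a range notifier registered with a NULL ops.invalidate. */
struct model_range {
	unsigned long start, end;	/* covers [start, end) */
	atomic_ulong seq;		/* bumped only by the global invalidate */
	bool in_use;
};

struct model_device {
	atomic_flag lock;		/* stands in for the device lock */
	struct model_range ranges[MODEL_MAX_RANGES];
	unsigned long hw_invalidations;	/* simulated HW invalidates */
	unsigned long pt_updates;	/* simulated device page table updates */
};

typedef void (*snapshot_fn)(struct model_device *d, unsigned long start,
			    unsigned long end, int attempt);

static void dev_lock(struct model_device *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void dev_unlock(struct model_device *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

static void model_device_init(struct model_device *d)
{
	atomic_flag_clear(&d->lock);
	d->hw_invalidations = 0;
	d->pt_updates = 0;
	for (size_t i = 0; i < MODEL_MAX_RANGES; i++) {
		d->ranges[i].in_use = false;
		atomic_init(&d->ranges[i].seq, 0);
	}
}

/* Models the range insert step: claim a free slot for [start, end). */
static struct model_range *model_range_insert(struct model_device *d,
					      unsigned long start,
					      unsigned long end)
{
	struct model_range *r = NULL;

	assert(start < end);
	dev_lock(d);
	for (size_t i = 0; i < MODEL_MAX_RANGES && !r; i++) {
		if (!d->ranges[i].in_use) {
			r = &d->ranges[i];
			r->start = start;
			r->end = end;
			r->in_use = true;
		}
	}
	dev_unlock(d);
	return r;
}

/* Models the address-space-wide invalidate() callback. */
static void model_global_invalidate(struct model_device *d,
				    unsigned long start, unsigned long end)
{
	dev_lock(d);
	for (size_t i = 0; i < MODEL_MAX_RANGES; i++) {
		struct model_range *r = &d->ranges[i];

		if (r->in_use && r->start < end && start < r->end)
			atomic_fetch_add(&r->seq, 1);	/* forces a retry */
	}
	d->hw_invalidations++;		/* then the HW invalidation */
	dev_unlock(d);
}

/* Hook standing in for the snapshot step; races once, on attempt 1. */
static void racing_snapshot(struct model_device *d, unsigned long start,
			    unsigned long end, int attempt)
{
	if (attempt == 1)
		model_global_invalidate(d, start, end);
}

/* Fault path. Returns the number of attempts, or -1 if no slot was free. */
static int model_fault(struct model_device *d, unsigned long start,
		       unsigned long end, snapshot_fn snapshot)
{
	struct model_range *r = model_range_insert(d, start, end);
	int attempts = 0;

	if (!r)
		return -1;
	for (;;) {
		unsigned long seq = atomic_load(&r->seq);  /* read_begin */

		attempts++;
		if (snapshot)
			snapshot(d, start, end, attempts);
		dev_lock(d);
		if (atomic_load(&r->seq) == seq) {	   /* no retry needed */
			d->pt_updates++;
			dev_unlock(d);
			break;
		}
		dev_unlock(d);				   /* raced, start over */
	}
	dev_lock(d);
	r->in_use = false;				   /* range remove */
	dev_unlock(d);
	return attempts;
}
```

Calling model_fault() with a NULL hook completes in one pass; passing
racing_snapshot makes the first pass observe a changed sequence number
under the device lock and go around again before updating the page tables.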