[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200114125957.GO20978@mellanox.com>
Date: Tue, 14 Jan 2020 13:00:01 +0000
From: Jason Gunthorpe <jgg@...lanox.com>
To: Ralph Campbell <rcampbell@...dia.com>
CC: "linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"nouveau@...ts.freedesktop.org" <nouveau@...ts.freedesktop.org>,
"linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
Jerome Glisse <jglisse@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
Christoph Hellwig <hch@....de>,
Andrew Morton <akpm@...ux-foundation.org>,
Ben Skeggs <bskeggs@...hat.com>, Shuah Khan <shuah@...nel.org>
Subject: Re: [PATCH v6 5/6] nouveau: use new mmu interval notifiers
On Mon, Jan 13, 2020 at 02:47:02PM -0800, Ralph Campbell wrote:
> void
> nouveau_svmm_fini(struct nouveau_svmm **psvmm)
> {
> struct nouveau_svmm *svmm = *psvmm;
> + struct mmu_interval_notifier *mni;
> +
> if (svmm) {
> mutex_lock(&svmm->mutex);
> + while (true) {
> + mni = mmu_interval_notifier_find(svmm->mm,
> + &nouveau_svm_mni_ops, 0UL, ~0UL);
> + if (!mni)
> + break;
> + mmu_interval_notifier_put(mni);
Oh, now I really don't like the name 'put'. It looks like mni is
refcounted here, and it isn't. put should be called 'remove_deferred'
And then you also need a way to barrier this scheme on driver unload.
> + }
> svmm->vmm = NULL;
> mutex_unlock(&svmm->mutex);
> - mmu_notifier_put(&svmm->notifier);
While here it was actually a refcount.
> +static void nouveau_svmm_do_unmap(struct mmu_interval_notifier *mni,
> + const struct mmu_notifier_range *range)
> +{
> + struct svmm_interval *smi =
> + container_of(mni, struct svmm_interval, notifier);
> + struct nouveau_svmm *svmm = smi->svmm;
> + unsigned long start = mmu_interval_notifier_start(mni);
> + unsigned long last = mmu_interval_notifier_last(mni);
This whole algorithm only works if it is protected by the read side of
the interval tree lock. Deserves at least a comment if not an
assertion too.
> static int nouveau_range_fault(struct nouveau_svmm *svmm,
> struct nouveau_drm *drm, void *data, u32 size,
> - u64 *pfns, struct svm_notifier *notifier)
> + u64 *pfns, u64 start, u64 end)
> {
> unsigned long timeout =
> jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
> /* Have HMM fault pages within the fault window to the GPU. */
> struct hmm_range range = {
> - .notifier = ¬ifier->notifier,
> - .start = notifier->notifier.interval_tree.start,
> - .end = notifier->notifier.interval_tree.last + 1,
> + .start = start,
> + .end = end,
> .pfns = pfns,
> .flags = nouveau_svm_pfn_flags,
> .values = nouveau_svm_pfn_values,
> + .default_flags = 0,
> + .pfn_flags_mask = ~0UL,
> .pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT,
> };
> - struct mm_struct *mm = notifier->notifier.mm;
> + struct mm_struct *mm = svmm->mm;
> long ret;
>
> while (true) {
> if (time_after(jiffies, timeout))
> return -EBUSY;
>
> - range.notifier_seq = mmu_interval_read_begin(range.notifier);
> - range.default_flags = 0;
> - range.pfn_flags_mask = -1UL;
> down_read(&mm->mmap_sem);
mmap sem doesn't have to be held for the interval search, and again we
have lifetime issues with the membership here.
> + ret = nouveau_svmm_interval_find(svmm, &range);
> + if (ret) {
> + up_read(&mm->mmap_sem);
> + return ret;
> + }
> + range.notifier_seq = mmu_interval_read_begin(range.notifier);
> ret = hmm_range_fault(&range, 0);
> up_read(&mm->mmap_sem);
> if (ret <= 0) {
I'm still not sure this is a better approach than what ODP does. It
looks very expensive on the fault path..
Jason
Powered by blists - more mailing lists