linux-kernel - Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190723171731.GD15357@ziepe.ca>
Date:   Tue, 23 Jul 2019 14:17:31 -0300
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Christoph Hellwig <hch@....de>
Cc:     Jérôme Glisse <jglisse@...hat.com>,
        Ben Skeggs <bskeggs@...hat.com>,
        Ralph Campbell <rcampbell@...dia.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "nouveau@...ts.freedesktop.org" <nouveau@...ts.freedesktop.org>,
        "dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from
 nouveau_range_fault

On Tue, Jul 23, 2019 at 06:30:48PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 03:18:28PM +0000, Jason Gunthorpe wrote:
> > Hum..
> > 
> > The caller does this:
> > 
> > again:
> > 		ret = nouveau_range_fault(&svmm->mirror, &range);
> > 		if (ret == 0) {
> > 			mutex_lock(&svmm->mutex);
> > 			if (!nouveau_range_done(&range)) {
> > 				mutex_unlock(&svmm->mutex);
> > 				goto again;
> > 
> > And we can't call nouveau_range_fault() -> hmm_range_fault() without
> > holding the mmap_sem, so we can't allow nouveau_range_fault to unlock
> > it.
> 
> Goto again can only happen if nouveau_range_fault was successful,
> in which case we did not drop mmap_sem.

Oh, right we switch from success = number of pages to success =0..

However the reason this looks so weird to me is that the locking
pattern isn't being followed, any result of hmm_range_fault should be
ignored until we enter the svmm->mutex and check if there was a
colliding invalidation.

So the 'goto again' *should* be possible even if range_fault failed.

But that is not for this patch..

> >  	ret = hmm_range_fault(range, true);
> >  	if (ret <= 0) {
> >  		if (ret == 0)
> >  			ret = -EBUSY;
> > -		up_read(&range->vma->vm_mm->mmap_sem);
> >  		hmm_range_unregister(range);
> 
> This would hold mmap_sem over hmm_range_unregister, which can lead
> to deadlock if we call exit_mmap and then acquire mmap_sem again.

That reminds me, this code is also leaking hmm_range_unregister() in
the success path, right?

I think the right way to structure this is to move the goto again and
related into the nouveau_range_fault() so the whole retry algorithm is
sensibly self contained.

Jason