Message-ID: <20220628140459.GP23621@ziepe.ca>
Date: Tue, 28 Jun 2022 11:04:59 -0300
From: Jason Gunthorpe <jgg@...pe.ca>
To: Steven Sistare <steven.sistare@...cle.com>
Cc: Alex Williamson <alex.williamson@...hat.com>,
lizhe.67@...edance.com, cohuck@...hat.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, lizefan.x@...edance.com
Subject: Re: [PATCH] vfio: remove useless judgement

On Tue, Jun 28, 2022 at 09:54:19AM -0400, Steven Sistare wrote:
> >> As you and I have discussed, the count is also wrong in the direct
> >> exec model, because exec clears mm->locked_vm.
> >
> > Really? Yikes, I thought exec would generate a new mm?
>
> Yes, exec creates a new mm with locked_vm = 0. The old locked_vm count is dropped
> on the floor. The existing dma points to the same task, but task->mm has changed,
> and dma->task->mm->locked_vm is 0. An unmap ioctl drives it
> negative.
Oh. This is probably a bug: vfio should never use task->mm. The mm
itself should be held using mmgrab() instead.
Otherwise the exec case is broken, as you describe.
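
To make the failure concrete, here is a minimal userspace sketch of the
accounting, not kernel code. Everything in it is a toy stand-in assumed
for illustration: struct mm, struct task, struct dma, mm_grab(),
mm_drop() and locked_vm_after_exec_unmap() are hypothetical names, and
the refcount field only models the kind of reference mmgrab() takes.
It compares uncharging through task->mm after an exec with uncharging
through an mm reference held since map time.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures; only the fields needed here. */
struct mm {
    long locked_vm;   /* pages charged against the lock limit */
    int  refcount;    /* models the reference taken by mmgrab() */
};

struct task {
    struct mm *mm;    /* replaced on exec */
};

struct dma {
    struct task *task;
    struct mm   *mm;  /* mm held since map time (the suggested fix) */
    long         npages;
};

static struct mm *mm_new(void)
{
    struct mm *mm = calloc(1, sizeof(*mm));
    assert(mm != NULL);
    mm->refcount = 1;
    return mm;
}

static void mm_grab(struct mm *mm) { mm->refcount++; }

static void mm_drop(struct mm *mm)
{
    assert(mm->refcount > 0);
    if (--mm->refcount == 0)
        free(mm);
}

/* Map: charge the pages to the task's current mm and keep a reference. */
static void dma_map(struct dma *dma, struct task *task, long npages)
{
    dma->task = task;
    dma->mm = task->mm;
    mm_grab(dma->mm);
    dma->npages = npages;
    dma->mm->locked_vm += npages;
}

/* exec: the task gets a fresh mm whose locked_vm is 0. */
static void task_exec(struct task *task)
{
    struct mm *old = task->mm;
    task->mm = mm_new();
    mm_drop(old);
}

/* Broken unmap: uncharges whatever mm the task points at now. */
static long dma_unmap_via_task(struct dma *dma)
{
    dma->task->mm->locked_vm -= dma->npages;
    return dma->task->mm->locked_vm;
}

/* Unmap through the held mm: uncharges the mm that was charged. */
static long dma_unmap_via_held_mm(struct dma *dma)
{
    long left;

    dma->mm->locked_vm -= dma->npages;
    left = dma->mm->locked_vm;
    mm_drop(dma->mm);
    dma->mm = NULL;
    return left;
}

/* Map 16 pages, exec, unmap; return the locked_vm that was uncharged. */
long locked_vm_after_exec_unmap(int use_held_mm)
{
    struct task task = { .mm = mm_new() };
    struct dma dma;
    long result;

    dma_map(&dma, &task, 16);
    task_exec(&task);
    if (use_held_mm) {
        result = dma_unmap_via_held_mm(&dma);
    } else {
        result = dma_unmap_via_task(&dma);
        mm_drop(dma.mm);  /* release the unused map-time reference */
    }
    mm_drop(task.mm);
    return result;
}
```

In this model, locked_vm_after_exec_unmap(0) leaves the new mm at -16,
while locked_vm_after_exec_unmap(1) brings the originally charged mm
back to 0 and never touches the post-exec mm.
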
> I have prototyped a few possible fixes. One changes vfio to use user->locked_vm.
> Another changes to mm->pinned_vm and preserves it during exec. A third preserves
> mm->locked_vm across exec, but that is not practical, because mm->locked_vm mixes
> vfio pins and mlocks. The mlock component must be cleared during exec, and we don't
> have a separate count for it.
Losing locked_vm on exec/fork is the correct and expected behavior
for the core kernel code; the bug is that vfio drives it negative.
Jason