[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFqt6zaO06gAJjWTLP4fGooLPHcm+oUJtpvhfpr6A0Zsj4PE7g@mail.gmail.com>
Date: Wed, 24 Jun 2020 22:11:08 +0530
From: Souptick Joarder <jrdr.linux@...il.com>
To: Boris Ostrovsky <boris.ostrovsky@...cle.com>
Cc: Juergen Gross <jgross@...e.com>, sstabellini@...nel.org,
xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org,
John Hubbard <jhubbard@...dia.com>, paul@....org
Subject: Re: [RFC PATCH v2] xen/privcmd: Convert get_user_pages*() to pin_user_pages*()
On Wed, Jun 24, 2020 at 9:07 PM Boris Ostrovsky
<boris.ostrovsky@...cle.com> wrote:
>
> On 6/23/20 9:36 PM, Souptick Joarder wrote:
> > On Tue, Jun 23, 2020 at 11:11 PM Boris Ostrovsky
> > <boris.ostrovsky@...cle.com> wrote:
> >> On 6/23/20 7:58 AM, Souptick Joarder wrote:
> >>> In 2019, we introduced pin_user_pages*() and now we are converting
> >>> get_user_pages*() to the new API as appropriate. [1] & [2] could
> >>> be referred for more information. This is case 5 as per document [1].
> >>>
> >>> As discussed, pages need to be marked as dirty before unpinned it.
> >>>
> >>> Previously, if lock_pages() end up partially mapping pages, it used
> >>> to return -ERRNO due to which unlock_pages() have to go through
> >>> each pages[i] till *nr_pages* to validate them. This can be avoided
> >>> by passing correct number partially mapped pages & -ERRNO separately
> >>> while returning from lock_pages() due to error.
> >>> With this fix unlock_pages() doesn't need to validate pages[i] till
> >>> *nr_pages* for error scenario.
> >>
> >> This should be split into two patches please. The first one will fix the
> >> return value bug (and will need to go to stable branches) and the second
> >> will use new routine to pin pages.
> > Initially I split the patches into 2 commits. But at last moment I
> > figure out that,
> > this bug fix ( better to call coding error, doesn't looks like lead to
> > any runtime bug) is tightly coupled to 2nd commit for
> > pin_user_pages*() conversion,
> > which means we don't need the bug fix patch if we are not converting the API to
> > pin_user_pages*()/ unpin_user_pages_dirty_lock(). That's the reason to
> > clubbed these two
> > commits into a single one.
>
>
> I am not sure I understand why the two issues are connected. Failure of
> either get_user_pages_fast() or pin_user_pages() will result in the same
> kind of overall ioctl failure, won't it?
>
Sorry, I think, I need to go through the bug again for my clarity.
My understanding is ->
First, In case of success lock_pages() returns 0, so what privcmd_ioctl_dm_op()
will return to user is depend on the return value of HYPERVISOR_dm_op()
and at last all the pages are unpinned. At present I am not clear about the
return value of HYPERVISOR_dm_op(). But this path of code looks to be correct.
second, incase of failure from lock_pages() will return either -ENOSPC / -ERRNO
receive from {get|pin|_user_pages_fast() inside for loop. (at that
point there might
be some partially mapped pages as it is mapping inside the loop). Upon
receiving
-ERRNO privcmd_ioctl_dm_op() will pass the same -ERRNO to user (not the partial
mapped page count). This looks to be correct behaviour from user space.
The problem I was trying to address is, in the second scenario when
lock_pages() returns
-ERRNO and there are already existing mapped pages. In this scenario,
when unlcok_pages()
is called it will iterate till *nr_pages* to validate and unmap the
pages. But it is supposed to
unmap only the mapped pages which I am trying to address as part of bug fix.
Let me know if I am able to be in sync with what you are expecting ?
> One concern though is that we are changing how user will see this error.
My understanding from above is user will always see right -ERRNO and
will not be impacted.
Another scenario I was thinking, if {get|pin|_user_pages_fast() return
number of pages pinned
which may be fewer than the number requested. Then lock_pages()
returns 0 to caller
and caller/user space will continue with the assumption that all pages
are pinned correctly.
Is this an expected behaviour as per current code ?
> Currently Xen devicemodel (which AFAIK is the only caller) ignores it,
> which is I think is wrong. So another option would be to fix this in Xen
> and continue returning positive number here. I guess it's back to Paul
> again.
>
>
> -boris
>
>
Powered by blists - more mailing lists