Message-ID: <20220204010612.GO1786498@nvidia.com>
Date: Thu, 3 Feb 2022 21:06:12 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: John Hubbard <jhubbard@...dia.com>
Cc: Lukas Bulwahn <lukas.bulwahn@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>,
Alex Williamson <alex.williamson@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
David Hildenbrand <david@...hat.com>, Jan Kara <jack@...e.cz>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: Weird code with change "mm/gup: clean up follow_pfn_pte()
slightly"
On Thu, Feb 03, 2022 at 04:59:56PM -0800, John Hubbard wrote:
> On 2/3/22 16:45, Jason Gunthorpe wrote:
> > On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
> > > On 2/3/22 05:01, Jason Gunthorpe wrote:
> > > ...
> > > > > > In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
> > > > > > out. However, at the label out, the value of page is not used, but the
> > > > > > return uses the variables i and ret.
> > > > >
> > > > > Yes, I think that the complaint is accurate. The intent of this code is
> > > > > to return either number of pages so far (i) or ret (which should be zero
> > > > > in this case), because we are just stopping early, rather than calling
> > > > > this an actual error.
> > > >
> > > > IIRC GUP shouldn't return 0, it should return an error code, not zero.
> > > >
> > > > Jason
> > >
> > > Errors work for single pages, but GUP is a multi-page API call. If it
> > > returned an error part way through the list of pages, then callers would
> > > have no way of knowing how many pages to release.
> >
> > Yes, but that is returning a positive error code, I said it should not
> > return zero.
> >
> > When it hits an error with pages already loaded it returns that number
> > and the caller will then do gup once more with the VA pointing at the
> > problematic page. Then GUP can return the error code because it has 0
> > pages on the next iteration.
> >
> > It should not return 0 here when it got an error.
>
> This is perhaps better API design, but it's not what exists now.

I think it is what exists today, 0 certainly is not implemented as
'need retry' anywhere I found.

So why do we return 0, if it means an error, instead of returning the
actual errno?

> The call sites today handle 0 pages ret value correctly,

This isn't correct though:

	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
		return -EFAULT;

If GUP wanted the caller to permanently fail with -EFAULT, it should
have directly returned -EFAULT.

0 means 'to be retried', whatever that means, and there is no retry
in the above.

IOW, the above does not handle a 0 return correctly, according to the
comment.
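
A minimal userspace sketch of the return convention argued for in this
thread: a positive count when some pages were pinned, and a negative errno
once the problematic page is the first one requested. This is not kernel
code. fake_gup(), pin_all() and struct fake_mm are made-up names for
illustration only, and "pages" are plain indices.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an address space: nr_total pages exist and
 * index bad_idx cannot be pinned (-1 means every page is fine).
 */
struct fake_mm {
	long nr_total;
	long bad_idx;
};

/*
 * Hypothetical stand-in for GUP. Returns the number of pages pinned
 * (> 0), or a negative errno when not even the first page could be
 * pinned. It never returns 0 for nr_pages > 0.
 */
static long fake_gup(const struct fake_mm *mm, long start, long nr_pages,
		     long *pages)
{
	long i;

	for (i = 0; i < nr_pages; i++) {
		long idx = start + i;

		/* Partial count first; the errno comes on the next call. */
		if (idx >= mm->nr_total || idx == mm->bad_idx)
			return i ? i : -EFAULT;
		pages[i] = idx;
	}
	return i;
}

/*
 * Caller loop: call again at the first unpinned address until all pages
 * are pinned or an errno comes back. *pinned tells the caller how many
 * pages it has to release on failure.
 */
static long pin_all(const struct fake_mm *mm, long start, long nr_pages,
		    long *pages, long *pinned)
{
	*pinned = 0;
	while (*pinned < nr_pages) {
		long ret = fake_gup(mm, start + *pinned, nr_pages - *pinned,
				    pages + *pinned);

		if (ret < 0)
			return ret;
		assert(ret > 0);	/* a 0 return would spin here forever */
		*pinned += ret;
	}
	return 0;
}
```

With page 3 marked bad, fake_gup(&mm, 0, 8, pages) returns 3, and the
follow-up call starting at page 3 returns -EFAULT, so the caller knows both
the errno and how many pages it has to release.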
Jason