Message-ID: <c0f57bfc-d34b-a34b-4f2d-0d66782e4ae7@nvidia.com>
Date: Thu, 3 Feb 2022 16:59:56 -0800
From: John Hubbard <jhubbard@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Lukas Bulwahn <lukas.bulwahn@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>,
Alex Williamson <alex.williamson@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
David Hildenbrand <david@...hat.com>, Jan Kara <jack@...e.cz>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: Weird code with change "mm/gup: clean up follow_pfn_pte() slightly"
On 2/3/22 16:45, Jason Gunthorpe wrote:
> On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
>> On 2/3/22 05:01, Jason Gunthorpe wrote:
>> ...
>>>>> In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
>>>>> out. However, at the label out, the value of page is not used, but the
>>>>> return uses the variables i and ret.
>>>>
>>>> Yes, I think that the complaint is accurate. The intent of this code is
>>>> to return either number of pages so far (i) or ret (which should be zero
>>>> in this case), because we are just stopping early, rather than calling
>>>> this an actual error.
>>>
>>> IIRC GUP shouldn't return 0, it should return an error code, not zero.
>>>
>>> Jason
>>
>> Errors work for single pages, but GUP is a multi-page API call. If it
>> returned an error part way through the list of pages, then callers would
>> have no way of knowing how many pages to release.
>
> Yes, but that is returning a positive error code, I said it should not
> return zero.
>
> When it hits an error with pages already loaded it returns that number
> and the caller will then do gup once more with the VA pointing at the
> problematic page. Then GUP can return the error code because it has 0
> pages on the next iteration.
>
> It should not return 0 here when it got an error.
That may be the better API design, but it is not what exists now. Today's
call sites already handle a return value of 0 pages correctly, and there
are a lot of them. Is this worth changing?

Also, to be clear, are you proposing just handling zero as a special case,
or something more extensive? Because once we are N pages in, someone has
to unpin those pages, and so far that has been up to the caller.
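
To make the two conventions concrete, here is a minimal userspace sketch,
not kernel code. mock_gup() is a hypothetical stand-in for
get_user_pages() that follows the documented contract quoted in this
message (number of pages pinned so far, or -errno when none were pinned).
pin_range() is a hypothetical all-or-nothing caller: after a short return
it calls again at the first unpinned page, as Jason describes, and it
releases whatever it already holds when that call reports an error.
Treating a 0 return as -EFAULT is this sketch's assumption.

```c
#include <assert.h>
#include <errno.h>

#define MOCK_NPAGES 8

/* Hypothetical fixture: index of the page that cannot be pinned, or -1. */
static int mock_bad_page = -1;
/* Hypothetical per-page pin counts, so leaked pins can be detected. */
static int mock_pincount[MOCK_NPAGES];

/*
 * Hypothetical stand-in for get_user_pages(); NOT the kernel function.
 * Pins pages [start, start + nr) until it reaches mock_bad_page.
 * Returns the number pinned, or -EFAULT if the very first page fails.
 */
static long mock_gup(int start, int nr, int *pages)
{
	long i;

	for (i = 0; i < nr; i++) {
		int idx = start + (int)i;

		if (idx >= MOCK_NPAGES || idx == mock_bad_page)
			return i ? i : -EFAULT;
		mock_pincount[idx]++;
		pages[i] = idx;
	}
	return i;
}

/* Release n pages previously returned by mock_gup(). */
static void mock_unpin(const int *pages, long n)
{
	long i;

	for (i = 0; i < n; i++) {
		assert(mock_pincount[pages[i]] > 0);
		mock_pincount[pages[i]]--;
	}
}

/*
 * Hypothetical caller: pin all of [start, start + nr) or nothing.
 * A short return is followed by another call at the first unpinned
 * page; that call either makes progress or returns the error code.
 */
static long pin_range(int start, int nr, int *pages)
{
	long done = 0;

	while (done < nr) {
		long ret = mock_gup(start + (int)done, nr - (int)done,
				    pages + done);

		if (ret <= 0) {
			/* The caller owns the pages pinned so far. */
			mock_unpin(pages, done);
			return ret ? ret : -EFAULT;
		}
		done += ret;
	}
	return done;
}
```

With mock_bad_page set to 2, mock_gup(0, 4, pages) returns 2, the follow-up
mock_gup(2, 2, pages + 2) returns -EFAULT, and pin_range(0, 4, pages)
returns -EFAULT with every pin count back at zero.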
>
>> * Returns either number of pages pinned (which may be less than the
>> * number requested), or an error. Details about the return value:
>> *
>> * -- If nr_pages is 0, returns 0.
>> * -- If nr_pages is >0, but no pages were pinned, returns -errno.
>> * -- If nr_pages is >0, and some pages were pinned, returns the number of
>> * pages pinned. Again, this may be less than nr_pages.
>> * -- 0 return value is possible when the fault would need to be retried.
>
> I actually don't know of any place that handles the 0 return code, or
> what 'fault would need to be retried' is supposed to mean for the
> caller ...
>
There are quite a few places that handle a 0 return, and they understand
that it is an error for their case. For example:
static int non_atomic_pte_lookup(struct vm_area_struct *vma,
				 unsigned long vaddr, int write,
				 unsigned long *paddr, int *pageshift)
{
	struct page *page;

#ifdef CONFIG_HUGETLB_PAGE
	*pageshift = is_vm_hugetlb_page(vma) ? HPAGE_SHIFT : PAGE_SHIFT;
#else
	*pageshift = PAGE_SHIFT;
#endif
	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
		return -EFAULT;
	*paddr = page_to_phys(page);
	put_page(page);
	return 0;
}
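
As a rough illustration of why the `<= 0` test above is enough for a
one-page request, the documented return values can be sorted into four
cases. The enum and both helpers below are hypothetical names invented
for this sketch; nothing like them exists in mm/gup.c.

```c
#include <errno.h>

/* Hypothetical labels for the documented GUP return values. */
enum gup_outcome {
	GUP_ERROR,	/* ret < 0: -errno, no pages were pinned */
	GUP_ZERO,	/* ret == 0: nr_pages was 0, or the fault needs a retry */
	GUP_PARTIAL,	/* 0 < ret < nr_pages: caller must release ret pages */
	GUP_COMPLETE,	/* ret == nr_pages: everything requested is pinned */
};

/* Classify a return value against the number of pages requested. */
static enum gup_outcome classify_gup_ret(long nr_pages, long ret)
{
	if (ret < 0)
		return GUP_ERROR;
	if (ret == 0)
		return GUP_ZERO;
	if (ret < nr_pages)
		return GUP_PARTIAL;
	return GUP_COMPLETE;
}

/*
 * For a one-page request there is no partial case, so both GUP_ERROR
 * and GUP_ZERO mean "no page to use". This mirrors the `<= 0` check.
 */
static int single_page_failed(long ret)
{
	enum gup_outcome o = classify_gup_ret(1, ret);

	return o == GUP_ERROR || o == GUP_ZERO;
}
```

A multi-page caller cannot collapse the cases this way, because
GUP_PARTIAL leaves it holding pages that it still has to release.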
thanks,
--
John Hubbard
NVIDIA