Message-ID: <afedd0c6-23c0-7e79-3b14-48fffaed7f99@nvidia.com>
Date:   Thu, 3 Feb 2022 17:22:36 -0800
From:   John Hubbard <jhubbard@...dia.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Lukas Bulwahn <lukas.bulwahn@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        David Hildenbrand <david@...hat.com>, Jan Kara <jack@...e.cz>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: Weird code with change "mm/gup: clean up follow_pfn_pte()
 slightly"

On 2/3/22 17:06, Jason Gunthorpe wrote:
> On Thu, Feb 03, 2022 at 04:59:56PM -0800, John Hubbard wrote:
>> On 2/3/22 16:45, Jason Gunthorpe wrote:
>>> On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
>>>> On 2/3/22 05:01, Jason Gunthorpe wrote:
>>>> ...
>>>>>>> In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
>>>>>>> out. However, at the label out, the value of page is not used, but the
>>>>>>> return uses the variables i and ret.
>>>>>>
>>>>>> Yes, I think that the complaint is accurate. The intent of this code is
>>>>>> to return either number of pages so far (i) or ret (which should be zero
>>>>>> in this case), because we are just stopping early, rather than calling
>>>>>> this an actual error.
>>>>>
>>>>> IIRC GUP shouldn't return 0, it should return an error code, not zero.
>>>>>
>>>>> Jason
>>>>
>>>> Errors work for single pages, but GUP is a multi-page API call. If it
>>>> returned an error part way through the list of pages, then callers would
>>>> have no way of knowing how many pages to release.
>>>
>>> Yes, but that is returning a positive page count; I said it should not
>>> return zero.
>>>
>>> When it hits an error with pages already loaded it returns that number
>>> and the caller will then do gup once more with the VA pointing at the
>>> problematic page. Then GUP can return the error code because it has 0
>>> pages on the next iteration.
>>>
>>> It should not return 0 here when it got an error.
>>
>> This is perhaps better API design, but it's not what exists now.
> 
> I think it is what exists today, 0 certainly is not implemented as
> 'need retry' anywhere I found.
> 
> So why do we return 0, if it means an error, instead of returning the
> actual errno?

Well, now returning 0 sounds all wrong, when you put it like that. :)

So, simply this approach?

@@ -1205,8 +1201,15 @@ static long __get_user_pages(struct mm_struct *mm,
  		} else if (PTR_ERR(page) == -EEXIST) {
  			/*
  			 * Proper page table entry exists, but no corresponding
-			 * struct page.
+			 * struct page. If the caller expects **pages to be
+			 * filled in, bail out now, because that can't be done
+			 * for this page.
  			 */
+			if (pages) {
+				ret = PTR_ERR(page);
+				goto out;
+			}
+
  			goto next_page;
  		} else if (IS_ERR(page)) {
  			ret = PTR_ERR(page);

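To make the effect of that hunk on the return value concrete, here is a rough userspace sketch of just the exit convention at the "out" label (page count so far if non-zero, otherwise ret). It is not the kernel code: mock_gup(), enum mock_lookup and the want_pages flag are hypothetical stand-ins, and it assumes that the next_page path counts the skipped page when the caller passed no pages array.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-page lookup result; stands in for the real page lookup. */
enum mock_lookup { MOCK_OK, MOCK_NO_STRUCT_PAGE };

/*
 * Userspace mock of the exit convention only, not __get_user_pages() itself.
 * want_pages plays the role of "pages != NULL" in the hunk above.
 * Returns the number of pages handled so far if non-zero, else ret.
 */
static long mock_gup(const enum mock_lookup *lookups, long nr_pages,
		     int want_pages)
{
	long i = 0;	/* pages handled so far */
	long ret = 0;	/* error to report if nothing was handled */

	assert(nr_pages > 0);

	for (long n = 0; n < nr_pages; n++) {
		if (lookups[n] == MOCK_NO_STRUCT_PAGE && want_pages) {
			/* Cannot fill in a struct page: record errno, stop. */
			ret = -EEXIST;
			goto out;
		}
		/* Normal page, or skipped page when no array was requested. */
		i++;
	}
out:
	return i ? i : ret;
}
```

With this shape, a failure on the very first page yields -EEXIST rather than 0, while a failure after some pages still reports the partial count so the caller knows how many pages to release.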
> 
>> The call sites today handle 0 pages ret value correctly,
> 
> This isn't correct though:
> 
>   	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
>   		return -EFAULT;
> 
> If GUP wanted the caller to permanently fail with -EFAULT, it should
> have directly returned -EFAULT.
> 
> 0 means 'to be retried', whatever that means, and there is no retry
> in the above.
> 
> IOW, the above does not handle a 0 return correctly, according to the
> comment.
> 

I recall seeing several sites that do a quick attempt at one page and
force a -errno failure if anything other than ret==1 occurs. I guess the
good news is that changing GUP to return -errno instead of 0 won't affect
them.
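A small userspace sketch of that kind of single-page call site, to show why the change should be invisible to it. Everything here is a hypothetical stand-in (mock_gup_fn, pin_one_page() and the three fake GUP callbacks are not kernel APIs); the only point is that a "ret != 1" check maps both 0 and -errno to the same failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a GUP call: returns pages pinned or -errno. */
typedef long (*mock_gup_fn)(unsigned long vaddr, long nr_pages);

/* Single-page site: anything other than exactly one page is a failure. */
static int pin_one_page(mock_gup_fn gup, unsigned long vaddr)
{
	assert(gup != NULL);

	if (gup(vaddr, 1) != 1)
		return -EFAULT;
	return 0;
}

/* Fake GUP behaviours: old-style 0, new-style -errno, and success. */
static long gup_returns_zero(unsigned long vaddr, long nr_pages)
{
	(void)vaddr;
	(void)nr_pages;
	return 0;
}

static long gup_returns_errno(unsigned long vaddr, long nr_pages)
{
	(void)vaddr;
	(void)nr_pages;
	return -EEXIST;
}

static long gup_returns_one(unsigned long vaddr, long nr_pages)
{
	(void)vaddr;
	(void)nr_pages;
	return 1;
}
```

Callers written this way see -EFAULT whether the fake GUP returns 0 or -EEXIST, so only sites that treat 0 specially would need a closer look.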


thanks,
-- 
John Hubbard
NVIDIA
