Date:   Thu, 3 Feb 2022 21:06:12 -0400
From:   Jason Gunthorpe <jgg@...dia.com>
To:     John Hubbard <jhubbard@...dia.com>
Cc:     Lukas Bulwahn <lukas.bulwahn@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>, Peter Xu <peterx@...hat.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        David Hildenbrand <david@...hat.com>, Jan Kara <jack@...e.cz>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: Weird code with change "mm/gup: clean up follow_pfn_pte()
 slightly"

On Thu, Feb 03, 2022 at 04:59:56PM -0800, John Hubbard wrote:
> On 2/3/22 16:45, Jason Gunthorpe wrote:
> > On Thu, Feb 03, 2022 at 12:44:57PM -0800, John Hubbard wrote:
> > > On 2/3/22 05:01, Jason Gunthorpe wrote:
> > > ...
> > > > > > In the new branch if (pages), you set page = ERR_PTR(-EFAULT) and goto
> > > > > > out. However, at the label out, the value of page is not used, but the
> > > > > > return uses the variables i and ret.
> > > > > 
> > > > > Yes, I think that the complaint is accurate. The intent of this code is
> > > > > to return either number of pages so far (i) or ret (which should be zero
> > > > > in this case), because we are just stopping early, rather than calling
> > > > > this an actual error.
> > > > 
> > > > IIRC GUP shouldn't return 0, it should return an error code, not zero.
> > > > 
> > > > Jason
> > > 
> > > Errors work for single pages, but GUP is a multi-page API call. If it
> > > returned an error part way through the list of pages, then callers would
> > > have no way of knowing how many pages to release.
> > 
> > Yes, but that is returning a positive error code, I said it should not
> > return zero.
> > 
> > When it hits an error with pages already loaded it returns that number
> > and the caller will then do gup once more with the VA pointing at the
> > problematic page. Then GUP can return the error code because it has 0
> > pages on the next iteration.
> > 
> > It should not return 0 here when it got an error.
> 
> This is perhaps better API design, but it's not what exists now. 

I think it is what exists today; 0 certainly is not implemented as
'need retry' anywhere I found.

So why do we return 0, if it means an error, instead of returning the
actual errno?

> The call sites today handle 0 pages ret value correctly,

This isn't correct though:

 	if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
 		return -EFAULT;

If GUP wanted the caller to permanently fail with -EFAULT, it should
have directly returned -EFAULT.

0 means 'to be retried', whatever that means, and there is no retry
in the above.

IOW, the above does not handle a 0 return correctly, according to the
comment.
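
To make the convention argued for above concrete, here is a minimal
userspace sketch. It is not kernel code: mock_gup() and pin_all() are
hypothetical names invented for illustration, and the "fault" is simulated
by a page index. It assumes only what is described in this thread: a short
positive return when some pages were already pinned, and the errno on the
follow-up call that starts at the problematic page.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for GUP (an assumption, not the real API):
 * "pins" nr pages starting at page index start, where page index
 * bad_page faults. Returns the number of pages pinned before the
 * fault, or -EFAULT if the very first page faults. It never returns
 * 0 for nr > 0.
 */
static long mock_gup(long start, long nr, long bad_page)
{
	long i;

	assert(nr > 0);
	for (i = 0; i < nr; i++) {
		if (start + i == bad_page)
			return i ? i : -EFAULT;
	}
	return nr;
}

/*
 * Hypothetical caller loop: after a short return, call again with the
 * start pointing at the problematic page, so the second call can report
 * the errno. *pinned tells the caller how many pages to release.
 */
static long pin_all(long nr, long bad_page, long *pinned)
{
	long done = 0;

	while (done < nr) {
		long ret = mock_gup(done, nr - done, bad_page);

		if (ret < 0) {
			*pinned = done;
			return ret;
		}
		done += ret;
	}
	*pinned = done;
	return 0;
}
```

In this sketch a 0 return from mock_gup() for nr > 0 would make pin_all()
spin forever without progress, which is one way to see why a bare 0 is
awkward for callers unless its meaning is defined.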

Jason
