linux-kernel - Re: [PATCH v2 1/2] mm/gup: stop leaking pinned pages in low memory conditions

Message-ID: <ceeb9dd7-bef9-40c8-aead-c1325f1e3a3d@nvidia.com>
Date: Fri, 18 Oct 2024 10:46:09 -0700
From: John Hubbard <jhubbard@...dia.com>
To: David Hildenbrand <david@...hat.com>,
 Andrew Morton <akpm@...ux-foundation.org>
Cc: LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
 Alistair Popple <apopple@...dia.com>, Shigeru Yoshida <syoshida@...hat.com>,
 Jason Gunthorpe <jgg@...dia.com>, Minchan Kim <minchan@...nel.org>,
 Pasha Tatashin <pasha.tatashin@...een.com>
Subject: Re: [PATCH v2 1/2] mm/gup: stop leaking pinned pages in low memory
 conditions

On 10/18/24 12:47 AM, David Hildenbrand wrote:
> On 18.10.24 03:17, John Hubbard wrote:
>> If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM)
>> family of functions, and requests "too many" pages, then the call will
>> erroneously leave pages pinned. This is visible in user space as an
>> actual memory leak.
>>
>> Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls
>> to exhaust memory.
>>
>> The root cause of the problem is this sequence, within
>> __gup_longterm_locked():
>>
>>      __get_user_pages_locked()
>>      rc = check_and_migrate_movable_pages()
>>
>> ...which gets retried in a loop. The loop error handling is incomplete,
>> clearly due to a somewhat unusual and complicated tri-state error API.
>> But anyway, if -ENOMEM, or in fact, any unexpected error is returned
>> from check_and_migrate_movable_pages(), then __gup_longterm_locked()
>> happily returns the error, while leaving the pages pinned.
> 
> Sorry for another comment, I am taking my time to look into the code 
> again in more detail ...
> 
> migrate_longterm_unpinnable_folios() will always unpin all pages: no 
> matter which error it returns.
> 
> a) If it returns -EAGAIN, it unpinned all folios
> b) If it returns any other error, it first calls unpin_folios().
> 
> So shouldn't the fix just be in check_and_migrate_movable_pages()?

OK, sure. It's a little odd from a layering point of view, because the callee
"helpfully" unpins the pages for you (wheee!), but the updated comment
highlights that, at least.

And actually this whole thing of "pin the pages, just for a short time, even
though you're not allowed to" is partly why this area is so entertaining.
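
To make the contract concrete, here is a toy userspace model of the
pin/check/retry loop. It is only a sketch: struct toy_page, struct toy_script
and every toy_*() function are hypothetical stand-ins invented for this
illustration, not kernel code, and the "script" just replays canned return
values instead of doing any migration. The one thing it tries to show is that
the callee drops all pins before returning any non-zero value, so the caller
can either retry on -EAGAIN or propagate the error without cleaning up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a page: just a pin counter. Not a kernel type. */
struct toy_page { int pins; };

/* Canned return values for the hypothetical callee, one per call. */
struct toy_script { const int *rets; size_t pos; };

static void toy_pin(struct toy_page *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		p[i].pins++;
}

static void toy_unpin(struct toy_page *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(p[i].pins > 0);	/* catch unbalanced unpins */
		p[i].pins--;
	}
}

/*
 * Hypothetical callee that follows the contract discussed above:
 *   0        -> pages stay pinned
 *   non-zero -> every page was unpinned before returning
 */
static int toy_check_and_migrate(struct toy_script *s,
				 struct toy_page *p, size_t n)
{
	int ret = s->rets[s->pos++];

	if (ret != 0)
		toy_unpin(p, n);
	return ret;
}

/* Caller: re-pin and retry on -EAGAIN, propagate anything else as-is. */
static int toy_pin_longterm(struct toy_script *s,
			    struct toy_page *p, size_t n)
{
	int ret;

	do {
		toy_pin(p, n);
		ret = toy_check_and_migrate(s, p, n);
	} while (ret == -EAGAIN);

	return ret;
}
```

With that shape, the caller has no cleanup of its own to forget: whatever
non-zero value comes back, the pin counts are already balanced.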

> 
> diff --git a/mm/gup.c b/mm/gup.c
> index a82890b46a36..81fc8314e687 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2403,8 +2403,9 @@ static int migrate_longterm_unpinnable_folios(
>    * -EAGAIN. The caller should re-pin the entire range with FOLL_PIN and then
>    * call this routine again.
>    *
> - * If an error other than -EAGAIN occurs, this indicates a migration failure.
> - * The caller should give up, and propagate the error back up the call stack.
> + * If an error occurs, all folios are unpinned. If an error other than
> + * -EAGAIN occurs, this indicates a migration failure. The caller should give
> + * up, and propagate the error back up the call stack.
>    *
>    * If everything is OK and all folios in the range are allowed to be pinned,
>    * then this routine leaves all folios pinned and returns zero for success.
> @@ -2437,8 +2438,10 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
>          long i, ret;
> 
>          folios = kmalloc_array(nr_pages, sizeof(*folios), GFP_KERNEL);
> -       if (!folios)
> +       if (!folios) {
> +               unpin_user_pages(pages, nr_pages);
>                  return -ENOMEM;
> +       }
> 
>          for (i = 0; i < nr_pages; i++)
>                  folios[i] = page_folio(pages[i]);
> 
> 
> 
> Then, check_and_migrate_movable_pages() will never return an error while
> leaving folios pinned.
> 
> 
> If check_and_migrate_movable_pages() -> check_and_migrate_movable_folios()
> returns "0", all folios remain pinned and no harm is done.
> 
> 
> Consequently, I think patch #2 is not really required, because it doesn't
> perform the temporary allocation that could fail with -ENOMEM.
> 

Yes!
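
For anyone following along, the allocation-failure path in that diff can be
modeled in userspace roughly like this. This is a sketch under assumptions:
struct demo_page, demo_unpin() and demo_check_pages() are made-up names, the
fail_alloc flag merely simulates kmalloc_array() returning NULL, and the toy
accepts every page rather than checking or migrating anything.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a pinned page: only tracks a pin count. */
struct demo_page { int pins; };

static void demo_unpin(struct demo_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(pages[i]->pins > 0);	/* catch unbalanced unpins */
		pages[i]->pins--;
	}
}

/*
 * Hypothetical model of the page-array wrapper: it needs a temporary
 * array, and if that allocation fails it must drop the caller's pins
 * before returning -ENOMEM. Assumes n > 0.
 */
static long demo_check_pages(struct demo_page **pages, size_t n,
			     bool fail_alloc)
{
	struct demo_page **scratch;

	scratch = fail_alloc ? NULL : calloc(n, sizeof(*scratch));
	if (!scratch) {
		demo_unpin(pages, n);	/* the fix: no pins left behind */
		return -ENOMEM;
	}

	for (size_t i = 0; i < n; i++)
		scratch[i] = pages[i];

	/* Real code would check and migrate here; the toy accepts all. */
	free(scratch);
	return 0;
}
```

On success the pins are untouched; on the simulated allocation failure every
pin is released before the error is returned, which matches the updated
comment in the diff.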

> 
> Sorry for taking a closer look only now ...
> 

It's all still in review, so the timing is perfectly fine. I really
appreciate the closer look, it's definitely making things better.


thanks,
-- 
John Hubbard