[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87ttdbofpw.fsf@nvdebian.thelocal>
Date: Thu, 17 Oct 2024 09:13:32 +1100
From: Alistair Popple <apopple@...dia.com>
To: John Hubbard <jhubbard@...dia.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, LKML
<linux-kernel@...r.kernel.org>, linux-mm@...ck.org, Shigeru Yoshida
<syoshida@...hat.com>, David Hildenbrand <david@...hat.com>, Jason
Gunthorpe <jgg@...dia.com>, Minchan Kim <minchan@...nel.org>, Pasha
Tatashin <pasha.tatashin@...een.com>
Subject: Re: [PATCH] mm/gup: stop leaking pinned pages in low memory conditions
John Hubbard <jhubbard@...dia.com> writes:
> If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM)
> family of functions, and requests "too many" pages, then the call will
> erroneously leave pages pinned. This is visible in user space as an
> actual memory leak.
>
> Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls
> to exhaust memory.
>
> The root cause of the problem is this sequence, within
> __gup_longterm_locked():
>
> __get_user_pages_locked()
> rc = check_and_migrate_movable_pages()
>
> ...which gets retried in a loop. The loop error handling is incomplete,
> clearly due to a somewhat unusual and complicated tri-state error API.
> But anyway, if -ENOMEM, or in fact, any unexpected error is returned
> from check_and_migrate_movable_pages(), then __gup_longterm_locked()
> happily returns the error, while leaving the pages pinned.
>
> In the failed case, which is an app that requests (via a device driver)
> 30720000000 bytes to be pinned, and then exits, I see this:
>
> $ grep foll /proc/vmstat
> nr_foll_pin_acquired 7502048
> nr_foll_pin_released 2048
>
> And after applying this patch, it returns to balanced pins:
>
> $ grep foll /proc/vmstat
> nr_foll_pin_acquired 7502048
> nr_foll_pin_released 7502048
>
> Fix this by unpinning the pages that __get_user_pages_locked() has
> pinned, in such error cases.
>
> Fixes: 24a95998e9ba ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes")
> Cc: Alistair Popple <apopple@...dia.com>
> Cc: Shigeru Yoshida <syoshida@...hat.com>
> Cc: David Hildenbrand <david@...hat.com>
> Cc: Jason Gunthorpe <jgg@...dia.com>
> Cc: Minchan Kim <minchan@...nel.org>
> Cc: Pasha Tatashin <pasha.tatashin@...een.com>
> Signed-off-by: John Hubbard <jhubbard@...dia.com>
> ---
> mm/gup.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index a82890b46a36..24acf53c8294 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2492,6 +2492,17 @@ static long __gup_longterm_locked(struct mm_struct *mm,
>
> /* FOLL_LONGTERM implies FOLL_PIN */
> rc = check_and_migrate_movable_pages(nr_pinned_pages, pages);
> +
> + /*
> + * The __get_user_pages_locked() call happens before we know
> + * that whether it's possible to successfully complete the whole
> + * operation. To compensate for this, if we get an unexpected
> + * error (such as -ENOMEM) then we must unpin everything, before
> + * erroring out.
> + */
> + if (rc != -EAGAIN && rc != 0)
> + unpin_user_pages(pages, nr_pinned_pages);
I know this is going to fall out of the loop in the next line but given
this is an error handling case it would be nice if this was made
explicit here with a break statement. It looks correct to me though so:
Reviewed-by: Alistair Popple <apopple@...dia.com>
> +
> } while (rc == -EAGAIN);
> memalloc_pin_restore(flags);
> return rc ? rc : nr_pinned_pages;
Powered by blists - more mailing lists