lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87zggk5njh.fsf@nvdebian.thelocal>
Date:   Thu, 04 Aug 2022 19:57:21 +1000
From:   Alistair Popple <apopple@...dia.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        jgg@...dia.com, minchan@...nel.org, linux-kernel@...r.kernel.org,
        jhubbard@...dia.com, pasha.tatashin@...een.com
Subject: Re: [PATCH v2] mm/gup.c: Simplify and fix
 check_and_migrate_movable_pages() return codes


David Hildenbrand <david@...hat.com> writes:

> On 04.08.22 02:12, Alistair Popple wrote:
>>
>> Andrew Morton <akpm@...ux-foundation.org> writes:
>>
>>> On Tue,  2 Aug 2022 10:30:12 +1000 Alistair Popple <apopple@...dia.com> wrote:
>>>
>>>> When pinning pages with FOLL_LONGTERM check_and_migrate_movable_pages()
>>>> is called to migrate pages out of zones which should not contain any
>>>> longterm pinned pages.
>>>>
>>>> When migration succeeds all pages will have been unpinned so pinning
>>>> needs to be retried. This is indicated by returning zero. When all pages
>>>> are in the correct zone the number of pinned pages is returned.
>>>>
>>>> However migration can also fail, in which case pages are unpinned and
>>>> -ENOMEM is returned. However if the failure was due to not being unable
>>>> to isolate a page zero is returned. This leads to indefinite looping in
>>>> __gup_longterm_locked().
>>>>
>>>> Fix this by simplifying the return codes such that zero indicates all
>>>> pages were successfully pinned in the correct zone while errors indicate
>>>> either pages were migrated and pinning should be retried or that
>>>> migration has failed and therefore the pinning operation should fail.
>>>>
>>>> This fixes the indefinite looping on page isolation failure by failing
>>>> the pin operation instead of retrying indefinitely.
>>>>
>>>
>>> Are we able to identify a Fixes: for this?  Presumably something in the
>>> series "Add MEMORY_DEVICE_COHERENT for coherent device memory mapping"?
>>
>> It seems the infinite loop was desired behaviour so I will re-spin this
>> as a pure clean-up.
>>
>
> How can the infinite loop trigger when we allow longterm-pinning the
> shared zeropage? (note: disallowing that for now was a bug)

Right, I don't know of any other triggers so based on the discussion
Pasha pointed me at I think the infinite loop is probably fine unless
there are other bugs.

Apologies I should have copied you on the new version which is just a
clean-up now -
https://lore.kernel.org/linux-mm/20220804032241.859891-1-apopple@nvidia.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ