Message-ID: <b7bb60b6-986d-02c5-e58a-d249c0185092@nvidia.com>
Date:   Wed, 14 Jul 2021 12:43:38 -0700
From:   John Hubbard <jhubbard@...dia.com>
To:     Matthew Wilcox <willy@...radead.org>,
        Miaohe Lin <linmiaohe@...wei.com>
CC:     Michal Hocko <mhocko@...e.com>, Yu Zhao <yuzhao@...gle.com>,
        <akpm@...ux-foundation.org>, <hannes@...xchg.org>,
        <vbabka@...e.cz>, <axboe@...nel.dk>, <iamjoonsoo.kim@....com>,
        <alexs@...nel.org>, <apopple@...dia.com>, <minchan@...nel.org>,
        <david@...hat.com>, <shli@...com>, <hillf.zj@...baba-inc.com>,
        <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/5] mm/vmscan: put the redirtied MADV_FREE pages back to
 anonymous LRU list

On 7/14/21 4:48 AM, Matthew Wilcox wrote:
> On Wed, Jul 14, 2021 at 07:36:57PM +0800, Miaohe Lin wrote:
>> On 2021/7/13 21:34, Matthew Wilcox wrote:
>>> On Tue, Jul 13, 2021 at 09:13:51PM +0800, Miaohe Lin wrote:
>>>>>> When the MADV_FREE pages are redirtied before they could be reclaimed, the pages
>>>>>> should be put back to anonymous LRU list by setting SwapBacked flag, thus the
>>>>>> pages will be reclaimed in normal swapout way.
>>>>>
>>>>> Agreed. But the question is why this needs an explicit handling here
>>>>> when we already do handle this case when trying to unmap the page.
>>>>
>>>> This makes me think more. It seems even the page_ref_freeze call is guaranteed to
>>>> succeed, as no one can grab the page refcount after the page is successfully unmapped.
>>>
>>> NO!  This is wrong.  Every page can have its refcount speculatively raised
>>> (and then lowered).  The two prime candidates for this are lockless GUP
>>> and page cache lookups, but there can be others too.
>>>
>>
>> Many thanks for pointing this out. My oversight! Sorry!
>> So, it seems lockless GUP can redirty the MADV_FREE page. But is it ok to just release
>> a redirtied MADV_FREE page? Because we hold the last reference here and the page will
>> be freed anyway...
> 
> I don't see how lockless GUP can redirty the page.  It can grab the
> refcount, thus making the refcount here two.  Then the call to freeze
> here fails and the page stays on the list.  But the lockless GUP checks
> the page is still in the page table (and discovers it isn't, so releases
> the reference count).  Am I missing a path that lets lockless GUP dirty
> the page?
> 

If a device driver pins some pages using GUP, and the device then uses DMA
to write to those pages, then the pages could get dirtied that way. That
story is part of the reasoning that led to creating pin_user_pages(), which,
by the way, does not yet fully solve that case.

Basically, though, unless a non-CPU device has access to the page, it's
hard to see how GUP itself can lead to a page getting dirtied.

thanks,
-- 
John Hubbard
NVIDIA
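
The refcount sequence Matthew describes in the quoted text (a speculative
reference raises the count to two, the freeze fails, and the reference is
then dropped) can be sketched as a small userspace model. This is not kernel
code and is not taken from the thread: struct page_model and every *_model
function are hypothetical stand-ins, written with C11 atomics, that only
imitate the compare-and-swap behaviour of page_ref_freeze() and the
"increment unless zero" behaviour of get_page_unless_zero().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the refcount is modeled. */
struct page_model {
    atomic_int refcount;
};

/* Imitates page_ref_freeze(): succeeds only if the count equals
 * `expected`, in which case the count is atomically set to 0. */
static bool page_ref_freeze_model(struct page_model *p, int expected)
{
    int old = expected;
    return atomic_compare_exchange_strong(&p->refcount, &old, 0);
}

/* Imitates get_page_unless_zero(): takes a reference unless the
 * count is already 0 (frozen or being freed). */
static bool get_page_unless_zero_model(struct page_model *p)
{
    int old = atomic_load(&p->refcount);
    while (old != 0) {
        /* On failure, `old` is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drops one reference; the count must have been positive. */
static void put_page_model(struct page_model *p)
{
    int old = atomic_fetch_sub(&p->refcount, 1);
    assert(old > 0);
    (void)old;
}

/* Reclaim tries to freeze while a speculative reference is held.
 * Returns whether the freeze succeeded. */
static bool freeze_with_speculative_ref(void)
{
    struct page_model page;
    atomic_init(&page.refcount, 1);                /* reclaim's reference */
    bool got = get_page_unless_zero_model(&page);  /* count: 1 -> 2 */
    bool frozen = page_ref_freeze_model(&page, 1); /* sees 2, expects 1 */
    if (got)
        put_page_model(&page);                     /* speculative ref dropped */
    return frozen;
}

/* Same sequence, but the speculative reference is dropped first. */
static bool freeze_after_backoff(void)
{
    struct page_model page;
    atomic_init(&page.refcount, 1);
    if (get_page_unless_zero_model(&page))
        put_page_model(&page);                     /* count back to 1 */
    return page_ref_freeze_model(&page, 1);
}
```

In this model, freeze_with_speculative_ref() reports a failed freeze, which
corresponds to the page staying on the list, while freeze_after_backoff()
reports success. Neither path writes to the page, which matches the point
above that a refcount bump alone does not dirty anything.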
