linux-kernel - Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG


Message-ID: <C469C1C9-1362-4DD3-9106-2765D94C6350@vmware.com>
Date:   Sat, 18 Dec 2021 05:23:28 +0000
From:   Nadav Amit <namit@...are.com>
To:     Matthew Wilcox <willy@...radead.org>
CC:     Linus Torvalds <torvalds@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Hugh Dickins <hughd@...gle.com>,
        David Rientjes <rientjes@...gle.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        John Hubbard <jhubbard@...dia.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Yang Shi <shy828301@...il.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
        Michal Hocko <mhocko@...nel.org>,
        Rik van Riel <riel@...riel.com>,
        Roman Gushchin <guro@...com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Peter Xu <peterx@...hat.com>,
        Donald Dutile <ddutile@...hat.com>,
        Christoph Hellwig <hch@....de>,
        Oleg Nesterov <oleg@...hat.com>, Jan Kara <jack@...e.cz>,
        Linux-MM <linux-mm@...ck.org>,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via
 FAULT_FLAG_UNSHARE (!hugetlb)



> On Dec 17, 2021, at 9:03 PM, Matthew Wilcox <willy@...radead.org> wrote:
> 
> On Sat, Dec 18, 2021 at 04:52:13AM +0000, Nadav Amit wrote:
>> Take for instance memcached and assume you overcommit memory with a very fast
>> swap (e.g., pmem, zram, perhaps even slower). Now, it turns out memcached
>> often accesses a page first for read and shortly after for write. I
>> encountered, in a similar scenario, that the page reference that
>> lru_cache_add() takes during the first faultin event (for read), causes a COW
>> on a write page-fault that happens shortly after [1]. So on memcached I
>> assume this would also trigger frequent unnecessary COWs.
> 
> Why are we comparing page_count() against 1 and not 1 + PageLRU(page)?
> Having a reference from the LRU should be expected.  Is it because of
> some race that we'd need to take the page lock to protect against?
> 

IIUC, the reference on the page is taken before SetPageLRU() is called,
and it is dropped only afterwards:

lru_add_drain()
 lru_add_drain_cpu()
  __pagevec_lru_add()
   __pagevec_lru_add_fn()
    SetPageLRU()		<- sets the LRU flag
   release_pages()		<- drops the reference

This is one scenario I encountered. There might be others that take transient
references on pages and so cause unnecessary COWs. I think David and Andrea
had a few in mind. To trigger a COW bug I once used mlock()/munlock(), which
take such a transient reference. But who knows how many other cases exist
(KSM? vmscan?).
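
To make the ordering above concrete, here is a toy user-space model in C. It
is not kernel code: struct toy_page and all toy_* helpers are hypothetical
names made up for illustration, and the model assumes only what is described
above, namely that the batch reference is taken at fault-in, that SetPageLRU()
happens at drain time, and that the reference is dropped right after it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct toy_page {
	int refcount;	/* models page_count() */
	bool lru;	/* models PageLRU() */
};

/*
 * Read fault: the mapping holds one reference, and the LRU-add batch
 * holds a second, transient one until the batch is drained.
 */
static void toy_read_fault(struct toy_page *p)
{
	p->refcount = 1;	/* reference held by the mapping */
	p->refcount++;		/* transient reference held by the batch */
	p->lru = false;		/* LRU flag is not set until the drain */
}

/* Drain: same order as in the call chain, SetPageLRU() then release. */
static void toy_lru_add_drain(struct toy_page *p)
{
	p->lru = true;		/* models SetPageLRU() */
	p->refcount--;		/* models release_pages() */
}

/* Reuse test on a write fault: compare the count against 1. */
static bool toy_can_reuse_strict(const struct toy_page *p)
{
	return p->refcount == 1;
}

/* Variant from the question: compare against 1 + PageLRU(page). */
static bool toy_can_reuse_lru_adjusted(const struct toy_page *p)
{
	return p->refcount == 1 + (p->lru ? 1 : 0);
}
```

In this model, a write fault that arrives after toy_read_fault() but before
toy_lru_add_drain() sees a count of 2 with the LRU flag still clear, so both
checks refuse to reuse the page and a COW would follow. After the drain the
count is back to 1 with the flag set, so the strict check passes while the
1 + PageLRU variant does not.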