Message-ID: <YKOR/8LzEaOgCvio@dhcp22.suse.cz>
Date:   Tue, 18 May 2021 12:07:59 +0200
From:   Michal Hocko <mhocko@...e.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Arnd Bergmann <arnd@...db.de>,
        Oscar Salvador <osalvador@...e.de>,
        Matthew Wilcox <willy@...radead.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Minchan Kim <minchan@...nel.org>, Jann Horn <jannh@...gle.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Dave Hansen <dave.hansen@...el.com>,
        Hugh Dickins <hughd@...gle.com>,
        Rik van Riel <riel@...riel.com>,
        "Michael S . Tsirkin" <mst@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Richard Henderson <rth@...ddle.net>,
        Ivan Kokshaysky <ink@...assic.park.msu.ru>,
        Matt Turner <mattst88@...il.com>,
        Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
        "James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
        Helge Deller <deller@....de>, Chris Zankel <chris@...kel.net>,
        Max Filippov <jcmvbkbc@...il.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Peter Xu <peterx@...hat.com>,
        Rolf Eike Beer <eike-kernel@...tec.de>,
        linux-alpha@...r.kernel.org, linux-mips@...r.kernel.org,
        linux-parisc@...r.kernel.org, linux-xtensa@...ux-xtensa.org,
        linux-arch@...r.kernel.org, Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH resend v2 2/5] mm/madvise: introduce
 MADV_POPULATE_(READ|WRITE) to prefault page tables

[sorry for a long silence on this]

On Tue 11-05-21 10:15:31, David Hildenbrand wrote:
[...]

Thanks for the extensive usecase description. That is certainly useful
background. I am sorry to bring this up again but I am still not
convinced that the READ/WRITE variants are the best interface.
 
> While the use case for MADV_POPULATE_WRITE is fairly obvious (i.e.,
> preallocate memory and prefault page tables for VMs), one issue is that
> whenever we prefault pages writable, the pages have to be marked dirty,
> because the CPU could dirty them any time. While not a real problem for
> hugetlbfs or dax/pmem, it can be a problem for shared file mappings: each
> page will be marked dirty and has to be written back later when evicting.
> 
> MADV_POPULATE_READ allows for optimizing this scenario: Pre-read a whole
> mapping from backend storage without marking it dirty, such that eviction
> won't have to write it back. As discussed above, shared file mappings
> might require an explicit fallocate() upfront to achieve
> preallocation+prepopulation.

This means that you want to have two different uses depending on the
underlying mapping type. MADV_POPULATE_READ seems rather weak for
anonymous/private mappings. Memory backed by zero pages seems rather
unhelpful as the page fault would need to do all the heavy lifting anyway.
Or is there any actual usecase when this is desirable?
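
For illustration only, here is a minimal userspace sketch of how the two
proposed advice values would be requested on an address range. The helper
name prefault_range() is hypothetical, and the fallback values 22 and 23 are
an assumption for headers that do not define the constants; check the
kernel's uapi headers for your architecture. On a kernel that rejects the
advice with EINVAL, the sketch falls back to touching one byte per page,
which assumes the range is a valid mapping with suitable protection.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: fallback values for headers that predate this series. */
#ifndef MADV_POPULATE_READ
#define MADV_POPULATE_READ 22
#endif
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: prefault the page tables for [addr, addr + len).
 * writable != 0 requests MADV_POPULATE_WRITE, otherwise MADV_POPULATE_READ.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int prefault_range(void *addr, size_t len, int writable)
{
    int advice = writable ? MADV_POPULATE_WRITE : MADV_POPULATE_READ;

    if (madvise(addr, len, advice) == 0)
        return 0;
    if (errno != EINVAL)
        return -1;

    /* The kernel does not know the advice: touch one byte per page. */
    long page = sysconf(_SC_PAGESIZE);
    volatile unsigned char *p = addr;

    for (size_t off = 0; off < len; off += (size_t)page) {
        unsigned char c = p[off];   /* read fault */
        if (writable)
            p[off] = c;             /* write fault, contents unchanged */
    }
    return 0;
}
```

With this shape the caller has to pick the mode per mapping type, e.g.
prefault_range(buf, len, 1) for a private anonymous buffer and
prefault_range(map, len, 0) for a shared file mapping that should not be
dirtied.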

So the split into these two modes seems more like shortcomings of the
gup interface bubbling up to the userspace interface. I would expect
userspace to care only about pre-faulting the address range, no matter
what the backing storage is.

Or do I still misunderstand all the usecases?
-- 
Michal Hocko
SUSE Labs
