Message-ID: <YRqPJ6c8OQpD6HD5@casper.infradead.org>
Date:   Mon, 16 Aug 2021 17:15:35 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Khalid Aziz <khalid.aziz@...cle.com>
Cc:     David Hildenbrand <david@...hat.com>,
        "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)" 
        <longpeng2@...wei.com>, Steven Sistare <steven.sistare@...cle.com>,
        Anthony Yznaga <anthony.yznaga@...cle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Gonglei (Arei)" <arei.gonglei@...wei.com>
Subject: Re: [RFC PATCH 0/5] madvise MADV_DOEXEC

On Mon, Aug 16, 2021 at 10:06:47AM -0600, Khalid Aziz wrote:
> On 8/16/21 9:59 AM, Matthew Wilcox wrote:
> > On Mon, Aug 16, 2021 at 05:01:44PM +0200, David Hildenbrand wrote:
> > > On 16.08.21 16:40, Matthew Wilcox wrote:
> > > > On Mon, Aug 16, 2021 at 04:33:09PM +0200, David Hildenbrand wrote:
> > > > > > > I did not follow why we have to play games with MAP_PRIVATE, and having
> > > > > > > private anonymous pages shared between processes that don't COW, introducing
> > > > > > > new syscalls etc.
> > > > > > 
> > > > > > It's not about SHMEM, it's about file-backed pages on regular
> > > > > > filesystems.  I don't want to have XFS, ext4 and btrfs all with their
> > > > > > own implementations of ARCH_WANT_HUGE_PMD_SHARE.
> > > > > 
> > > > > Let me ask this way: why do we have to play such games with MAP_PRIVATE?
> > > > 
> > > > : Mappings within this address range behave as if they were shared
> > > > : between threads, so a write to a MAP_PRIVATE mapping will create a
> > > > : page which is shared between all the sharers.
> > > > 
> > > > If so, that's a misunderstanding, because there are no games being played.
> > > > What Khalid's saying there is that because the page tables are already
> > > > shared for that range of address space, the COW of a MAP_PRIVATE will
> > > > create a new page, but that page will be shared between all the sharers.
> > > > The second write to a MAP_PRIVATE page (by any of the sharers) will not
> > > > create a COW situation.  Just like if all the sharers were threads of
> > > > the same process.
> > > > 
> > > 
> > > It actually seems to be just like I understood it. We'll have multiple
> > > processes share anonymous pages writable, even though they are not using
> > > shared memory.
> > > 
> > > IMHO, sharing page tables to optimize for something kernel-internal (page
> > > table consumption) should be completely transparent to user space. Just like
> > > ARCH_WANT_HUGE_PMD_SHARE currently is unless I am missing something
> > > important.
> > > 
> > > The VM_MAYSHARE check in want_pmd_share()->vma_shareable() makes me assume
> > > that we really only optimize for MAP_SHARED right now, never for
> > > MAP_PRIVATE.
> > 
> > It's definitely *not* about being transparent to userspace.  It's about
> > giving userspace new functionality where multiple processes can choose
> > to share a portion of their address space with each other.  What any
> > process changes in that range changes, every sharing process sees.
> > mmap(), munmap(), mprotect(), mremap(), everything.
> > 
> 
> Exactly and to further elaborate, once a process calls mshare() to declare
> its intent to share PTEs for a range of address and another process accepts
> that sharing by calling mshare() itself, the two (or more) processes have
> agreed to share PTEs for that entire address range. A MAP_PRIVATE mapping in
> this address range goes against the original intent of sharing and what we
> are saying is the original intent of sharing takes precedence in case of
> this conflict.

I don't know that it's against the original intent ... I think
MAP_PRIVATE in this context means "Private to this process and every
process sharing this chunk of address space".  So a store doesn't go
through to the page cache, as it would with MAP_SHARED, but it is
visible to the other processes sharing these page tables.
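
To make the page cache half of that concrete, here is a minimal sketch using
only today's mmap() semantics. It does not use mshare(), which is only a
proposal in this thread, so it cannot show the cross-process visibility
part. It maps one temporary file twice in a single process, once
MAP_PRIVATE and once MAP_SHARED, and records what each view and the file
itself see after each store. The names demo_result and
demo_private_vs_shared, and the /tmp path template, are made up for
illustration.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical result struct: what each view observed. */
struct demo_result {
    char private_view;        /* MAP_PRIVATE mapping after its own store   */
    char shared_view;         /* MAP_SHARED mapping after the private store */
    char file_after_private;  /* file byte after the private store          */
    char file_after_shared;   /* file byte after the shared store           */
};

/* Returns 0 on success, -1 if any setup step fails. */
int demo_private_vs_shared(struct demo_result *out)
{
    char path[] = "/tmp/map_private_demo_XXXXXX";
    char *priv = MAP_FAILED, *shrd = MAP_FAILED;
    long page = sysconf(_SC_PAGESIZE);
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);  /* the open fd keeps the file alive until close() */

    /* One page of file data, first byte 'a'. */
    if (page <= 0 || ftruncate(fd, page) != 0 || pwrite(fd, "a", 1, 0) != 1)
        goto done;

    priv = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    shrd = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (priv == MAP_FAILED || shrd == MAP_FAILED)
        goto done;

    /* Store through MAP_PRIVATE: COW gives this mapping its own page. */
    priv[0] = 'P';
    out->private_view = priv[0];
    out->shared_view = shrd[0];  /* still the page cache contents */
    if (pread(fd, &out->file_after_private, 1, 0) != 1)
        goto done;

    /* Store through MAP_SHARED: this one does reach the page cache. */
    shrd[0] = 'S';
    if (pread(fd, &out->file_after_shared, 1, 0) != 1)
        goto done;

    rc = 0;
done:
    if (priv != MAP_FAILED)
        munmap(priv, page);
    if (shrd != MAP_FAILED)
        munmap(shrd, page);
    close(fd);
    return rc;
}
```

On Linux I'd expect the private store to show up only in the private view,
with the shared view and the file still holding 'a', and the shared store
to show up in the file. Under the proposal, the private page would
additionally be seen by every process sharing those page tables, which this
sketch has no way to exercise.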
