lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZIdG7rMDY6649hSp@x1n>
Date:   Mon, 12 Jun 2023 12:25:18 -0400
From:   Peter Xu <peterx@...hat.com>
To:     Khalid Aziz <khalid.aziz@...cle.com>
Cc:     akpm@...ux-foundation.org, willy@...radead.org,
        markhemm@...glemail.com, viro@...iv.linux.org.uk, david@...hat.com,
        mike.kravetz@...cle.com, andreyknvl@...il.com,
        dave.hansen@...el.com, luto@...nel.org, brauner@...nel.org,
        arnd@...db.de, ebiederm@...ssion.com, catalin.marinas@....com,
        linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, mhiramat@...nel.org, rostedt@...dmis.org,
        vasily.averin@...ux.dev, xhao@...ux.alibaba.com, pcc@...gle.com,
        neilb@...e.de, maz@...nel.org
Subject: Re: [PATCH RFC v2 0/4]  Add support for sharing page tables across
 processes (Previously mshare)

On Wed, Apr 26, 2023 at 10:49:47AM -0600, Khalid Aziz wrote:
> This patch series adds a new flag to mmap() call - MAP_SHARED_PT.

Since hugetlb has this, it'll be very helpful if you can provide a
comparison between this approach and hugetlb's - especially on the
differences - and reasonings about those.

Merging anything like this definitely should also pave way for hugetlb's
future, so it even seems to be an "requirement" of such patchset even
though implicitily..  considering the "hotness" that hugetlb recently has
on refactoring demand (if not a rewrite).

Some high level questions:

  - Why mmap() flag?

    For this one, I agree it should be opt-in - sharing pgtable definitely
    means sharing of a lot of privileges operating on current mm, so one
    should be aware and be prepared to be messed up.

    IIUC hugetlb doesn't do this but instead when something "racy" happens
    itt just unshares by default.  To me opt-in makes more sense if to
    start from zero, because I don't think by default a process wants to
    leak its mm to others.

    I think you mentioned allowing pgtable to be shared even for mprotect()
    from one MM then all MMs can see; but if so then DONTNEED should really
    do the same - when one MM DONTNEED it it should go away for all.  It
    doesn't make a lot of sense to me to treat it differently with a
    DONTNEED refcount anywhere..

  - Can guest MM opt-out?

    Should we allow guest MM to opt-out when they want?  It sounds like a
    good thing to have to me, especially for file that sounds like as
    simple as zapping the pgtable.  But then mmap flag will stop working
    iiuc, so goes back to that question (e.g. what about madvise or prctl?).

  - Why mm_struct to represent shared pgtable?

    IIUC hugetlb used the pgtable page itself plus some refcounting (the
    refcounting is racy with gup-fast that Jann used to point out, but
    that's another story..).  My question is do you think that won't work?
    Are there reasons to explain?  Why mm_struct is the structure you chose
    for representing a shared pgtable?  Why per-file?

Thanks,

-- 
Peter Xu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ