Message-ID: <CAJuCfpG5OArpzOBgsy7DvrL4m-Z97SgyrdbnAk8sYogqdwvWEw@mail.gmail.com>
Date:   Thu, 28 Sep 2023 12:47:21 -0700
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     David Hildenbrand <david@...hat.com>, Jann Horn <jannh@...gle.com>,
        akpm@...ux-foundation.org, viro@...iv.linux.org.uk,
        brauner@...nel.org, shuah@...nel.org, aarcange@...hat.com,
        lokeshgidra@...gle.com, hughd@...gle.com, mhocko@...e.com,
        axelrasmussen@...gle.com, rppt@...nel.org, willy@...radead.org,
        Liam.Howlett@...cle.com, zhangpeng362@...wei.com,
        bgeffon@...gle.com, kaleshsingh@...gle.com, ngeoffray@...gle.com,
        jdduke@...gle.com, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-kselftest@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v2 2/3] userfaultfd: UFFDIO_REMAP uABI

On Thu, Sep 28, 2023 at 11:34 AM Peter Xu <peterx@...hat.com> wrote:
>
> On Thu, Sep 28, 2023 at 07:51:18PM +0200, David Hildenbrand wrote:
> > On 28.09.23 19:21, Peter Xu wrote:
> > > On Thu, Sep 28, 2023 at 07:05:40PM +0200, David Hildenbrand wrote:
> > > > As described as reply to v1, without fork() and KSM, the PAE bit should
> > > > stick around. If that's not the case, we should investigate why.
> > > >
> > > > If we ever support the post-fork case (which the comment above remap_pages()
> > > > excludes) we'll need good motivation why we'd want to make this
> > > > overly-complicated feature even more complicated.
> > >
> > > The problem is DONTFORK is only a suggestion, but not yet restricted.  If
> > > someone reaches on top of some !PAE page on src it'll never gonna proceed
> > > and keep failing, iiuc.
> >
> > Yes. It won't work if you fork() and not use DONTFORK on the src VMA. We
> > should document that as a limitation.
> >
> > For example, you could return an error to the user that can just call
> > UFFDIO_COPY. (or to the UFFDIO_COPY from inside uffd code, but that's
> > probably ugly as well).
>
> We could indeed provide some special errno perhaps upon the PAE check, then
> document it explicitly in the man page and suggest resolutions (like
> DONTFORK) when user hit it.
>
> >
> > >
> > > do_wp_page() doesn't have that issue of accuracy only because one round of
> > > CoW will just allocate a new page with PAE set guaranteed, which is pretty
> > > much self-heal and unnoticed.
> >
> > Yes. But it might have to copy, at which point the whole optimization of
> > remap is gone :)
>
> Right, but that's fine IMHO because it should still be very corner case,
> definitely not expected to be the majority to start impact the performance
> results.
>
> >
> > >
> > > So it'll be great if we can have similar self-heal way for PAE.  If not, I
> > > think it's still fine we just always fail on !PAE src pages, but then maybe
> > > we should let the user know what's wrong, e.g., the user can just forgot to
> > > apply DONTFORK then forked.  And then the user hits error and don't know
> > > what happened.  Probably at least we should document it well in man pages.
> > >
> > Yes, exactly.
> >
> > > Another option can be we keep using folio_mapcount() for pte, and another
> > > helper (perhaps: _nr_pages_mapped==COMPOUND_MAPPED && _entire_mapcount==1)
> > > for thp.  But I know that's not ideal either.
> >
> > As long as we only set the pte writable if PAE is set, we're good from a CVE
> > perspective. The other part is just simplicity of avoiding all these
> > mapcount+swapcount games where possible.
> >
> > (one day folio_mapcount() might be faster -- I'm still working on that patch
> > in the bigger picture of handling PTE-mapped THP better)
>
> Sure.
>
> For now as long as we're crystal clear on the possibility of inaccuracy of
> PAE, it never hits besides fork() && !DONTFORK, and properly document it,
> then sounds good here.

Ok, sounds like we have a consensus. I'll prepare manpage changes to
document the DONTFORK requirement for uffd_remap.
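
For reference, a minimal sketch of the DONTFORK requirement discussed above,
assuming an anonymous private source mapping. It uses only the existing
madvise(MADV_DONTFORK) call and does not exercise UFFDIO_REMAP itself; the
helper names map_dontfork() and child_faults_on() are hypothetical and made
up for this illustration. The point is that a source VMA marked this way is
not inherited by a forked child, so fork() cannot leave the parent's pages
shared and !PAE.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an anonymous private region and mark it
 * MADV_DONTFORK so that children created by fork() do not inherit it.
 * Returns NULL on failure.
 */
static void *map_dontfork(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/*
 * Hypothetical helper: fork a child that reads one byte at addr.
 * Returns 1 if the child was killed by SIGSEGV (mapping absent in the
 * child), 0 if the read succeeded, -1 on fork/waitpid error.
 */
static int child_faults_on(volatile char *addr)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);   /* make sure the fault is fatal */
        char c = *addr;             /* faults if the VMA was not copied */
        (void)c;
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

A caller would allocate the source buffer with map_dontfork(), touch it, and
could use child_faults_on() in a test to confirm that the child never sees
the mapping. A mapping created without the madvise() call would instead be
shared copy-on-write after fork(), which is the case the thread proposes to
document as unsupported.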

>
> Thanks,
>
> --
> Peter Xu
>
