Message-ID: <CAJHvVcgfN5RVXJ_f3tN2UinV_kWCMyCY_g5oKm=BtgQJz-e7gA@mail.gmail.com>
Date:   Mon, 10 Jul 2023 10:19:54 -0700
From:   Axel Rasmussen <axelrasmussen@...gle.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Alexander Viro <viro@...iv.linux.org.uk>,
        Brian Geffon <bgeffon@...gle.com>,
        Christian Brauner <brauner@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Gaosheng Cui <cuigaosheng1@...wei.com>,
        Huang Ying <ying.huang@...el.com>,
        Hugh Dickins <hughd@...gle.com>,
        James Houghton <jthoughton@...gle.com>,
        "Jan Alexander Steffens (heftig)" <heftig@...hlinux.org>,
        Jiaqi Yan <jiaqiyan@...gle.com>,
        Jonathan Corbet <corbet@....net>,
        Kefeng Wang <wangkefeng.wang@...wei.com>,
        "Liam R. Howlett" <Liam.Howlett@...cle.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        "Mike Rapoport (IBM)" <rppt@...nel.org>,
        Muchun Song <muchun.song@...ux.dev>,
        Nadav Amit <namit@...are.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Peter Xu <peterx@...hat.com>,
        Ryan Roberts <ryan.roberts@....com>,
        Shuah Khan <shuah@...nel.org>,
        Suleiman Souhlal <suleiman@...gle.com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        "T.J. Alumbaugh" <talumbau@...gle.com>,
        Yu Zhao <yuzhao@...gle.com>,
        ZhangPeng <zhangpeng362@...wei.com>, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v4 1/8] mm: make PTE_MARKER_SWAPIN_ERROR more general

On Sat, Jul 8, 2023 at 6:08 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>
> On Fri,  7 Jul 2023 14:55:33 -0700 Axel Rasmussen <axelrasmussen@...gle.com> wrote:
>
> > Future patches will re-use PTE_MARKER_SWAPIN_ERROR to implement
> > UFFDIO_POISON, so make some various preparations for that:
> >
> > First, rename it to just PTE_MARKER_POISONED. The "SWAPIN" can be
> > confusing since we're going to re-use it for something not really
> > related to swap. This can be particularly confusing for things like
> > hugetlbfs, which doesn't support swap whatsoever. Also rename some
> > various helper functions.
> >
> > Next, fix pte marker copying for hugetlbfs. Previously, it would WARN on
> > seeing a PTE_MARKER_SWAPIN_ERROR, since hugetlbfs doesn't support swap.
> > But, since we're going to re-use it, we want it to go ahead and copy it
> > just like non-hugetlbfs memory does today. Since the code to do this is
> > more complicated now, pull it out into a helper which can be re-used in
> > both places. While we're at it, also make it slightly more explicit in
> > its handling of e.g. uffd wp markers.
> >
> > For non-hugetlbfs page faults, instead of returning VM_FAULT_SIGBUS for
> > an error entry, return VM_FAULT_HWPOISON. For most cases this change
> > doesn't matter, e.g. a userspace program would receive a SIGBUS either
> > way. But for UFFDIO_POISON, this change will let KVM guests get an MCE
> > out of the box, instead of giving a SIGBUS to the hypervisor and
> > requiring it to somehow inject an MCE.
> >
> > Finally, for hugetlbfs faults, handle PTE_MARKER_POISONED, and return
> > VM_FAULT_HWPOISON_LARGE in such cases. Note that this can't happen today
> > because the lack of swap support means we'll never end up with such a
> > PTE anyway, but this behavior will be needed once such entries *can*
> > show up via UFFDIO_POISON.
> >
> > --- a/include/linux/mm_inline.h
> > +++ b/include/linux/mm_inline.h
> > @@ -523,6 +523,25 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
> >       return atomic_read(&mm->tlb_flush_pending) > 1;
> >  }
> >
> > +/*
> > + * Computes the pte marker to copy from the given source entry into dst_vma.
> > + * If no marker should be copied, returns 0.
> > + * The caller should insert a new pte created with make_pte_marker().
> > + */
> > +static inline pte_marker copy_pte_marker(
> > +             swp_entry_t entry, struct vm_area_struct *dst_vma)
> > +{
> > +     pte_marker srcm = pte_marker_get(entry);
> > +     /* Always copy error entries. */
> > +     pte_marker dstm = srcm & PTE_MARKER_POISONED;
> > +
> > +     /* Only copy PTE markers if UFFD register matches. */
> > +     if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
> > +             dstm |= PTE_MARKER_UFFD_WP;
> > +
> > +     return dstm;
> > +}
>
> Breaks the build with CONFIG_MMU=n (arm allnoconfig).  pte_marker isn't
> defined.
>
> I'll slap #ifdef CONFIG_MMU around this function, but probably something more
> fine-grained could be used, like CONFIG_PTE_MARKER_UFFD_WP.  Please
> consider.

Whoops, sorry about this. This function "ought" to be in
include/linux/swapops.h where it would be inside a #ifdef CONFIG_MMU
anyway, but it can't be because it uses userfaultfd_wp() so there'd be
a circular include. I think just wrapping it in CONFIG_MMU is the
right way.
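
To illustrate the guard, here is a rough userspace model of the helper,
not the actual kernel code. The types, the bit values, the `uffd_wp`
field and the stubbed userfaultfd_wp() are stand-ins I made up so the
snippet compiles on its own, and it takes the marker directly instead of
going through pte_marker_get() on a swp_entry_t:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel config symbol; remove to model CONFIG_MMU=n. */
#define CONFIG_MMU 1

#ifdef CONFIG_MMU
/* Illustrative stand-ins for the kernel's definitions (assumptions). */
typedef unsigned long pte_marker;
#define PTE_MARKER_UFFD_WP  (1UL << 0)
#define PTE_MARKER_POISONED (1UL << 1)

/* Hypothetical, heavily reduced VMA: only tracks uffd-wp registration. */
struct vm_area_struct {
	bool uffd_wp;
};

/* Stub for the real userfaultfd_wp() check. */
static inline bool userfaultfd_wp(const struct vm_area_struct *vma)
{
	return vma->uffd_wp;
}

/*
 * Computes the marker bits to copy into dst_vma.
 * Returns 0 if no marker should be copied.
 */
static inline pte_marker copy_pte_marker(pte_marker srcm,
					 const struct vm_area_struct *dst_vma)
{
	/* Always copy error entries. */
	pte_marker dstm = srcm & PTE_MARKER_POISONED;

	assert(dst_vma != NULL);

	/* Only copy the uffd-wp bit if the destination is registered for it. */
	if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
		dstm |= PTE_MARKER_UFFD_WP;

	return dstm;
}
#endif /* CONFIG_MMU */
```

With the #define removed, neither pte_marker nor the helper is declared,
which models how a CONFIG_MMU=n build would no longer trip over the
missing type.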

But, this has also made me realize we need to not advertise
UFFDIO_POISON as supported unless we have CONFIG_MMU. I don't want
HAVE_ARCH_USERFAULTFD_WP for that, because it's only enabled on
x86_64, whereas I want to support at least arm64 as well. I don't see
a strong reason not to just use CONFIG_MMU for this too; this feature
depends on the API in swapops.h, which uses that ifdef, so I don't see
much value in creating a new but equivalent config option.
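
As a sketch of what "not advertising" could look like, assuming a
feature bitmask negotiated with userspace: every name below
(DEMO_CONFIG_MMU, DEMO_FEATURE_*, demo_supported_features(),
demo_negotiate()) is hypothetical and only models the idea. It is not
the real userfaultfd API and not necessarily what v5 will do:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CONFIG_MMU; remove to model a no-MMU build (assumption). */
#define DEMO_CONFIG_MMU 1

/* Hypothetical feature bits, not the real UFFD_FEATURE_* values. */
#define DEMO_FEATURE_BASE   (UINT64_C(1) << 0)
#define DEMO_FEATURE_POISON (UINT64_C(1) << 1)

/* Returns the features this build is willing to advertise. */
static inline uint64_t demo_supported_features(void)
{
	uint64_t features = DEMO_FEATURE_BASE | DEMO_FEATURE_POISON;

#ifndef DEMO_CONFIG_MMU
	/* Without an MMU there are no pte markers, so hide the feature. */
	features &= ~DEMO_FEATURE_POISON;
#endif
	return features;
}

/*
 * Grants the requested features only if all of them are supported.
 * Returns 0 on success, -1 if any unsupported bit was requested.
 */
static inline int demo_negotiate(uint64_t requested, uint64_t *granted)
{
	assert(granted != NULL);

	if (requested & ~demo_supported_features())
		return -1;

	*granted = requested;
	return 0;
}
```

The point is just that the same ifdef that guards the marker helpers
also decides whether the feature bit is ever offered, so userspace
cannot negotiate something the build cannot implement.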

I'll make the needed changes (and also address Peter's comment above)
and send out a v5.

>
> btw, both copy_pte_marker() and pte_install_uffd_wp_if_needed() look
> far too large to justify inlining.  Please review the desirability of
> this.
>
>
