Message-ID: <CAJuCfpHrmgdoJaN0P4FGzRFbu-o+c5+H6-r=5A=xrVd2GU2QyQ@mail.gmail.com>
Date: Thu, 28 Sep 2023 08:36:31 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Jann Horn <jannh@...gle.com>
Cc: akpm@...ux-foundation.org, viro@...iv.linux.org.uk,
brauner@...nel.org, shuah@...nel.org, aarcange@...hat.com,
lokeshgidra@...gle.com, peterx@...hat.com, david@...hat.com,
hughd@...gle.com, mhocko@...e.com, axelrasmussen@...gle.com,
rppt@...nel.org, willy@...radead.org, Liam.Howlett@...cle.com,
zhangpeng362@...wei.com, bgeffon@...gle.com,
kaleshsingh@...gle.com, ngeoffray@...gle.com, jdduke@...gle.com,
linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-kselftest@...r.kernel.org,
kernel-team@...roid.com
Subject: Re: [PATCH v2 2/3] userfaultfd: UFFDIO_REMAP uABI

On Wed, Sep 27, 2023 at 3:49 PM Jann Horn <jannh@...gle.com> wrote:
>
> On Wed, Sep 27, 2023 at 11:08 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > On Wed, Sep 27, 2023 at 1:42 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > >
> > > On Wed, Sep 27, 2023 at 1:04 PM Jann Horn <jannh@...gle.com> wrote:
> > > >
> > > > On Wed, Sep 27, 2023 at 8:08 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > > > > On Wed, Sep 27, 2023 at 5:47 AM Jann Horn <jannh@...gle.com> wrote:
> > > > > > On Sat, Sep 23, 2023 at 3:31 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > > > > > > + dst_pmdval = pmdp_get_lockless(dst_pmd);
> > > > > > > + /*
> > > > > > > + * If the dst_pmd is mapped as THP don't override it and just
> > > > > > > > + * be strict. If dst_pmd changes into THP after this check, the
> > > > > > > + * remap_pages_huge_pmd() will detect the change and retry
> > > > > > > + * while remap_pages_pte() will detect the change and fail.
> > > > > > > + */
> > > > > > > + if (unlikely(pmd_trans_huge(dst_pmdval))) {
> > > > > > > + err = -EEXIST;
> > > > > > > + break;
> > > > > > > + }
> > > > > > > +
> > > > > > > + ptl = pmd_trans_huge_lock(src_pmd, src_vma);
> > > > > > > + if (ptl && !pmd_trans_huge(*src_pmd)) {
> > > > > > > + spin_unlock(ptl);
> > > > > > > + ptl = NULL;
> > > > > > > + }
> > > > > >
> > > > > > This still looks wrong - we do still have to split_huge_pmd()
> > > > > > somewhere so that remap_pages_pte() works.
> > > > >
> > > > > Hmm, I guess this extra check is not even needed...
> > > >
> > > > Hm, and instead we'd bail at the pte_offset_map_nolock() in
> > > > remap_pages_pte()? I guess that's unusual but works...
> > >
> > > Yes, that's what I was thinking but I agree, that seems fragile. Maybe
> > > just bail out early if (ptl && !pmd_trans_huge())?
> >
> > No, actually we can still handle the is_swap_pmd() case by splitting it
> > and remapping the individual ptes. So, I can bail out only in case of
> > pmd_devmap().
>
> FWIW I only learned today that "real" swap PMDs don't actually exist -
> only migration entries, which are encoded as swap PMDs, exist. You can
> see that when you look through the cases that something like
> __split_huge_pmd() or zap_pmd_range() actually handles.
Ah, good point.
>
> So I think if you wanted to handle all the PMD types properly here
> without splitting, you could do that without _too_ much extra code.
> But idk if it's worth it.
Yeah, I guess I can call pmd_migration_entry_wait() and retry by
returning -EAGAIN, similar to how remap_pages_pte() handles PTE
migration entries. Looks simple enough.
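
A rough userspace sketch of the per-PMD dispatch discussed above might
look like the following. It is only a model, not kernel code and not the
next version of the patch: the enum, struct, and function names are
hypothetical, and the errno used for the devmap case is a placeholder,
since the thread does not settle on one.

```c
#include <errno.h>

/* Hypothetical model of the source PMD states discussed in this thread. */
enum pmd_kind {
	PMD_KIND_TABLE,      /* none, or points to a PTE table */
	PMD_KIND_TRANS_HUGE, /* present THP mapping */
	PMD_KIND_MIGRATION,  /* migration entry encoded as a swap PMD */
	PMD_KIND_DEVMAP,     /* device-mapped huge PMD */
};

/* Hypothetical actions the remap loop could take for one source PMD. */
enum remap_action {
	ACT_REMAP_PTES,     /* fall through to the PTE-level path */
	ACT_REMAP_HUGE,     /* move the whole huge PMD */
	ACT_WAIT_AND_RETRY, /* wait for migration, then let the caller retry */
	ACT_BAIL,           /* unsupported mapping type, give up */
};

struct remap_decision {
	enum remap_action action;
	int err; /* 0 on success paths, negative errno otherwise */
};

static struct remap_decision classify_src_pmd(enum pmd_kind kind)
{
	switch (kind) {
	case PMD_KIND_TRANS_HUGE:
		return (struct remap_decision){ ACT_REMAP_HUGE, 0 };
	case PMD_KIND_MIGRATION:
		/* Models calling pmd_migration_entry_wait() and retrying. */
		return (struct remap_decision){ ACT_WAIT_AND_RETRY, -EAGAIN };
	case PMD_KIND_DEVMAP:
		/* Placeholder errno: an assumption, not taken from the patch. */
		return (struct remap_decision){ ACT_BAIL, -EINVAL };
	case PMD_KIND_TABLE:
		break;
	}
	return (struct remap_decision){ ACT_REMAP_PTES, 0 };
}
```

The point of keeping the classification in one place is that the "real
swap PMDs don't exist" observation collapses the non-present case into a
single migration branch, so no split is needed before retrying.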
Thanks for all the pointers! I'll start cooking the next version.