Message-ID: <20210401234720.GB628002@xz-x1>
Date: Thu, 1 Apr 2021 19:47:20 -0400
From: Peter Xu <peterx@...hat.com>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
stable <stable@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jann Horn <jannh@...gle.com>,
Kirill Tkhai <ktkhai@...tuozzo.com>, Shaohua Li <shli@...com>,
Nadav Amit <namit@...are.com>, Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Android Kernel Team <kernel-team@...roid.com>
Subject: Re: [PATCH 0/5] 4.14 backports of fixes for "CoW after fork() issue"
Hi, Suren,
On Thu, Apr 01, 2021 at 12:43:51PM -0700, Suren Baghdasaryan wrote:
> On Thu, Apr 1, 2021 at 11:59 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > On Thu, Apr 1, 2021 at 11:17 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > >
> > > > We received a report that the copy-on-write issue reported by Jann Horn in
> > > https://bugs.chromium.org/p/project-zero/issues/detail?id=2045 is still
> > > reproducible on 4.14 and 4.19 kernels (the first issue with the reproducer
> > > coded in vmsplice.c).
> >
> > Gaah.
> >
> > > I confirmed this and also that the issue was not
> > > reproducible with 5.10 kernel. I tracked the fix to the following patch
> > > introduced in 5.9 which changes the do_wp_page() logic:
> > >
> > > 09854ba94c6a 'mm: do_wp_page() simplification'
> >
> > The problem here is that there's a _lot_ more patches than the few you
> > found that fixed various other cases (THP etc).
> >
> > > I backported this patch (#2 in the series) along with 2 prerequisite patches
> > > (#1 and #4) that keep the backports clean and two followup fixes to the main
> > > patch (#3 and #5). I had to skip the following fix:
> > >
> > > feb889fb40fa 'mm: don't put pinned pages into the swap cache'
> > >
> > > > because it uses page_maybe_dma_pinned() which does not exist in earlier
> > > > kernels. Because pin_user_pages() does not exist there either, I *think*
> > > we can safely skip this fix on older kernels, but I would appreciate if
> > > someone could confirm that claim.
> >
> > Hmm. I think this means that swap activity can now break the
> > connection to a GUP page (the whole pre-pinning model), but it
> > probably isn't a new problem for 4.9/4.19.
> >
> > I suspect the test there should be something like
> >
> >         /* Single mapper, more references than us and the map? */
> >         if (page_mapcount(page) == 1 && page_count(page) > 2)
> >                 goto keep_locked;
> >
> > in the pre-pinning days.
> >
> > But I really think that there are a number of other commits you're
> > missing too, because we had a whole series for THP fixes for the same
> > exact issue.
> >
> > Added Peter Xu to the cc, because he probably tracked those issues
> > better than I did.
> >
> > So NAK on this for now, I think this limited patch-set likely
> > introduces more problems than it fixes.
>
> Thanks for confirming my worries. I'll be happy to add additional
> backports if Peter can point me to them.
For full alignment with current upstream, I can think of at least the
series below:
Early cow for general pages:
https://lore.kernel.org/lkml/20200925222600.6832-1-peterx@redhat.com/
A race fix for copy_page and gup-fast:
https://lore.kernel.org/linux-mm/0-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com/
Early cow for hugetlbfs (which is very recent):
https://lore.kernel.org/lkml/20210217233547.93892-1-peterx@redhat.com/
But I believe they'll bring in a number of dependencies too, such as the page
pinning work, so a full backport does not look easy.
Btw, AFAICT you don't need patch 4/5 in this series for 4.14/4.19, since
it is only for uffd-wp, which doesn't exist until 5.7.
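As an aside, the pre-pinning check Linus suggested in the quoted text above
can be modelled in plain userspace C to see which cases it catches. This is
only a rough sketch and not kernel code: struct page_model and
maybe_gup_pinned() are hypothetical names standing in for struct page,
page_mapcount() and page_count(), and the reading of "us and the map" as
exactly two expected references is my assumption from the comment in that
snippet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the two counters the check reads. */
struct page_model {
	int mapcount;	/* models page_mapcount(): number of mappings */
	int refcount;	/* models page_count(): total references held */
};

/*
 * Model of the suggested heuristic. With a single mapper, the expected
 * references are assumed to be the mapping itself plus the reference held
 * by the caller doing reclaim. Anything beyond those two might be a GUP
 * reference, so the page would be kept instead of going to the swap cache.
 */
static bool maybe_gup_pinned(const struct page_model *page)
{
	return page->mapcount == 1 && page->refcount > 2;
}

/* A few illustrative cases; the values are made up for the example. */
static void self_check(void)
{
	struct page_model plain  = { .mapcount = 1, .refcount = 2 };
	struct page_model gup    = { .mapcount = 1, .refcount = 3 };
	struct page_model shared = { .mapcount = 2, .refcount = 3 };

	assert(!maybe_gup_pinned(&plain));	/* only us and the map */
	assert(maybe_gup_pinned(&gup));		/* one extra reference */
	assert(!maybe_gup_pinned(&shared));	/* not a single mapper */
}
```

Note that the model, like the heuristic itself, cannot tell a GUP reference
apart from any other transient reference, so it can only err on the side of
keeping the page.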
Thanks,
--
Peter Xu