Message-ID: <ZBnIyjTJ5yfxpcgs@x1n>
Date: Tue, 21 Mar 2023 11:10:02 -0400
From: Peter Xu <peterx@...hat.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: Andrei Vagin <avagin@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Muhammad Usama Anjum <usama.anjum@...labora.com>,
David Hildenbrand <david@...hat.com>,
Michał Mirosław <emmir@...gle.com>,
Danylo Mocherniuk <mdanylo@...gle.com>,
Paul Gofman <pgofman@...eweavers.com>,
Cyrill Gorcunov <gorcunov@...il.com>,
Nadav Amit <namit@...are.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Shuah Khan <shuah@...nel.org>,
Christian Brauner <brauner@...nel.org>,
Yang Shi <shy828301@...il.com>,
Vlastimil Babka <vbabka@...e.cz>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>,
Yun Zhou <yun.zhou@...driver.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Alex Sierra <alex.sierra@....com>,
Matthew Wilcox <willy@...radead.org>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
"Gustavo A . R . Silva" <gustavoars@...nel.org>,
Dan Williams <dan.j.williams@...el.com>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org,
Greg KH <gregkh@...uxfoundation.org>, kernel@...labora.com
Subject: Re: [PATCH v11 0/7] Implement IOCTL to get and optionally clear info
about PTEs
On Tue, Mar 21, 2023 at 02:41:53PM +0200, Mike Rapoport wrote:
> On Mon, Mar 20, 2023 at 11:30:00AM -0700, Andrei Vagin wrote:
> > On Thu, Mar 9, 2023 at 11:58 AM Andrew Morton <akpm@...ux-foundation.org> wrote:
> > >
> > > On Thu, 9 Mar 2023 18:57:11 +0500 Muhammad Usama Anjum <usama.anjum@...labora.com> wrote:
> > >
> > > > The information related to pages if the page is file mapped, present and
> > > > swapped is required for the CRIU project [5][6]. The addition of the
> > > > required mask, any mask, excluded mask and return masks are also required
> > > > for the CRIU project [5].
> > >
> > > It's a ton of new code and what I'm not seeing in here (might have
> > > missed it?) is a clear statement of the value of this feature to our
> > > users.
> > >
> > > I see hints that CRIU would like it, but no description of how valuable
> > > this is to CRIU's users.
> >
> > Hi Andrew,
> >
> > The current interface works for CRIU, and I can't say we have anything
> > critical with it right now.
> >
> > On the other hand, the new interface has a number of significant improvements:
> >
> > * it is more granular and allows us to track changed pages more
> > effectively. The current interface can clear dirty bits for the entire
> > process only. In addition, reading info about pages is a separate
> > operation. It means we must freeze the process to read information
> > about all its pages, reset dirty bits, only then we can start dumping
> > pages. The information about pages becomes more and more outdated,
> > while we are processing pages. The new interface solves both these
> > downsides. First, it allows us to read pte bits and clear the
> > soft-dirty bit atomically. It means that CRIU will not need to freeze
> > processes to pre-dump their memory. Second, it clears soft-dirty bits
> > for a specified region of memory. It means CRIU will have actual info
> > about pages to the moment of dumping them.
> >
> > * The new interface has to be much faster because basic page filtering
> > is happening in the kernel. With the old interface, we have to read
> > pagemap for each page.
>
> There is still a caveat in using userfaultfd for tracking dirty pages in
> CRIU because we still don't support C/R of processes that use uffd.
This reminds me: could the interface also expose soft-dirty, acting as a
ranged soft-dirty collector that replaces the existing pagemap read()s? Just
in case userfaultfd cannot be used. The code addition should be trivial IIUC.
Then PAGE_IS_WRITTEN may be too generic a name. It could be split into two
bits, PAGE_IS_UFFD_WP and PAGE_IS_SOFT_DIRTY, with PAGE_IS_UFFD_WP having the
inverted meaning of the current PAGE_IS_WRITTEN.
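For reference, the existing flow that a ranged collector would replace looks
roughly like the sketch below. It is only an illustration, not code from the
series: the function names are made up, and the only kernel interfaces assumed
are the documented ones, i.e. bit 55 (soft-dirty) of each 64-bit
/proc/<pid>/pagemap entry and writing "4" to /proc/<pid>/clear_refs.

```python
import os
import struct

# Documented layout of /proc/<pid>/pagemap: one 64-bit entry per virtual page.
PM_ENTRY_SIZE = 8
PM_SOFT_DIRTY = 1 << 55  # PTE is soft-dirty


def soft_dirty_pages(raw, start, page_size):
    """Return the addresses whose pagemap entry has the soft-dirty bit set.

    `raw` holds consecutive pagemap entries, the first one describing the
    page at `start`. Hypothetical helper, named for illustration only.
    """
    count = len(raw) // PM_ENTRY_SIZE
    entries = struct.unpack(f"={count}Q", raw[:count * PM_ENTRY_SIZE])
    return [start + i * page_size
            for i, entry in enumerate(entries)
            if entry & PM_SOFT_DIRTY]


def read_soft_dirty(pid, start, length):
    """Read the pagemap entries for [start, start + length) and filter them
    in userspace, one entry per page."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(start // page_size * PM_ENTRY_SIZE)
        raw = f.read(length // page_size * PM_ENTRY_SIZE)
    return soft_dirty_pages(raw, start, page_size)


def clear_soft_dirty(pid):
    """Clear soft-dirty bits. This applies to the whole process; the
    existing interface has no per-range variant."""
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("4")
```

Reading and clearing are two separate steps here, and the clear cannot be
limited to a range, which are the two gaps Andrei describes above.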
--
Peter Xu