[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200918195740.GB5962@xz-x1>
Date: Fri, 18 Sep 2020 15:57:40 -0400
From: Peter Xu <peterx@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Jason Gunthorpe <jgg@...pe.ca>, John Hubbard <jhubbard@...dia.com>,
Leon Romanovsky <leonro@...dia.com>,
Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"Maya B . Gokhale" <gokhale2@...l.gov>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Marty Mcfadden <mcfadden8@...l.gov>,
Kirill Shutemov <kirill@...temov.name>,
Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
Jan Kara <jack@...e.cz>, Kirill Tkhai <ktkhai@...tuozzo.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Christoph Hellwig <hch@....de>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification
On Fri, Sep 18, 2020 at 10:16:19AM -0700, Linus Torvalds wrote:
> The only change I'd make is to make it a "int" and put it next to the
> "int map_count", since that will pack better on 64-bit (assuming you
> don't do the randomize_layout thing, in which case it's all moot).
Will do.
[...]
> > One issue is when we charge for cgroup we probably can't do that onto the new
> > mm/task, since copy_namespaces() is called after copy_mm().
>
> That cannot possibly matter as far as I can see.
>
> Copying the page in between those two calls is already possible since
> we've already dropped the mmap_lock by the time copy_namespaces() is
> called. So if the parent was threaded, and another thread did a write
> access, the parent would have caused that COW that we did early.
>
> And any page copying cost should be to the parent anyway, since that
> is who did the pinning that caused the copy in the first place.
>
> So for both of those reasons - the COW can already happen between
> copy_mm() and copy_namespaces(), *and* charging it to the parent
> namespace is proper anyway - I think your cgroup worry is not
> relevant.
>
> I'm not even sure anything relevant to accounting is created, but my
> point is that even if it is, I don't see how it could be an issue.
The parent process should be fine to do any COW and do its accounting right,
which I agree. But the new COW that we're adding here is for the child process
rather than the parent.
I'm just afraid the parent process's accounting number could go very high after
it pinned some pages and fork()ed a few more times, since those extra
accountings will accumulate even after the children die, if I'm not wrong...
Actually I tend to move copy_namespaces() to be before copy_mm() now.. I know
nothing about namespaces, however I see that copy_namespaces() seems to be
quite self-contained. But I'm always ready for a "no, you can't"...
>
> > The other thing is on how to fail. E.g., when COW failed due to either
> > charging of cgroup or ENOMEM, ideally we should fail fork() too. Though that
> > might need more changes - current patch silently kept the shared page for
> > simplicity.
>
> We already can fail forkind due to memory allocations failing. Again,
> not an issue. It happens.
Yes. I didn't change this only because I thought it could save quite a few
lines of codes. However after I notice the fact that this patch will probably
grow bigger no matter what, I'm kind of not worrying on this any more..
--
Peter Xu
Powered by blists - more mailing lists