[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240626064359.79119-1-alexjlzheng@tencent.com>
Date: Wed, 26 Jun 2024 14:43:59 +0800
From: Jinliang Zheng <alexjlzheng@...il.com>
To: mhocko@...e.com
Cc: akpm@...ux-foundation.org,
alexjlzheng@...il.com,
alexjlzheng@...cent.com,
axboe@...nel.dk,
brauner@...nel.org,
ebiederm@...ssion.com,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
mjguzik@...il.com,
oleg@...hat.com,
tandersen@...flix.com,
willy@...radead.org
Subject: Re: [PATCH v2] mm: optimize the redundant loop of mm_update_next_owner()
On Fri, 21 Jun 2024 10:50:10 +0200, Michal Hocko wrote:
> On Thu 20-06-24 19:30:19, Oleg Nesterov wrote:
> > Can't review, I forgot everything about mm_update_next_owner().
> > So I am sorry for the noise I am going to add, feel free to ignore.
> > Just in case, I see nothing wrong in this patch.
> >
> > On 06/20, alexjlzheng@...il.com wrote:
> > >
> > > When mm_update_next_owner() is racing with swapoff (try_to_unuse()) or /proc or
> > > ptrace or page migration (get_task_mm()), it is impossible to find an
> > > appropriate task_struct in the loop whose mm_struct is the same as the target
> > > mm_struct.
> > >
> > > If the above race condition is combined with the stress-ng-zombie and
> > > stress-ng-dup tests, such a long loop can easily cause a Hard Lockup in
> > > write_lock_irq() for tasklist_lock.
> > >
> > > Recognize this situation in advance and exit early.
> >
> > But this patch won't help if (say) ptrace_access_vm() sleeps while
> > for_each_process() tries to find another owner, right?
> >
> > > @@ -484,6 +484,8 @@ void mm_update_next_owner(struct mm_struct *mm)
> > > * Search through everything else, we should not get here often.
> > > */
> > > for_each_process(g) {
> > > + if (atomic_read(&mm->mm_users) <= 1)
> > > + break;
> >
> > I think this deserves a comment to explain that this is optimization
> > for the case we race with the pending mmput(). mm_update_next_owner()
> > checks mm_users at the start.
> >
> > And. Can we drop tasklist and use rcu_read_lock() before for_each_process?
> > Yes, this will probably need more changes even if possible...
> >
> >
> > Or even better. Can't we finally kill mm_update_next_owner() and turn the
> > ugly mm->owner into mm->mem_cgroup ?
>
> Yes, dropping the mm->owner should be a way to go. Replacing that by
> mem_cgroup sounds like an improvemnt. I have a vague recollection that
Sorry for the late reply.
Replacing that by mem_cgroup maybe a good idea, a rcu lock looks good, too.
But before the above optimization is implemented, I recommend using this
patch to alleviate it. Both [PATCH] and [PATCH v2] are acceptable, they only
differ in the commit log.
Thanks for your reply. :)
Jinliang Zheng
> this has some traps on the way. E.g. tasks sharing the mm but living in
> different cgroups. Things have changes since the last time I've checked
> and for example memcg charge migration on task move will be deprecated
> soon so chances are that there are less roadblocks on the way.
> --
> Michal Hocko
> SUSE Labs
Powered by blists - more mailing lists