[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210311015013.GS444867@ziepe.ca>
Date: Wed, 10 Mar 2021 21:50:13 -0400
From: Jason Gunthorpe <jgg@...pe.ca>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, David Rientjes <rientjes@...gle.com>,
Ben Gardon <bgardon@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Jérôme Glisse <jglisse@...hat.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
Dimitri Sivanich <dimitri.sivanich@....com>
Subject: Re: [PATCH] mm/oom_kill: Ensure MMU notifier range_end() is paired
with range_start()
On Wed, Mar 10, 2021 at 05:20:01PM -0800, Sean Christopherson wrote:
> > Which I believe is fatal to kvm? These notifiers certainly do not only
> > happen at process exit.
>
> My point about the process dying is that the existing bug that causes
> mmu_notifier_count to become imbalanced is benign only because the process is
> being killed, and thus KVM will stop running its vCPUs.
Are you saying we only call non-blocking invalidate during a process
exit event??
> > So, both of the remaining _end users become corrupted with this patch!
>
> I don't follow. mn_hlist_invalidate_range_start() iterates over all
> notifiers, even if a notifier earlier in the chain failed. How will
> KVM become imbalanced?
Er, ok, that got left in a weird way. There is another "bug" where end
is not supposed to be called if the start failed.
> The existing _end users never fail their _start. If KVM started failing its
> start, then yes, it could get corrupted.
Well, maybe that is the way out of this now. If we don't permit a
start to fail if there is an end then we have no problem to unwind it
as we can continue to call everything. This can't be backported too
far though, the itree notifier conversions are what made the WARN_ON
safe today.
Something very approximately like this is closer to my preference:
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 61ee40ed804ee5..6d5cd20f81dadc 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -501,10 +501,25 @@ static int mn_hlist_invalidate_range_start(
"");
WARN_ON(mmu_notifier_range_blockable(range) ||
_ret != -EAGAIN);
+ /*
+ * We call all the notifiers on any EAGAIN,
+ * there is no way for a notifier to know if
+ * its start method failed, thus a start that
+ * does EAGAIN can't also do end.
+ */
+ WARN_ON(ops->invalidate_range_end);
ret = _ret;
}
}
}
+
+ if (ret) {
+ /* Must be non-blocking to get here*/
+ hlist_for_each_entry_rcu (subscription, &subscriptions->list,
+ hlist, srcu_read_lock_held(&srcu))
+ subscription->ops->invalidate_range_end(subscription,
+ range);
+ }
srcu_read_unlock(&srcu, id);
return ret;
Powered by blists - more mailing lists