[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20140402164324.GK1500@redhat.com>
Date: Wed, 2 Apr 2014 18:43:24 +0200
From: Andrea Arcangeli <aarcange@...hat.com>
To: Jerome Glisse <j.glisse@...il.com>
Cc: Haggai Eran <haggaie@...lanox.com>,
Mike Rapoport <mike.rapoport@...ellosystems.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Izik Eidus <izik.eidus@...ellosystems.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Or Gerlitz <ogerlitz@...lanox.com>,
Sagi Grimberg <sagig@...lanox.com>,
Shachar Raindel <raindel@...lanox.com>
Subject: Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics
Hi,
On Wed, Apr 02, 2014 at 11:18:27AM -0400, Jerome Glisse wrote:
> This would imply either to scan all mmu_notifier currently register or to
> have a global flags for the mm to know if there is one mmu_notifier without
> change_pte. Moreover this would means that kvm would remain "broken" if one
> of the mmu notifier do not have the change_pte callback.
>
> Solution i have in mind and is part of a patchset i am working on, just
> involve passing along an enum value to mmu notifier callback. The enum
> value would tell what are the exact event that actually triggered the
> mmu notifier call (vmscan, migrate, ksm, ...). Knowing this kvm could then
> simply ignore invalidate_range_start/end for event it knows it will get
> a change_pte callback.
That sounds similar to adding two new methods companion of change_pte
that if not implemented would fallback into
invalidate_range_start/end? And KVM would implement those as noops? It
would add bytes to the mmu notifer structure but it wouldn't add
branches.
It's not urgent but we clearly need to do something about this, or we
should drop change_pte entirely because currently it does nothing and
it only wastes some CPU cycle.
Removing change_pte these days isn't a showstopper because KVM page
fault become smarter lately. In the old days lack of change_pte would
mean that the guest would break KSM cow if it ever accessed the page
in readonly (sharable) mode. Back then change_pte was a fundamental
optimization to use KSM and to avoid all KSM pages to be cowed
immediately after being merged.
These days reading from guest memory backed by KSM won't break COW
even if the spte isn't already established before the KVM fault fires
on the KSM memory. change_pte these days has only the benefit of
avoiding a vmexit/vmenter cycle after a the KSM merge and one
vmexit/vmenter cyle after a KSM break COW (the event that triggers if
the guest eventually writes to the page).
KSM merge and KSM cows aren't too frequent operations (and they both
have significant cost associated with them) so it's uncertain if it's
worth keeping the change_pte optimization nowadays. Considering it's
already implemented I personally feel it's worth keeping as a
microoptimization because vmexit/vmenter are certainly more expensive
than calling change_pte.
Both what you suggested above (with enum or two new companion methods)
or the other way Haggai suggested (checking if change_pte is
implemented in all registered mmu notifiers) sounds good to me.
Thanks,
Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists