[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20170829201404.GF7546@redhat.com>
Date: Tue, 29 Aug 2017 16:14:04 -0400
From: Jerome Glisse <jglisse@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, Bernhard Held <berny156@....de>,
Adam Borowski <kilobyte@...band.pl>,
Andrea Arcangeli <aarcange@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Wanpeng Li <kernellwp@...il.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Takashi Iwai <tiwai@...e.de>,
Nadav Amit <nadav.amit@...il.com>,
Mike Galbraith <efault@....de>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
axie <axie@....com>, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [RFC PATCH] mm/rmap: do not call mmu_notifier_invalidate_page()
v3
On Tue, Aug 29, 2017 at 12:58:45PM -0700, Linus Torvalds wrote:
> On Tue, Aug 29, 2017 at 12:16 PM, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > And then you can check if something actually happened by catching the
> > *ATOMIC* call to mmu_notifier_invalidate_page(), setting a flag, and
> > then doing something blocking at mmu_notifier_invalidate_range_end()
> > time.
> >
> > Maybe.
>
> Note that now I have looked more at the users, I think we actually
> just want to get rid of mmu_notifier_invalidate_page() entirely in
> favor of just calling mmu_notifier_invalidate_range_start()/end().
>
> Nobody seems to want an atomic version of
> mmu_notifier_invalidate_page(), they are perfectly happy just getting
> those range_start/end() call instead.
>
> HOWEVER.
>
> There do seem to be places (eg powernv/npu-dma.c, iommu/amd_iommu_v2.c
> and ommu/intel-svm.c) that want to get the "invalidate_page()" or
> "invalidate_range()" calls, but do *not* catch the begin/end() ones.
> The "range" calls were for atomic cases, and the "page" call was for
> the few places that weren't (but should have been). They seem to do
> the same things.
>
> So just switching from mmu_notifier_invalidate_page() to the
> "invalidate_range_start()/end()" pair instead could break those cases.
>
> But the mmu_notifier_invalidate_range() call has always been atomic,
> afaik. It's called from the ptep_clear_flush_notify(), which is
> called while holdin gthe ptl lock as far as I can tell.
>
> So to handle the powernv/npu-dma.c, iommu/amd_iommu_v2.c and
> ommu/intel-svm.c correctly, _and_ get he KVM case right, we probably
> need to:
>
> - replace the existing mmu_notifier_invalidate_page() call with
> mmu_notifier_invalidate_range(), and make sure it's inside the locked
> region (ie fs/dax.c too - actually move it inside the lock)
>
> - surround the locked region with those
> mmu_notifier_invalidate_range_start()/end() calls.
>
> - get rid of mmu_notifier_invalidate_page() entirely, it had bad
> semantics anyway.
>
> and from all I can tell that should work for everybody.
>
> But maybe I'm missing something.
>
So ignore what i just sent, i will rework it toward that direction.
I believe you are right.
Jérôme
Powered by blists - more mailing lists