Message-ID: <CAJuCfpFSJLhkv4o6ZKQKDWKsg2OKYa9kwRLERJkF=B+nzoujCQ@mail.gmail.com>
Date: Thu, 19 Dec 2024 08:14:24 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>, akpm@...ux-foundation.org, willy@...radead.org,
lorenzo.stoakes@...cle.com, mhocko@...e.com, vbabka@...e.cz,
hannes@...xchg.org, mjguzik@...il.com, oliver.sang@...el.com,
mgorman@...hsingularity.net, david@...hat.com, peterx@...hat.com,
oleg@...hat.com, dave@...olabs.net, paulmck@...nel.org, brauner@...nel.org,
dhowells@...hat.com, hdanton@...a.com, hughd@...gle.com,
lokeshgidra@...gle.com, minchan@...gle.com, jannh@...gle.com,
shakeel.butt@...ux.dev, souravpanda@...gle.com, pasha.tatashin@...een.com,
klarasmodin@...il.com, corbet@....net, linux-doc@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a
reference count

On Thu, Dec 19, 2024 at 1:13 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Wed, Dec 18, 2024 at 01:53:17PM -0800, Suren Baghdasaryan wrote:
>
> > Ah, ok I see now. I completely misunderstood what for_each_vma_range()
> > was doing.
> >
> > Then I think vma_start_write() should remain inside
> > vms_gather_munmap_vmas() and all vmas in mas_detach should be
>
> No, it must not. You really are not modifying anything yet (except for
> the splits, which, as we already noted, write-lock themselves).
>
> > write-locked, even the ones we are not modifying. Otherwise what would
> > prevent the race I mentioned before?
> >
> > __mmap_region
> >   __mmap_prepare
> >     vms_gather_munmap_vmas // adds vmas to be unmapped into
> >                            // mas_detach, some locked by
> >                            // __split_vma(), some not locked
> >
> >                                 lock_vma_under_rcu()
> >                                   vma = mas_walk // finds unlocked
> >                                                  // vma also in mas_detach
> >                                   vma_start_read(vma) // succeeds since
> >                                                       // vma is not locked
> >                                   // vma->detached, vm_start, vm_end
> >                                   // checks pass
> >                                   // vma is successfully read-locked
> >
> >     vms_clean_up_area(mas_detach)
> >       vms_clear_ptes
> >                                   // steps on a cleared PTE
>
> So here we have the added complexity that the vma is not unhooked at
> all. Is there anything that would prevent a concurrent gup_fast() from
> doing the same -- touch a cleared PTE?
>
> AFAICT two threads, one doing an overlapping mmap() and the other doing
> gup_fast(), can result in exactly this scenario.
>
> If we don't care about the GUP case, then I'm thinking we should not
> care about the lockless RCU case either.
>
> >   __mmap_new_vma
> >     vma_set_range // installs new vma in the range
> >   __mmap_complete
> >     vms_complete_munmap_vmas // vmas are write-locked and detached,
> >                              // but it's too late
>
> But at this point that old vma really is unhooked, and the
> vma_start_write() here will ensure readers are gone and it will clear
> the PTEs *again*.

So, to summarize, you want vma_start_write() and vma_mark_detached()
to be done when we are removing the vma from the tree, right?
Something like:

    vma_start_write()
    vma_iter_store()
    vma_mark_detached()
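
In rough code, the removal path would then look something like the
sketch below (illustrative only: the helper name is invented and the
calls follow the current in-tree signatures, not this patch series):

static void vma_replace_and_detach(struct vma_iterator *vmi,
				   struct vm_area_struct *old,
				   struct vm_area_struct *new)
{
	/* Block new lockless readers and wait out existing ones. */
	vma_start_write(old);
	/* Storing @new over the range unhooks @old from the maple tree. */
	vma_iter_store(vmi, new);
	/* lock_vma_under_rcu() now sees @old as detached and bails out. */
	vma_mark_detached(old, true);
}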
And the race I described is not a real problem since the vma is still
in the tree: gup_fast() already does exactly that, and the PTEs will
simply be reinstalled.
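
For reference, the reader side that this ordering has to keep safe
looks roughly like the following (a condensed paraphrase of
lock_vma_under_rcu() in mm/memory.c, not the verbatim code; the real
function retries in some cases where this sketch just fails):

struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
					  unsigned long address)
{
	MA_STATE(mas, &mm->mm_mt, address, address);
	struct vm_area_struct *vma;

	rcu_read_lock();
	vma = mas_walk(&mas);
	if (!vma)
		goto inval;

	/* Fails if the vma is, or becomes, write-locked. */
	if (!vma_start_read(vma))
		goto inval;

	/*
	 * Re-check after taking the read lock: the vma may have been
	 * detached or resized between mas_walk() and vma_start_read().
	 */
	if (vma->detached ||
	    address < vma->vm_start || address >= vma->vm_end) {
		vma_end_read(vma);
		goto inval;
	}

	rcu_read_unlock();
	return vma;
inval:
	rcu_read_unlock();
	return NULL;
}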