Message-ID: <CAJuCfpEV_-=19KWqXvYO4-VgUDwgpkT2xDC4zTZ-XS4iaSH=Qw@mail.gmail.com>
Date: Tue, 17 Jan 2023 14:33:08 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Michal Hocko <mhocko@...e.com>, akpm@...ux-foundation.org,
michel@...pinasse.org, jglisse@...gle.com, vbabka@...e.cz,
hannes@...xchg.org, mgorman@...hsingularity.net, dave@...olabs.net,
liam.howlett@...cle.com, peterz@...radead.org,
ldufour@...ux.ibm.com, laurent.dufour@...ibm.com,
paulmck@...nel.org, luto@...nel.org, songliubraving@...com,
peterx@...hat.com, david@...hat.com, dhowells@...hat.com,
hughd@...gle.com, bigeasy@...utronix.de, kent.overstreet@...ux.dev,
punit.agrawal@...edance.com, lstoakes@...il.com,
peterjung1337@...il.com, rientjes@...gle.com,
axelrasmussen@...gle.com, joelaf@...gle.com, minchan@...gle.com,
jannh@...gle.com, shakeelb@...gle.com, tatashin@...gle.com,
edumazet@...gle.com, gthelen@...gle.com, gurua@...gle.com,
arjunroy@...gle.com, soheil@...gle.com, hughlynch@...gle.com,
leewalsh@...gle.com, posk@...gle.com, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org, x86@...nel.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH 12/41] mm: add per-VMA lock and helper functions to
control it
On Tue, Jan 17, 2023 at 1:54 PM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Tue, Jan 17, 2023 at 01:21:47PM -0800, Suren Baghdasaryan wrote:
> > On Tue, Jan 17, 2023 at 7:12 AM Michal Hocko <mhocko@...e.com> wrote:
> > >
> > > On Tue 17-01-23 16:04:26, Michal Hocko wrote:
> > > > On Mon 09-01-23 12:53:07, Suren Baghdasaryan wrote:
> > > > > Introduce a per-VMA rw_semaphore to be used during page fault handling
> > > > > instead of mmap_lock. Because there are cases when multiple VMAs need
> > > > > to be exclusively locked during VMA tree modifications, instead of the
> > > > > usual lock/unlock pattern we mark a VMA as locked by taking per-VMA lock
> > > > > exclusively and setting vma->lock_seq to the current mm->lock_seq. When
> > > > > mmap_write_lock holder is done with all modifications and drops mmap_lock,
> > > > > it will increment mm->lock_seq, effectively unlocking all VMAs marked as
> > > > > locked.
> > > >
> > > > I have to say I was struggling a bit with the above and only understood
> > > > what you mean by reading the patch several times. I would phrase it like
> > > > this (feel free to use if you consider this to be an improvement).
> > > >
> > > > Introduce a per-VMA rw_semaphore. The lock implementation relies on a
> > > > per-vma and a per-mm sequence counter to note exclusive locking:
> > > > - read lock - (implemented by vma_read_trylock) requires the
> > > > vma (vm_lock_seq) and mm (mm_lock_seq) sequence counters to
> > > > differ. If they match then there must be a vma exclusive lock
> > > > held somewhere.
> > > > - read unlock - (implemented by vma_read_unlock) is a trivial
> > > > vma->lock unlock.
> > > > - write lock - (vma_write_lock) requires the mmap_lock to be
> > > > held exclusively and the current mm counter is noted to the vma
> > > > side. This will allow multiple vmas to be locked under a single
> > > > mmap_lock write lock (e.g. during vma merging). The vma counter
> > > > is modified under exclusive vma lock.
> > >
> > > Didn't realize one more thing.
> > > Unlike a standard write lock, this implementation can be
> > > called multiple times under a single mmap_lock. In a sense
> > > it is more of a mark_vma_potentially_modified than a lock.
> >
> > In the RFC it was called vma_mark_locked() originally and renames were
> > discussed in the email thread ending here:
> > https://lore.kernel.org/all/621612d7-c537-3971-9520-a3dec7b43cb4@suse.cz/.
> > If other names are preferable I'm open to changing them.
>
> I don't want to bikeshed this, but rather than locking it seems to be
> more:
>
> vma_start_read()
> vma_end_read()
> vma_start_write()
> vma_end_write()
> vma_downgrade_write()
A couple of corrections: we would need vma_start_tryread() and
vma_end_write_all(). Also, there is no vma_downgrade_write();
mmap_write_downgrade() simply does vma_end_write_all().
>
> ... and that these are _implemented_ with locks (in part) is an
> implementation detail?
>
> Would that reduce people's confusion?
>
> > >
> > > > - write unlock - (vma_write_unlock_mm) is a batch release of all
> > > > vma locks held. It doesn't pair with a specific
> > > > vma_write_lock! It is done before exclusive mmap_lock is
> > > > released by incrementing mm sequence counter (mm_lock_seq).
> > > > - write downgrade - if the mmap_lock is downgraded to the read
> > > > lock all vma write locks are released as well (effectively
> > > > the same as write unlock).
> > > --
> > > Michal Hocko
> > > SUSE Labs