Message-ID: <20241217102620.GC11133@noisy.programming.kicks-ass.net>
Date: Tue, 17 Dec 2024 11:26:20 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: akpm@...ux-foundation.org, willy@...radead.org, liam.howlett@...cle.com,
lorenzo.stoakes@...cle.com, mhocko@...e.com, vbabka@...e.cz,
hannes@...xchg.org, mjguzik@...il.com, oliver.sang@...el.com,
mgorman@...hsingularity.net, david@...hat.com, peterx@...hat.com,
oleg@...hat.com, dave@...olabs.net, paulmck@...nel.org,
brauner@...nel.org, dhowells@...hat.com, hdanton@...a.com,
hughd@...gle.com, lokeshgidra@...gle.com, minchan@...gle.com,
jannh@...gle.com, shakeel.butt@...ux.dev, souravpanda@...gle.com,
pasha.tatashin@...een.com, klarasmodin@...il.com, corbet@....net,
linux-doc@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v6 13/16] mm: introduce vma_ensure_detached()

On Mon, Dec 16, 2024 at 11:24:16AM -0800, Suren Baghdasaryan wrote:
> vma_start_read() can temporarily raise vm_refcnt of a write-locked and
> detached vma:
>
> // vm_refcnt==1 (attached)
> vma_start_write()
>     vma->vm_lock_seq = mm->mm_lock_seq
>
>                     vma_start_read()
>                         vm_refcnt++; // vm_refcnt==2
>
> vma_mark_detached()
>     vm_refcnt--; // vm_refcnt==1
>
> // vma is detached but vm_refcnt!=0 temporarily
>
>                     if (vma->vm_lock_seq == mm->mm_lock_seq)
>                         vma_refcount_put()
>                             vm_refcnt--; // vm_refcnt==0
>
> This is currently not a problem when freeing the vma because RCU grace
> period should pass before kmem_cache_free(vma) gets called and by that
> time vma_start_read() should be done and vm_refcnt is 0. However once
> we introduce possibility of vma reuse before RCU grace period is over,
> this will become a problem (reused vma might be in non-detached state).
> Introduce vma_ensure_detached() for the writer to wait for readers until
> they exit vma_start_read().

So aside from the lockdep problem (which I think is fixable), the normal
way to fix the above is to make dec_and_test() do the kmem_cache_free().
Then the last user does the free and everything just works.
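
For illustration, the "last user frees" pattern described above could be
sketched in userspace C11 roughly as follows. This is not the kernel code:
struct obj, obj_alloc(), obj_tryget(), obj_put() and obj_frees are all
hypothetical names, malloc()/free() stand in for the slab cache, and the
sketch assumes the caller never touches the object after its own put. (In
the kernel, a reader racing with the final put would additionally rely on
RCU or a type-safe slab to make the tryget safe.)

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as a vma. */
struct obj {
	atomic_int refcnt;	/* 1 == attached, no transient readers */
	int payload;
};

/* Counts frees so the behaviour can be observed; illustration only. */
static atomic_int obj_frees;

static struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->refcnt, 1);
	o->payload = 0;
	return o;
}

/* Reader side: take a reference unless the count has already hit zero. */
static bool obj_tryget(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Drop one reference. Whoever moves the count from 1 to 0 does the free,
 * so neither the writer nor a transient reader has to wait for the other.
 * Returns true if this call freed the object.
 */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);	/* catch refcount underflow */
	if (old == 1) {
		free(o);
		atomic_fetch_add(&obj_frees, 1);
		return true;
	}
	return false;
}
```

Replaying the interleaving from the quoted changelog: the reader's
obj_tryget() raises the count to 2, the writer's obj_put() (the detach)
drops it to 1 without freeing, and the reader's obj_put() takes it to 0
and frees. If the reader had finished first, the writer's put would have
been the one to free instead.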