Message-ID: <CAJuCfpE=W4RUwj7yosa3wWzi=EbLdts=1VHV1f-Wy04ZAc9UDw@mail.gmail.com>
Date: Wed, 8 Jan 2025 11:17:13 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: akpm@...ux-foundation.org, peterz@...radead.org, willy@...radead.org, 
	liam.howlett@...cle.com, lorenzo.stoakes@...cle.com, mhocko@...e.com, 
	hannes@...xchg.org, mjguzik@...il.com, oliver.sang@...el.com, 
	mgorman@...hsingularity.net, david@...hat.com, peterx@...hat.com, 
	oleg@...hat.com, dave@...olabs.net, paulmck@...nel.org, brauner@...nel.org, 
	dhowells@...hat.com, hdanton@...a.com, hughd@...gle.com, 
	lokeshgidra@...gle.com, minchan@...gle.com, jannh@...gle.com, 
	shakeel.butt@...ux.dev, souravpanda@...gle.com, pasha.tatashin@...een.com, 
	klarasmodin@...il.com, corbet@....net, linux-doc@...r.kernel.org, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel-team@...roid.com
Subject: Re: [PATCH v7 16/17] mm: make vma cache SLAB_TYPESAFE_BY_RCU

On Wed, Jan 8, 2025 at 11:00 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 1/8/25 19:44, Suren Baghdasaryan wrote:
> > On Wed, Jan 8, 2025 at 10:21 AM Vlastimil Babka <vbabka@...e.cz> wrote:
> >>
> >> On 12/26/24 18:07, Suren Baghdasaryan wrote:
> >> > To enable SLAB_TYPESAFE_BY_RCU for vma cache we need to ensure that
> >> > object reuse before RCU grace period is over will be detected by
> >> > lock_vma_under_rcu(). Current checks are sufficient as long as vma
> >> > is detached before it is freed. Implement this guarantee by calling
> >> > vma_ensure_detached() before vma is freed and make vm_area_cachep
> >> > SLAB_TYPESAFE_BY_RCU. This will facilitate vm_area_struct reuse and
> >> > will minimize the number of call_rcu() calls.
> >> >
> >> > Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
> >>
> >> I've noticed vm_area_dup() went back to the approach of "we memcpy
> >> everything including vma_lock and detached (now the vm_refcnt)" followed by a
> >> vma_init_lock(..., true) that does refcount_set(&vma->vm_refcnt, 0);
> >> Is that now safe against a racing lock_vma_under_rcu()? I think it's not?
> >
> > I think it's safe because vma created by vm_area_dup() is not in the
> > vma tree yet, so lock_vma_under_rcu() does not see it until it's added
> > into the tree. Note also that at the time when the new vma gets added
> > into the tree, the vma has to be write-locked
> > (vma_iter_store()->vma_mark_attached()->vma_assert_write_locked()).
> > So, lock_vma_under_rcu() won't use the new vma even after it's added
> > into the tree until we unlock the vma.
>
>
> What about something like this, where vma starts out as attached and thus
> reachable:

Huh, very clever sequence.

>
> A:                      B:      C:
> lock_vma_under_rcu()
>   vma = mas_walk()
>   vma_start_read()
>     vm_lock_seq == mm->mm_lock_seq.sequence
>
>                         vma_start_write
>                         vma detached and freed
>
>                                 vm_area_dup()
>                                 - vma reallocated
>                                 - memcpy() copies non-zero refcnt from orig
>
>     __refcount_inc_not_zero_limited() succeeds
>
>                                 vma_init_lock();
>                                 refcount_set(&vma->vm_refcnt, 0);
>
>     - vm_lock_seq validation fails (could it even succeed?)

It can succeed if task C drops the vma write-lock before A validates
vm_lock_seq.

>     vma_refcount_put(vma);
>       __refcount_dec_and_test makes refcount -1

Yeah, I guess I will have to keep vm_refcnt at 0 across reuse, so
memcpy() in vm_area_dup() should be replaced. I'll make the changes.
Thanks for analyzing this, Vlastimil!
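
For anyone following along, here is a rough userspace sketch of the sequence above, replayed deterministically in a single thread. It is only a toy model: struct toy_vma and all toy_*/replay_* helpers are made-up names, a C11 atomic_int stands in for vm_refcnt, and the seqcount validation is left out. As far as I know the real refcount_t would warn and saturate on underflow rather than silently reach -1, so treat the -1 here as an illustration of the imbalance, not of the exact kernel behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for vm_area_struct; only the refcount matters. */
struct toy_vma {
	unsigned long start, end;
	atomic_int refcnt;	/* 0 = detached, >0 = attached (plus readers) */
};

/* Reader (task A): take a reference only if the object looks attached. */
static bool toy_get_not_zero(struct toy_vma *v)
{
	int old = atomic_load(&v->refcnt);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&v->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Reader (task A): drop the reference and return the new count. */
static int toy_put(struct toy_vma *v)
{
	return atomic_fetch_sub(&v->refcnt, 1) - 1;
}

/* Racy dup (task C): memcpy() clobbers the refcount, a later step zeroes it. */
static int replay_racy(void)
{
	struct toy_vma orig = { .start = 0x1000, .end = 0x2000 };
	struct toy_vma slot = { 0 };	/* freed, detached, about to be reused */
	bool got;

	atomic_init(&orig.refcnt, 1);
	atomic_init(&slot.refcnt, 0);

	memcpy(&slot, &orig, sizeof(slot));	/* C: copies non-zero refcnt */
	got = toy_get_not_zero(&slot);		/* A: succeeds, refcnt == 2 */
	atomic_store(&slot.refcnt, 0);		/* C: reset to 0 after the copy */
	assert(got);
	return toy_put(&slot);			/* A: 0 - 1, refcount underflows */
}

/* Dup that copies the payload only, so refcnt stays 0 across reuse. */
static int replay_safe(void)
{
	struct toy_vma orig = { .start = 0x1000, .end = 0x2000 };
	struct toy_vma slot = { 0 };
	bool got;

	atomic_init(&orig.refcnt, 1);
	atomic_init(&slot.refcnt, 0);

	slot.start = orig.start;		/* C: field-wise copy, refcnt untouched */
	slot.end = orig.end;
	got = toy_get_not_zero(&slot);		/* A: sees 0 and backs off */
	assert(!got);
	return atomic_load(&slot.refcnt);	/* still 0, nothing to put */
}
```

In this model replay_racy() ends with the count at -1 while replay_safe() leaves it at 0, which is the reason for never letting the copy write a non-zero value into vm_refcnt of a reused object.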

>
>
>
