linux-kernel - Re: [PATCH RFC v2 00/10] SLUB percpu sheaves

Message-ID: <CAJuCfpHHXYKGjaOxHcuJcuQbUVO7YqLMpcYeF3HM5Ayxy1fE+g@mail.gmail.com>
Date: Mon, 24 Feb 2025 13:12:35 -0800
From: Suren Baghdasaryan <surenb@...gle.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Christoph Lameter <cl@...ux.com>, David Rientjes <rientjes@...gle.com>, 
	Roman Gushchin <roman.gushchin@...ux.dev>, Hyeonggon Yoo <42.hyeyoo@...il.com>, 
	Uladzislau Rezki <urezki@...il.com>, linux-mm@...ck.org, linux-kernel@...r.kernel.org, 
	rcu@...r.kernel.org, maple-tree@...ts.infradead.org, 
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Alexei Starovoitov <ast@...nel.org>
Subject: Re: [PATCH RFC v2 00/10] SLUB percpu sheaves

On Mon, Feb 24, 2025 at 12:53 PM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 2/24/25 02:36, Suren Baghdasaryan wrote:
> > On Sat, Feb 22, 2025 at 8:44 PM Suren Baghdasaryan <surenb@...gle.com> wrote:
> >>
> >> Don't know about this particular part but testing sheaves with maple
> >> node cache and stress testing mmap/munmap syscalls shows performance
> >> benefits as long as there is some delay to let kfree_rcu() do its job.
> >> I'm still gathering results and will most likely post them tomorrow.
>
> Without such a delay, is the perf the same or worse?

The perf is about the same if there is no delay.

>
> > Here are the promised test results:
> >
> > First I ran an Android app cycle test comparing the baseline against sheaves
> > used for maple tree nodes (as this patchset implements). I registered about
> > 3% improvement in app launch times, indicating improvement in mmap syscall
> > performance.
>
> There was no artificial 500us delay added for this test, right?

Correct. No artificial changes in this test.

>
> > Next I ran an mmap stress test which maps 5 1-page readable file-backed
> > areas, faults them in and finally unmaps them, timing the mmap syscalls.
> > It repeats that for 200000 cycles and reports the total time. The average
> > of 10 such runs is used as the final result.
> > 3 configurations were tested:
> >
> > 1. Sheaves used for maple tree nodes only (this patchset).
> >
> > 2. Sheaves used for maple tree nodes with vm_lock to vm_refcnt conversion [1].
> > The conversion patchset [1] avoids allocating an additional vm_lock structure
> > on each mmap syscall and uses TYPESAFE_BY_RCU for the vm_area_struct cache.
> >
> > 3. Sheaves used for maple tree nodes and for vm_area_struct cache with vm_lock
> > to vm_refcnt conversion [1]. For the vm_area_struct cache I had to replace
> > TYPESAFE_BY_RCU with sheaves, as we can't use both for the same cache.
>
> Hm, why can't we use both? I don't think any kmem_cache_create check makes
> them exclusive? TYPESAFE_BY_RCU only affects how slab pages are freed, it
> doesn't e.g. delay reuse of individual objects, and caching in a sheaf
> doesn't write to the object. Am I missing something?

Ah, I was under the impression that to use sheaves I would have to ensure
the freeing happens via the kfree_rcu()->kfree_rcu_sheaf() path, but now
that you mention it, I guess I could keep using kmem_cache_free()
and that would use free_to_pcs() internally... When the time comes to free
the page, TYPESAFE_BY_RCU will free it after the grace period.
I can try that combination as well and see if anything breaks.

>
> > The values represent the total time it took to perform the mmap syscalls;
> > lower is better.
> >
> > (1)            baseline    control
> > Little core    7.58327     6.614939 (-12.77%)
> > Medium core    2.125315    1.428702 (-32.78%)
> > Big core       0.514673    0.422948 (-17.82%)
> >
> > (2)            baseline    control
> > Little core    7.58327     5.141478 (-32.20%)
> > Medium core    2.125315    0.427692 (-79.88%)
> > Big core       0.514673    0.046642 (-90.94%)
> >
> > (3)            baseline    control
> > Little core    7.58327     4.779624 (-36.97%)
> > Medium core    2.125315    0.450368 (-78.81%)
> > Big core       0.514673    0.037776 (-92.66%)
> >
> > Results in (3) vs (2) indicate that using sheaves for vm_area_struct
> > yields slightly better averages and I noticed that this was mostly due
> > to sheaves results missing occasional spikes that worsened
> > TYPESAFE_BY_RCU averages (the results seemed more stable with
> > sheaves).
>
> Thanks a lot, that looks promising!

Indeed, that looks better than I expected :)
Cheers!
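
P.S. For anyone who wants to reproduce something similar, here is a rough
userspace sketch of the kind of mmap stress loop described above. It is not
the actual test program: the function names (run_cycles, elapsed_sec,
pct_change), the use of MAP_PRIVATE, mapping the same file offset five
times, and timing with CLOCK_MONOTONIC are all assumptions made for
illustration. The backing file must be non-empty, otherwise faulting the
page in raises SIGBUS.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define NR_AREAS 5 /* five 1-page file-backed areas per cycle */

/* Seconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed_sec(const struct timespec *a, const struct timespec *b)
{
	return (double)(b->tv_sec - a->tv_sec) +
	       (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Relative change in percent, like the "(-12.77%)" column in the tables. */
static double pct_change(double baseline, double value)
{
	return (value - baseline) / baseline * 100.0;
}

/*
 * Hypothetical helper: for each cycle, map NR_AREAS read-only pages of
 * `path`, fault them in, then unmap them. Only the time spent inside
 * mmap() is accumulated. Returns total seconds, or -1.0 on error.
 */
static double run_cycles(const char *path, long cycles)
{
	long page = sysconf(_SC_PAGESIZE);
	int fd = open(path, O_RDONLY);
	double total = 0.0;

	if (fd < 0 || page <= 0) {
		if (fd >= 0)
			close(fd);
		return -1.0;
	}

	for (long c = 0; c < cycles; c++) {
		void *area[NR_AREAS];

		for (int i = 0; i < NR_AREAS; i++) {
			struct timespec t0, t1;

			clock_gettime(CLOCK_MONOTONIC, &t0);
			area[i] = mmap(NULL, (size_t)page, PROT_READ,
				       MAP_PRIVATE, fd, 0);
			clock_gettime(CLOCK_MONOTONIC, &t1);

			if (area[i] == MAP_FAILED) {
				/* Undo the mappings made so far in this cycle. */
				while (i-- > 0)
					munmap(area[i], (size_t)page);
				close(fd);
				return -1.0;
			}
			total += elapsed_sec(&t0, &t1);
		}

		/* Touch one byte per area to fault the page in. */
		for (int i = 0; i < NR_AREAS; i++)
			(void)*(volatile const char *)area[i];

		for (int i = 0; i < NR_AREAS; i++)
			munmap(area[i], (size_t)page);
	}

	close(fd);
	return total;
}
```

Calling run_cycles() with 200000 cycles, averaging 10 such runs, and feeding
the baseline and patched averages to pct_change() would give numbers in the
same shape as the tables above. The clock_gettime() overhead is included in
each sample, so absolute values from this sketch will not be comparable
with the ones I posted.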

>
> > [1] https://lore.kernel.org/all/20250213224655.1680278-1-surenb@google.com/
> >
>