Message-ID: <0713a015-b8dc-49db-a329-30891a10378c@linux.ibm.com>
Date: Tue, 18 Mar 2025 18:54:22 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: André Almeida <andrealmeid@...lia.com>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v10 00/21] futex: Add support task local hash maps,
FUTEX2_NUMA and FUTEX2_MPOL
On 3/12/25 20:46, Sebastian Andrzej Siewior wrote:
> Hi,
>
> this is a follow up on
> https://lore.kernel.org/ZwVOMgBMxrw7BU9A@jlelli-thinkpadt14gen4.remote.csb
>
> and adds support for task local futex_hash_bucket.
>
> This is the local hash map series based on v9 extended with PeterZ
> FUTEX2_NUMA and FUTEX2_MPOL plus a few fixes on top.
>
> The complete tree is at
> https://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git/log/?h=futex_local_v10
> https://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git futex_local_v10
>
Hi Sebastian. Thanks for working on this (and for bringing back FUTEX2_NUMA), which
might help large systems with many futexes.
I tried this on one of our systems (single NUMA node, 80 CPUs) and I see a significant
drop in the futex/hash numbers.
Maybe I am missing some config option or doing something wrong w.r.t. benchmarking.
I am still trying to understand this code.
I ran "perf bench futex all" as is. No changes were made to perf.
=========================================
Without patch: at 6575d1b4a6ef3336608127c704b612bc5e7b0fdc
# Running futex/hash benchmark...
Run summary [PID 45758]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
Averaged 1556023 operations/sec (+- 0.08%), total secs = 10 <<--- 1.5M
=========================================
With the Series: I had to make PR_FUTEX_HASH=78 since 77 is used for TIMERs.
# Running futex/hash benchmark...
Run summary [PID 8644]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
Averaged 150382 operations/sec (+- 0.42%), total secs = 10 <<-- 0.15M, close to 10x down.
=========================================
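For reference, the size of the drop can be worked out from the two averages above.
A minimal sketch in Python; the helper name slowdown() is only for illustration and
is not part of perf:

```python
def slowdown(baseline_ops: float, patched_ops: float) -> tuple[float, float]:
    """Return (slowdown factor, percentage drop) of patched vs. baseline."""
    factor = baseline_ops / patched_ops
    drop_pct = (1.0 - patched_ops / baseline_ops) * 100.0
    return factor, drop_pct


# futex/hash averages (operations/sec) from the two runs quoted above.
BASELINE_OPS = 1556023  # without the series
PATCHED_OPS = 150382    # with the series

factor, drop_pct = slowdown(BASELINE_OPS, PATCHED_OPS)
print(f"slowdown: {factor:.2f}x, drop: {drop_pct:.1f}%")
```

This only compares the two single-run averages; it does not account for the
reported +- variance of each run.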
I did a git bisect based on the futex/hash numbers. It narrowed the regression down to this commit:
first bad commit: [5dc017a816766be47ffabe97b7e5f75919756e5c] futex: Allow automatic allocation of process wide futex hash.
Is this expected, given the complexity of the hash function change?
Also, is there a benchmark that could be run to evaluate FUTEX2_NUMA? I would like to
try it on a multi-NUMA system to see the benefit.