Message-ID: <20241029150642.2tDTBvuF@linutronix.de>
Date: Tue, 29 Oct 2024 16:06:42 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: linux-kernel@...r.kernel.org,
	André Almeida <andrealmeid@...lia.com>,
	Darren Hart <dvhart@...radead.org>,
	Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Valentin Schneider <vschneid@...hat.com>,
	Waiman Long <longman@...hat.com>
Subject: Re: [RFC v2 PATCH 0/4] futex: Add support task local hash maps.

On 2024-10-29 12:10:25 [+0100], Juri Lelli wrote:
> Hi Sebastian,
Hi Juri,

> > I've been watching how this auto-create behaves and so far dpkg creates
> > threads and uses the local hashmap. systemd-journal on the other hand
> > forks a thread from time to time and I haven't seen it using the
> > hashmap. Need to do more testing.
> 
> I ported it to one of our kernels with the intent of asking perf folks
> to have a go at it (after some manual smoke testing maybe). It will
> take a couple of weeks or so to get numbers back.

Thanks.

> Do you need specific additional info to possibly be collected while
> running? I saw your reply about usage. If you want to agree on what to
> collect feel free to send out the debug patch I guess you used for that.

If you run specific locking test cases, you could try setting the number
of slots upfront (instead of relying on the default of 4) and see how this
affects performance. There is also a cap at 16; you might want to raise it
to 1024, try some higher slot counts, and see how that affects performance.
The prctl() interface should make it easy to set/get the values.
The default of 4 might be too conservative.
That would give an idea of what a sane default value and upper limit might be.

The hunk attached (against the to-be-posted v3) adds counters to see how
many auto-allocated hash maps were used vs not used. In my tests the number
of unused hash buckets was very small, so I don't think it matters.

> Best,
> Juri

Sebastian

---------------------->8---------------------

diff --git a/include/linux/sched/signal.h b/include/linux/sched/signal.h
index 3b8c8975cd493..aa2a0d059b1a8 100644
--- a/include/linux/sched/signal.h
+++ b/include/linux/sched/signal.h
@@ -248,6 +248,7 @@ struct signal_struct {
 						 * and may have inconsistent
 						 * permissions.
 						 */
+	unsigned int			futex_hash_used;
 	unsigned int			futex_hash_mask;
 	struct futex_hash_bucket	*futex_hash_bucket;
 } __randomize_layout;
diff --git a/kernel/fork.c b/kernel/fork.c
index e792a43934363..341331778032a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -945,10 +945,19 @@ static void mmdrop_async(struct mm_struct *mm)
 	}
 }
 
+extern atomic64_t futex_hash_stats_used;
+extern atomic64_t futex_hash_stats_unused;
+
 static inline void free_signal_struct(struct signal_struct *sig)
 {
 	taskstats_tgid_free(sig);
 	sched_autogroup_exit(sig);
+	if (sig->futex_hash_bucket) {
+		if (sig->futex_hash_used)
+			atomic64_inc(&futex_hash_stats_used);
+		else
+			atomic64_inc(&futex_hash_stats_unused);
+	}
 	kfree(sig->futex_hash_bucket);
 	/*
 	 * __mmdrop is not safe to call from softirq context on x86 due to
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index b48abf2e97c25..04a597736cb00 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -40,6 +40,8 @@
 #include <linux/fault-inject.h>
 #include <linux/slab.h>
 #include <linux/prctl.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
 
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
@@ -132,8 +133,10 @@ struct futex_hash_bucket *futex_hash(union futex_key *key)
 			  key->both.offset);
 
 	fhb = current->signal->futex_hash_bucket;
-	if (fhb && futex_key_is_private(key))
+	if (fhb && futex_key_is_private(key)) {
+		current->signal->futex_hash_used = 1;
 		return &fhb[hash & current->signal->futex_hash_mask];
+	}
 
 	return &futex_queues[hash & (futex_hashsize - 1)];
 }
@@ -1202,8 +1205,13 @@ static int futex_hash_allocate(unsigned int hash_slots)
 	return 0;
 }
 
+atomic64_t futex_hash_stats_used;
+atomic64_t futex_hash_stats_unused;
+atomic64_t futex_hash_stats_auto_create;
+
 int futex_hash_allocate_default(void)
 {
+	atomic64_inc(&futex_hash_stats_auto_create);
 	return futex_hash_allocate(0);
 }
 
@@ -1235,6 +1243,19 @@ int futex_hash_prctl(unsigned long arg2, unsigned long arg3,
 	return ret;
 }
 
+static int proc_show_futex_stats(struct seq_file *seq, void *offset)
+{
+	long fh_used, fh_unused, fh_auto_create;
+
+	fh_used = atomic64_read(&futex_hash_stats_used);
+	fh_unused = atomic64_read(&futex_hash_stats_unused);
+	fh_auto_create = atomic64_read(&futex_hash_stats_auto_create);
+
+	seq_printf(seq, "used: %ld unused: %ld auto: %ld\n",
+		   fh_used, fh_unused, fh_auto_create);
+	return 0;
+}
+
 static int __init futex_init(void)
 {
 	unsigned int futex_shift;
@@ -1255,6 +1276,7 @@ static int __init futex_init(void)
 	for (i = 0; i < futex_hashsize; i++)
 		futex_hash_bucket_init(&futex_queues[i]);
 
+	proc_create_single("futex_stats", 0, NULL, proc_show_futex_stats);
 	return 0;
 }
 core_initcall(futex_init);
