Message-ID: <CAGudoHFTikYFRoJu2mRhNFv6GHPP4LNEDetMdsqkzAg1nTJfRA@mail.gmail.com>
Date: Fri, 28 Nov 2025 11:11:46 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: kernel test robot <oliver.sang@...el.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, oe-lkp@...ts.linux.dev, lkp@...el.com,
linux-kernel@...r.kernel.org, Borislav Petkov <bp@...en8.de>,
Sean Christopherson <seanjc@...gle.com>, Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [linus:master] [x86] 284922f4c5: stress-ng.sockfd.ops_per_sec
6.1% improvement
On Fri, Nov 28, 2025 at 7:30 AM kernel test robot <oliver.sang@...el.com> wrote:
>
>
>
> Hello,
>
> kernel test robot noticed a 6.1% improvement of stress-ng.sockfd.ops_per_sec on:
>
>
> commit: 284922f4c563aa3a8558a00f2a05722133237fe8 ("x86: uaccess: don't use runtime-const rewriting in modules")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
>
> testcase: stress-ng
> config: x86_64-rhel-9.4
> compiler: gcc-14
> test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
> parameters:
>
> nr_threads: 100%
> testtime: 60s
> test: sockfd
> cpufreq_governor: performance
>
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20251128/202511281306.51105b46-lkp@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
> gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-spr-r02/sockfd/stress-ng/60s
>
> commit:
> 17d85f33a8 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma")
> 284922f4c5 ("x86: uaccess: don't use runtime-const rewriting in modules")
>
> 17d85f33a83b84e7 284922f4c563aa3a8558a00f2a0
> ---------------- ---------------------------
> %stddev %change %stddev
> \ | \
> 55674763 +6.1% 59075135 stress-ng.sockfd.ops
> 927326 +6.1% 983845 stress-ng.sockfd.ops_per_sec
> 3555 ± 3% +10.6% 3932 ± 3% perf-c2c.DRAM.remote
> 4834 ± 3% +12.0% 5415 ± 3% perf-c2c.HITM.local
> 2714 ± 2% +12.5% 3054 ± 3% perf-c2c.HITM.remote
> 0.51 +3.9% 0.53 perf-stat.i.MPKI
> 34903541 +5.2% 36715161 perf-stat.i.cache-misses
> 1.072e+08 +5.8% 1.133e+08 perf-stat.i.cache-references
> 18971 -5.5% 17932 perf-stat.i.cycles-between-cache-misses
> 0.46 ± 30% +13.6% 0.52 perf-stat.overall.MPKI
> 31330827 ± 30% +14.9% 36004895 perf-stat.ps.cache-misses
> 96530576 ± 30% +15.3% 1.113e+08 perf-stat.ps.cache-references
> 48.32 -0.2 48.16 perf-profile.calltrace.cycles-pp._raw_spin_lock.unix_del_edges.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg
> 48.23 -0.2 48.07 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.unix_del_edges.unix_stream_read_generic.unix_stream_recvmsg
> 48.34 -0.2 48.18 perf-profile.calltrace.cycles-pp.unix_del_edges.unix_stream_read_generic.unix_stream_recvmsg.sock_recvmsg.____sys_recvmsg
> 0.56 ± 4% +0.1 0.65 ± 9% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
> 0.62 ± 3% +0.1 0.71 ± 8% perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.stress_sockfd
> 0.56 ± 3% +0.1 0.65 ± 8% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 48.34 -0.2 48.18 perf-profile.children.cycles-pp.unix_del_edges
> 0.15 ± 3% +0.0 0.17 ± 2% perf-profile.children.cycles-pp.__scm_recv_common
> 0.08 ± 7% +0.0 0.10 ± 7% perf-profile.children.cycles-pp.lockref_put_return
> 0.09 ± 5% +0.0 0.11 ± 6% perf-profile.children.cycles-pp.__fput
> 0.35 ± 5% +0.1 0.43 ± 12% perf-profile.children.cycles-pp.do_open
> 0.63 ± 3% +0.1 0.72 ± 8% perf-profile.children.cycles-pp.do_sys_openat2
> 0.56 ± 3% +0.1 0.65 ± 8% perf-profile.children.cycles-pp.do_filp_open
>
While this may look suspicious, since the change is supposed to be a nop
for the core kernel, it in fact is not, as it adds:
/* Used for modules: built-in code uses runtime constants */
+unsigned long USER_PTR_MAX;
+EXPORT_SYMBOL(USER_PTR_MAX);
This should probably be __ro_after_init.
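For reference, a minimal sketch of the suggested annotation (same lines
as quoted above, only the attribute added; __ro_after_init puts the
variable into .data..ro_after_init, which is made read-only once boot
finishes):

/* Used for modules: built-in code uses runtime constants */
unsigned long USER_PTR_MAX __ro_after_init;	/* written once at boot, read-only afterwards */
EXPORT_SYMBOL(USER_PTR_MAX);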
The test at hand is heavily bottlenecked on the global lock in the
garbage collector, which carries no alignment annotation whatsoever.
On my kernel I see this (nm vmlinux | sort -nk 1):
ffffffff846c0a20 b bsd_socket_locks
ffffffff846c0e20 b bsd_socket_buckets
ffffffff846c1620 b unix_nr_socks
ffffffff846c1628 b gc_in_progress
ffffffff846c1630 b unix_graph_cyclic_sccs
ffffffff846c1638 b unix_gc_lock <--- THE LOCK
ffffffff846c1640 b unix_vertex_unvisited_index
ffffffff846c1648 b unix_graph_state
ffffffff846c1660 b unix_stream_bpf_prot
ffffffff846c1820 b unix_stream_prot_lock
ffffffff846c1840 b unix_dgram_bpf_prot
ffffffff846c1a00 b unix_dgram_prot_lock
Note how bsd_socket_buckets looks suspicious in its own right, but
ignoring that bit, I'm guessing the commit pushed these symbols around
and changed some of the cacheline bouncing.
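To make the bouncing concrete, here is a quick userspace sketch of the
cache line math, assuming the usual 64-byte line size: masking off the
low 6 bits of each address from the dump above shows that unix_nr_socks,
gc_in_progress and unix_graph_cyclic_sccs all land on the same line as
unix_gc_lock.

#include <stdio.h>

/* Addresses copied from the nm dump above; 64-byte cache lines assumed. */
static const struct { unsigned long addr; const char *name; } syms[] = {
	{ 0xffffffff846c1620UL, "unix_nr_socks" },
	{ 0xffffffff846c1628UL, "gc_in_progress" },
	{ 0xffffffff846c1630UL, "unix_graph_cyclic_sccs" },
	{ 0xffffffff846c1638UL, "unix_gc_lock" },
	{ 0xffffffff846c1640UL, "unix_vertex_unvisited_index" },
	{ 0xffffffff846c1648UL, "unix_graph_state" },
};

int main(void)
{
	unsigned long lock_line = 0xffffffff846c1638UL & ~63UL;
	unsigned int i;

	for (i = 0; i < sizeof(syms) / sizeof(syms[0]); i++)
		if ((syms[i].addr & ~63UL) == lock_line)
			printf("%s shares the lock's cache line\n", syms[i].name);
	return 0;
}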
While a full fix is beyond the scope of this patch(tm), perhaps the
annotation below will stabilize it against random breakage. Can you
guys bench it?
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 78323d43e63e..25f65817faab 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -199,7 +199,7 @@ static void unix_free_vertices(struct scm_fp_list *fpl)
 	}
 }
 
-static DEFINE_SPINLOCK(unix_gc_lock);
+static __cacheline_aligned_in_smp DEFINE_SPINLOCK(unix_gc_lock);
 
 void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver)
 {