linux-kernel - Re: [PATCH v2] Reorder some fields in struct rq.

Message-ID: <affdc6b1-9980-44d1-89db-d90730c1e384@linux.ibm.com>
Date: Wed, 13 Aug 2025 13:00:30 +0530
From: Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
To: Blake Jones <blakejones@...gle.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>
Cc: Josh Don <joshdon@...gle.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
        Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
        linux-kernel@...r.kernel.org,
        Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
Subject: Re: [PATCH v2] Reorder some fields in struct rq.

Hi Blake,

On 31/07/25 02:26, Blake Jones wrote:
> This colocates some hot fields in "struct rq" to be on the same cache line
> as others that are often accessed at the same time or in similar ways.
> 

[..snip..]

> 
> This patch does not change the size of "struct rq" on machines with 64-byte
> cache lines. The additional "____cacheline_aligned" to put the runqueue
> lock on the next cache line will add an additional 64 bytes of padding on
> machines with 128-byte cache lines; although this is unfortunate, it seemed
> more likely to lead to stably good performance than e.g. by just putting
> the runqueue lock somewhere in the middle of the structure and hoping it
> wasn't on an otherwise busy cache line.

On Power11, which has 128-byte cache lines, this change introduces an 88-byte
hole because __lock now starts on a new cache line. As a result, struct rq
takes one more cache line than before.
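
For illustration, here is a minimal userspace sketch of how an aligned member
can leave a hole like this. It is not the real struct rq layout: the struct
names, the 40 bytes of leading fields, and the demo_cacheline_aligned macro
are all hypothetical, chosen only so that the arithmetic (128 - 40 = 88)
matches the hole size reported above. It assumes GCC or Clang for the aligned
attribute.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE 128	/* assumed cache line size for this sketch */
#define demo_cacheline_aligned __attribute__((aligned(DEMO_CACHELINE)))

/* Hypothetical layout without forced alignment of the lock. */
struct demo_rq_before {
	uint64_t hot[5];	/* 40 bytes of made-up hot fields */
	int lock;
	uint64_t tail;
};

/* Same fields, but the lock is pushed to the start of the next line. */
struct demo_rq_after {
	uint64_t hot[5];	/* still 40 bytes */
	int lock demo_cacheline_aligned;
	uint64_t tail;
};

/* Padding the compiler inserts between the hot fields and the lock. */
static size_t hole_before_lock(void)
{
	return offsetof(struct demo_rq_after, lock) -
	       sizeof(((struct demo_rq_after *)0)->hot);
}

/* Number of cache lines a structure of the given size occupies. */
static size_t cache_lines_used(size_t size)
{
	return (size + DEMO_CACHELINE - 1) / DEMO_CACHELINE;
}
```

Calling hole_before_lock() and comparing cache_lines_used() for the two
structs shows the trade-off in miniature: the lock gets a line boundary to
itself, at the cost of padding and one extra line. On a real kernel build,
pahole on the object file is the usual way to see the actual holes.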

Tested with your custom test case (thanks for sharing) and observed a ~5%
decrease in the number of cycles, along with a slight increase in user time.
Both are positive indicators.

Also ran ebizzy, which doesn’t seem to be impacted. I think it would be good
to run a set of standard benchmarks like schbench, ebizzy, hackbench, and
stress-ng, along with a real-life workload, to ensure there’s no negative
impact. I saw that hackbench was tried, but including those numbers would
be helpful.

Reviewed-by: Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
Tested-by: Madadi Vineeth Reddy <vineethr@...ux.ibm.com>

Thanks,
Madadi Vineeth Reddy

> 
> I ran "hackbench" to test this change, but it didn't show very conclusive
> results.  Looking at a profile of the hackbench run, it was spending 95% of
> its cycles inside __alloc_skb(), __kfree_skb(), or kmem_cache_free() -
> almost all of which was spent updating memcg counters or contending on the
> list_lock in kmem_cache_node. In contrast, it spent less than 0.5% of its
> cycles inside either schedule() or try_to_wake_up(). So it's not surprising
> that it didn't show useful results here.
> 

[..snip..]

> @@ -1182,8 +1199,6 @@ struct rq {
>  	struct root_domain		*rd;
>  	struct sched_domain __rcu	*sd;
>  
> -	unsigned long		cpu_capacity;
> -
>  	struct balance_callback *balance_callback;
>  
>  	unsigned char		nohz_idle_balance;