Message-ID: <CAMzpN2hbbMdB8bf9deRefvFoQ_iRjB1o9edkgFSZvcjRzsVgdQ@mail.gmail.com>
Date: Wed, 26 Feb 2025 20:29:10 -0500
From: Brian Gerst <brgerst@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Ingo Molnar <mingo@...nel.org>, "H . Peter Anvin" <hpa@...or.com>, Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, Ard Biesheuvel <ardb@...nel.org>, Uros Bizjak <ubizjak@...il.com>,
Linus Torvalds <torvalds@...uxfoundation.org>, Andy Lutomirski <luto@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 00/11] Add a percpu subsection for cache hot data
On Wed, Feb 26, 2025 at 3:23 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Wed, Feb 26, 2025 at 01:05:19PM -0500, Brian Gerst wrote:
> > Add a new percpu subsection for data that is frequently accessed and
> > exclusive to each processor. This replaces the pcpu_hot struct on x86,
> > and is available to all architectures and the core kernel.
> >
> > ffffffff842fa000 D __per_cpu_hot_start
> > ffffffff842fa000 D hardirq_stack_ptr
> > ffffffff842fa008 D __ref_stack_chk_guard
> > ffffffff842fa008 D __stack_chk_guard
> > ffffffff842fa010 D const_cpu_current_top_of_stack
> > ffffffff842fa010 D cpu_current_top_of_stack
> > ffffffff842fa018 D const_current_task
> > ffffffff842fa018 D current_task
> > ffffffff842fa020 D __x86_call_depth
> > ffffffff842fa028 D this_cpu_off
> > ffffffff842fa030 D __preempt_count
> > ffffffff842fa034 D cpu_number
> > ffffffff842fa038 D __softirq_pending
> > ffffffff842fa03a D hardirq_stack_inuse
> > ffffffff842fa040 D __per_cpu_hot_end
>
> The above is useful, but not quite as useful as looking at:
>
> $ pahole -C pcpu_hot defconfig-build/vmlinux.o
> struct pcpu_hot {
> union {
> struct {
> struct task_struct * current_task; /* 0 8 */
> int preempt_count; /* 8 4 */
> int cpu_number; /* 12 4 */
> u64 call_depth; /* 16 8 */
> long unsigned int top_of_stack; /* 24 8 */
> void * hardirq_stack_ptr; /* 32 8 */
> u16 softirq_pending; /* 40 2 */
> bool hardirq_stack_inuse; /* 42 1 */
> }; /* 0 48 */
> u8 pad[64]; /* 0 64 */
> }; /* 0 64 */
>
> /* size: 64, cachelines: 1, members: 1 */
> };
>
> A slightly more useful variant of your listing would be:
>
> $ readelf -Ws defconfig-build/vmlinux | sort -k 2 | awk 'BEGIN {p=0} /__per_cpu_hot_start/ {p=1} { if (p) print $2 " " $3 " " $8 } /__per_cpu_hot_end/ {p=0}'
> ffffffff834f5000 0 __per_cpu_hot_start
> ffffffff834f5000 8 hardirq_stack_ptr
> ffffffff834f5008 0 __ref_stack_chk_guard
> ffffffff834f5008 8 __stack_chk_guard
> ffffffff834f5010 0 const_cpu_current_top_of_stack
> ffffffff834f5010 8 cpu_current_top_of_stack
> ffffffff834f5018 0 const_current_task
> ffffffff834f5018 8 current_task
> ffffffff834f5020 8 __x86_call_depth
> ffffffff834f5028 8 this_cpu_off
> ffffffff834f5030 4 __preempt_count
> ffffffff834f5034 4 cpu_number
> ffffffff834f5038 2 __softirq_pending
> ffffffff834f503a 1 hardirq_stack_inuse
> ffffffff834f5040 0 __per_cpu_hot_end
>
> as it also gets the size for each symbol, allowing us to compute the
> hole as 0x40 - 0x3b, or 5 bytes (hardirq_stack_inuse ends at
> 0x3a + 1 = 0x3b, while the section runs to 0x40).
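
For context, a variable lands in this subsection when it is defined
with the DEFINE_PER_CPU_CACHE_HOT() macro added by these patches, in
place of plain DEFINE_PER_CPU(). A minimal sketch (the helper function
name is made up for illustration; the accessor is the normal percpu
API):

    #include <linux/percpu-defs.h>

    /*
     * Emitted into .data..percpu..hot.* rather than .data..percpu, so
     * the symbol shows up between __per_cpu_hot_start and
     * __per_cpu_hot_end in the listings above.
     */
    DEFINE_PER_CPU_CACHE_HOT(int, __preempt_count);

    /* Read through the usual percpu accessors: */
    static inline int read_preempt_count(void)
    {
            return this_cpu_read(__preempt_count);
    }
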
If all the variables in this section are scalar or pointer types,
SORT_BY_ALIGNMENT() should result in no padding between them. I can
add a __per_cpu_hot_pad symbol to show the actual end of the data
(not aligned to the next cacheline like __per_cpu_hot_end), along the
lines of the sketch below.
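
Roughly this, in the percpu input section of the linker script
(sketched against the existing PERCPU_INPUT() layout, where
'cacheline' is its alignment parameter; the .data..percpu..hot.*
input section name is from this series):

    __per_cpu_hot_start = .;
    *(SORT_BY_ALIGNMENT(.data..percpu..hot.*))
    __per_cpu_hot_pad = .;  /* actual end of the hot data */
    . = ALIGN(cacheline);   /* pad out to the cacheline boundary */
    __per_cpu_hot_end = .;
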
Brian Gerst