Message-ID: <7b3a0288-20f9-42cf-af81-e10ad2d04b27@gmail.com>
Date: Fri, 21 Mar 2025 08:01:52 -0500
From: Carlos Bilbao <carlos.bilbao.osdev@...il.com>
To: pmladek@...e.com, Andrew Morton <akpm@...ux-foundation.org>,
jani.nikula@...el.com, open list <linux-kernel@...r.kernel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Thomas Gleixner <tglx@...utronix.de>, takakura@...inux.co.jp,
john.ogness@...utronix.de
Cc: jglauber@...italocean.com
Subject: [RFC] panic: reduce CPU consumption when finished handling panic
Hello again,
I thought it would be helpful to share some numbers to support my claim,
along with a couple of ideas to improve the patch. Below are the perf
stats from the hypervisor after triggering a panic on a guest running
kernel v5.15 (details of the experiment are provided at the end).
Samples: 55K of event 'cycles:P', Event count (approx.): 36090772574
Overhead  Command    Shared Object      Symbol
  42.20%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_vmexit
  19.07%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_spec_ctrl_restore_host
   9.73%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_vcpu_enter_exit
   3.60%  CPU 5/KVM  [kernel.kallsyms]  [k] __flush_smp_call_function_queue
   2.91%  CPU 5/KVM  [kernel.kallsyms]  [k] vmx_vcpu_run
   2.85%  CPU 5/KVM  [kernel.kallsyms]  [k] native_irq_return_iret
   2.67%  CPU 5/KVM  [kernel.kallsyms]  [k] native_flush_tlb_one_user
   2.16%  CPU 5/KVM  [kernel.kallsyms]  [k] llist_reverse_order
   2.10%  CPU 5/KVM  [kernel.kallsyms]  [k] __srcu_read_lock
   2.08%  CPU 5/KVM  [kernel.kallsyms]  [k] flush_tlb_func
   1.52%  CPU 5/KVM  [kernel.kallsyms]  [k] vcpu_enter_guest.constprop.0
   1.50%  CPU 5/KVM  [kernel.kallsyms]  [k] native_apic_msr_eoi
   1.01%  CPU 5/KVM  [kernel.kallsyms]  [k] clear_bhb_loop
   0.66%  CPU 5/KVM  [kernel.kallsyms]  [k] sysvec_call_function_single
And here are the results from the guest VM after applying my patch:
Samples: 28 of event 'cycles:P', Event count (approx.): 28961952
Overhead  Command          Shared Object            Symbol
  11.03%  qemu-system-x86  [kernel.kallsyms]        [k] task_mm_cid_work
  11.03%  qemu-system-x86  qemu-system-x86_64       [.] 0x0000000000579944
   9.80%  qemu-system-x86  qemu-system-x86_64       [.] 0x000000000056512b
   8.45%  IO mon_iothread  libc.so.6                [.] 0x00000000000a3f12
   8.45%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_mutex_lock
   7.51%  IO mon_iothread  [kernel.kallsyms]        [k] avg_vruntime
   6.65%  IO mon_iothread  libc.so.6                [.] write
   5.93%  IO mon_iothread  [kernel.kallsyms]        [k] security_file_permission
   4.97%  qemu-system-x86  libglib-2.0.so.0.7200.4  [.] g_thread_self
   4.64%  IO mon_iothread  [kernel.kallsyms]        [k] aa_label_sk_perm.part.0
   4.13%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_main_context_release
   3.79%  IO mon_iothread  [kernel.kallsyms]        [k] seccomp_run_filters
   3.42%  IO mon_iothread  libglib-2.0.so.0.7200.4  [.] g_main_context_dispatch
   3.42%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000004edbab
   3.28%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000005999c8
   3.09%  IO mon_iothread  qemu-system-x86_64       [.] 0x00000000004e636b
   0.22%  qemu-system-x86  [kernel.kallsyms]        [k] __intel_pmu_enable_all.constprop.0
As you can see, CPU consumption during panic is significantly reduced after
applying the proposed change: KVM-related functions (e.g., vmx_vmexit) drop
from more than 70% of CPU usage to virtually nothing. Also, the number of
samples decreased from 55K to 28, and the event count dropped from 36.09
billion to 28.96 million, a reduction of roughly three orders of magnitude.
Jan suggested that a better way to implement cpu_halt_end_panic() (perhaps
cpu_halt_after_panic() would be a better name) would be to define it as a
weak function in asm-generic, allowing architectures to override it. What
do you think?
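
For illustration, here is a rough sketch of how I read Jan's suggestion
(the function name, the file placement, and the x86 override below are my
tentative assumptions, not a finished implementation):

	/* kernel/panic.c: weak default, same busy-wait as today */
	void __weak cpu_halt_after_panic(void)
	{
		mdelay(PANIC_TIMER_STEP);
	}

	/*
	 * arch/x86 (placement tentative): halt with interrupts enabled,
	 * so pending IPIs and timer interrupts are still serviced.
	 */
	void cpu_halt_after_panic(void)
	{
		native_safe_halt();
	}

This would keep the #ifdef ladder out of kernel/panic.c and let each
architecture opt in with its own halt/idle primitive.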
Thank you in advance!
Regards,
Carlos
---
Details on the experiment:
- Linux kernel v5.15 (commit 8bb7eca)
- VM guest CPU: Intel(R) Xeon(R) Gold 5318Y CPU @ 2.10GHz
- To collect samples, I executed:
/usr/bin/perf record -p 2618527 -a sleep 30
- Image Ubuntu 22.04 (LTS) x64, 8 vCPUs, 16GB / 100GB Disk
Thanks,
Carlos
On 3/17/25 17:01, Carlos Bilbao wrote:
> After the kernel has finished handling a panic, it enters a busy-wait loop.
> However, this unnecessarily consumes CPU power and electricity. In VMs, it
> also negatively impacts the throughput of other guests running on the same
> hypervisor.
>
> I propose introducing a function cpu_halt_end_panic() to halt the CPU
> during this state while still allowing interrupts to be processed. See my
> commit below.
>
> Thanks in advance!
>
> Signed-off-by: Carlos Bilbao <carlos.bilbao@...nel.org>
> ---
> kernel/panic.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index fbc59b3b64d0..c00ccaa698d5 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -276,6 +276,21 @@ static void panic_other_cpus_shutdown(bool crash_kexec)
>  		crash_smp_send_stop();
>  }
>  
> +static void cpu_halt_end_panic(void)
> +{
> +#ifdef CONFIG_X86
> +	native_safe_halt();
> +#elif defined(CONFIG_ARM)
> +	cpu_do_idle();
> +#else
> +	/*
> +	 * Default to a simple busy-wait if no architecture-specific halt
> +	 * is defined above.
> +	 */
> +	mdelay(PANIC_TIMER_STEP);
> +#endif
> +}
> +
>  /**
>   * panic - halt the system
>   * @fmt: The text string to print
> @@ -474,7 +489,7 @@ void panic(const char *fmt, ...)
>  			i += panic_blink(state ^= 1);
>  			i_next = i + 3600 / PANIC_BLINK_SPD;
>  		}
> -		mdelay(PANIC_TIMER_STEP);
> +		cpu_halt_end_panic();
>  	}
>  }
>