Message-ID: <abbab9de-bafd-4f40-a52e-07c8f8acc128@linux.ibm.com>
Date: Mon, 2 Dec 2024 00:58:59 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Christophe Leroy <christophe.leroy@...roup.eu>,
Luming Yu <luming.yu@...ngroup.cn>
Cc: npiggin@...il.com, maddy@...ux.ibm.com, bigeasy@...utronix.de,
ankur.a.arora@...cle.com, linux-kernel@...r.kernel.org,
mpe@...erman.id.au, linuxppc-dev@...ts.ozlabs.org,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v2 1/2] powerpc: Add preempt lazy support
On 11/26/24 16:23, Christophe Leroy wrote:
>
>
> On 16/11/2024 at 20:23, Shrikanth Hegde wrote:
>> Define the preempt lazy bit for Powerpc. Use bit 9, which is free and within
>> the 16-bit range of NEED_RESCHED, so the compiler can issue a single andi.
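
To spell out the single-andi. point above: since both flag bits sit below bit 16,
the combined mask (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY) still fits in the
16-bit immediate of andi., so a check like

	/* both bits are in the low 16 bits, so one andi. can test them together */
	if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY))
		schedule();

can be compiled to a single andi. followed by a branch. This is a sketch of the
idea, not the exact generated code.
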
>>
>> Since Powerpc doesn't use the generic entry/exit, add the lazy check at
>> exit to user. CONFIG_PREEMPTION is defined for lazy/full/rt, so use it for
>> the return to kernel.
>
> FWIW, there is work in progress on using generic entry/exit for powerpc;
> if you can help with testing it, that would help, see
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/F0AE0A4571CE3126+20241111031934.1579-2-luming.yu@...ngroup.cn/
>
I gave that series a try. After a lot of manual patching on the tip tree and
removing the multiple definitions of regs_irqs_disabled, I was able to
compile and boot.
I am skimming through the series, but as far as I understand from the
comments, it needs to be redesigned. I think I see why. I will go through it
in more detail to figure out how to do it better. I believe the changes have
to stem from interrupt_64.S.
As far as the overlap between this patch and that series is concerned, it is
the NEED_RESCHED bits. There are at least a couple of major issues in that
patch series:
1. It only tries to move exit to user to the generic code; exit to kernel is
not moved. The generic irqentry_exit handles that for generic entry, but the
powerpc exit-to-kernel path is still there.
2. Even for exit to user, it ends up calling exit_to_user_mode_loop twice for
the same syscall, which seems wrong: once in interrupt_exit_user_prepare_main
and once through syscall_exit_to_user_mode in syscall_exit_prepare (roughly
as sketched below).
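
To illustrate that second point, the call flow in that series looks roughly
like this (my reading of the code, so treat it as a sketch rather than a trace):

  syscall_exit_prepare()
    -> interrupt_exit_user_prepare_main()
         -> exit_to_user_mode_loop()        /* first pass over the work bits */
    -> syscall_exit_to_user_mode()
         -> exit_to_user_mode_loop()        /* second pass, same syscall */
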
> Christophe
>
So I guess, when we eventually do move, these checks would be removed at that
point along with the rest of the code.
>
>>
>> Ran a few benchmarks and a db workload on Power10. Performance is close to
>> preempt=none/voluntary.
>> Since Powerpc systems can have large core counts and large memory,
>> preempt lazy is going to be helpful in avoiding soft lockup issues.
>>
>> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
>> Reviewed-by: Ankur Arora <ankur.a.arora@...cle.com>
>> Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
>> ---
>> arch/powerpc/Kconfig | 1 +
>> arch/powerpc/include/asm/thread_info.h | 9 ++++++---
>> arch/powerpc/kernel/interrupt.c | 4 ++--
>> 3 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 8094a01974cc..2f625aecf94b 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -145,6 +145,7 @@ config PPC
>> select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
>> select ARCH_HAS_PHYS_TO_DMA
>> select ARCH_HAS_PMEM_API
>> + select ARCH_HAS_PREEMPT_LAZY
>> select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64
>> select ARCH_HAS_PTE_SPECIAL
>> select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
>> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
>> index 6ebca2996f18..2785c7462ebf 100644
>> --- a/arch/powerpc/include/asm/thread_info.h
>> +++ b/arch/powerpc/include/asm/thread_info.h
>> @@ -103,6 +103,7 @@ void arch_setup_new_exec(void);
>> #define TIF_PATCH_PENDING 6 /* pending live patching update */
>> #define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
>> #define TIF_SINGLESTEP 8 /* singlestepping active */
>> +#define TIF_NEED_RESCHED_LAZY 9 /* Scheduler driven lazy preemption */
>> #define TIF_SECCOMP 10 /* secure computing */
>> #define TIF_RESTOREALL 11 /* Restore all regs (implies NOERROR) */
>> #define TIF_NOERROR 12 /* Force successful syscall return */
>> @@ -122,6 +123,7 @@ void arch_setup_new_exec(void);
>> #define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE)
>> #define _TIF_SIGPENDING (1<<TIF_SIGPENDING)
>> #define _TIF_NEED_RESCHED (1<<TIF_NEED_RESCHED)
>> +#define _TIF_NEED_RESCHED_LAZY (1<<TIF_NEED_RESCHED_LAZY)
>> #define _TIF_NOTIFY_SIGNAL (1<<TIF_NOTIFY_SIGNAL)
>> #define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
>> #define _TIF_32BIT (1<<TIF_32BIT)
>> @@ -142,9 +144,10 @@ void arch_setup_new_exec(void);
>> _TIF_SYSCALL_EMU)
>> #define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
>> - _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
>> - _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
>> - _TIF_NOTIFY_SIGNAL)
>> + _TIF_NEED_RESCHED_LAZY | _TIF_NOTIFY_RESUME | \
>> + _TIF_UPROBE | _TIF_RESTORE_TM | \
>> + _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL)
>> +
>> #define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
>> /* Bits in local_flags */
>> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
>> index af62ec974b97..8f4acc55407b 100644
>> --- a/arch/powerpc/kernel/interrupt.c
>> +++ b/arch/powerpc/kernel/interrupt.c
>> @@ -185,7 +185,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
>> ti_flags = read_thread_flags();
>> while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
>> local_irq_enable();
>> - if (ti_flags & _TIF_NEED_RESCHED) {
>> + if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
>> schedule();
>> } else {
>> /*
>> @@ -396,7 +396,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>> /* Returning to a kernel context with local irqs enabled. */
>> WARN_ON_ONCE(!(regs->msr & MSR_EE));
>> again:
>> - if (IS_ENABLED(CONFIG_PREEMPT)) {
>> + if (IS_ENABLED(CONFIG_PREEMPTION)) {
>> /* Return to preemptible kernel context */
>> if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
>> if (preempt_count() == 0)