Message-ID: <87le4kyj8t.fsf@oracle.com>
Date: Tue, 07 May 2024 22:18:10 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: Ankur Arora <ankur.a.arora@...cle.com>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>, peterz@...radead.org,
paulmck@...nel.org, akpm@...ux-foundation.org, luto@...nel.org,
bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org,
willy@...radead.org, mgorman@...e.de, jpoimboe@...nel.org,
mark.rutland@....com, jgross@...e.com, andrew.cooper3@...rix.com,
bristot@...nel.org, mathieu.desnoyers@...icios.com,
geert@...ux-m68k.org, glaubitz@...sik.fu-berlin.de,
anton.ivanov@...bridgegreys.com, mattst88@...il.com,
krypton@...ich-teichert.org, rostedt@...dmis.org,
David.Laight@...lab.com, richard@....at, mjguzik@...il.com,
jon.grimm@....com, bharata@....com, raghavendra.kt@....com,
boris.ostrovsky@...cle.com, konrad.wilk@...cle.com,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling
Shrikanth Hegde <sshegde@...ux.ibm.com> writes:
> On 4/27/24 12:30 AM, Ankur Arora wrote:
>>
>
> Hi Ankur,
>
> Sorry for the delay, I was on leave last week.
> There might be a delay in response as I am recovering from fever.
Please take your time.
>> Great. I'm guessing these tests are when running in voluntary preemption
>> mode (under PREEMPT_AUTO).
>>
>
> It was run under preempt=none.
>
>> If you haven't, could you also try full preemption? There you should see
>> identical results unless something is horribly wrong.
>
> I tried preempt=full with the patch you provided below. I ran hackbench for much longer,
> with 100000 loops. I don't see any regression on the larger system.
> I see a slight improvement in some cases. I don't see any major regression with 10k ops,
> which was tried earlier as well.
Great, so no surprises with preempt=full.
> ==========================================================
> 1L ops (100000 loops).
> ==========================================================
> Process 10 groups : 9.85, 9.87(-0.20)
> Process 20 groups : 17.69, 17.32(2.09)
> Process 30 groups : 25.89, 25.96(-0.27)
> Process 40 groups : 34.70, 34.61(0.26)
> Process 50 groups : 44.02, 43.79(0.52)
> Process 60 groups : 52.72, 52.10(1.18)
> Thread 10 groups : 10.50, 10.52(-0.19)
> Thread 20 groups : 18.79, 18.60(1.01)
> Process(Pipe) 10 groups : 10.39, 10.37(0.19)
> Process(Pipe) 20 groups : 18.45, 18.54(-0.49)
> Process(Pipe) 30 groups : 25.63, 25.92(-1.13)
> Process(Pipe) 40 groups : 33.79, 33.49(0.89)
> Process(Pipe) 50 groups : 43.15, 41.83(3.06)
> Process(Pipe) 60 groups : 51.94, 50.32(3.12)
> Thread(Pipe) 10 groups : 10.73, 10.85(-1.12)
> Thread(Pipe) 20 groups : 19.24, 19.35(-0.57)
I presume the ones on the left are the baseline numbers, with
PREEMPT_AUTO on the right?
Also, which of preempt=none/voluntary/full are these with?
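If that reading is right, the bracketed figures look like
(left - right) / left * 100, so a positive value would mean the
right-hand column is faster. A minimal sketch of that arithmetic, under
that assumption (pct_delta is just an illustrative name, not something
from hackbench or the kernel):

```c
#include <assert.h>

/*
 * Relative difference of a test run against a baseline, in percent.
 * Assumption: the table lists elapsed times, so lower is better and a
 * positive result means the test run was faster than the baseline.
 */
double pct_delta(double base, double test)
{
	assert(base > 0.0);	/* guard the division */
	return (base - test) / base * 100.0;
}
```

For the first two rows of the 1L table, pct_delta(9.85, 9.87) comes out
at about -0.20 and pct_delta(17.69, 17.32) at about 2.09, which agrees
with the posted values.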
> ==========================================================
> 10k ops.
>
> Process 10 groups : 1.10, 1.10(0.00)
> Process 20 groups : 1.89, 1.88(0.53)
> Process 30 groups : 2.82, 2.80(0.71)
> Process 40 groups : 3.76, 3.76(0.00)
> Process 50 groups : 4.66, 4.79(-2.79)
> Process 60 groups : 5.74, 5.92(-3.14)
> thread 10 groups : 1.22, 1.20(1.64)
> thread 20 groups : 2.05, 2.05(0.00)
> Process(Pipe) 10 groups : 1.13, 1.13(0.00)
> Process(Pipe) 20 groups : 1.98, 1.93(2.53)
> Process(Pipe) 30 groups : 2.91, 2.75(5.50)
> Process(Pipe) 40 groups : 3.85, 3.65(5.19)
> Process(Pipe) 50 groups : 4.91, 4.91(0.00)
> Process(Pipe) 60 groups : 5.56, 5.90(-6.12)
> thread(Pipe) 10 groups : 1.23, 1.23(0.00)
> thread(Pipe) 20 groups : 1.99, 1.99(0.00)
> ==========================================================
>
> Other than hackbench, I see slight improvement in unixbench and stress-ng --cpu workloads.
>>
>>> However, I still see 20-50%
>>> regression on the larger system(320 CPUS). I will continue to debug why.
>>
>> Could you try this patch? This is needed because PREEMPT_AUTO turns on
>> CONFIG_PREEMPTION, but not CONFIG_PREEMPT:
Just wanted to check if the regression you were seeing with preempt=none
is fixed?
> This patch can be considered as the enablement patch for Powerpc for preempt_auto.
> Michael, Nick, Do you see any concerns?
>
> Ankur, Could you please add this patch, if there are no objections.
Of course. Thanks for the patch.
The patch overall looks good. A minor comment below.
Will add it to v2.
> ---
> From 878a5a7c990e3459758a5d19d7697b07d8d27d0f Mon Sep 17 00:00:00 2001
> From: Shrikanth Hegde <sshegde@...ux.ibm.com>
> Date: Tue, 7 May 2024 04:42:04 -0500
> Subject: [PATCH] powerpc: add support for preempt_auto
>
> Add PowerPC arch support for PREEMPT_AUTO by defining LAZY bits.
>
> Since PowerPC doesn't use generic exit to functions, Add
> NR_LAZY check in exit to user and exit to kernel from interrupt
> routines.
>
> Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/thread_info.h | 11 ++++++++++-
> arch/powerpc/kernel/interrupt.c | 6 ++++--
> 3 files changed, 15 insertions(+), 3 deletions(-)
..
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index eca293794a1e..0c0b7010995a 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -185,7 +185,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
> ti_flags = read_thread_flags();
> while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
> local_irq_enable();
> - if (ti_flags & _TIF_NEED_RESCHED) {
> + if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
> schedule();
> } else {
> /*
> @@ -396,7 +396,9 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
> /* Returning to a kernel context with local irqs enabled. */
> WARN_ON_ONCE(!(regs->msr & MSR_EE));
> again:
> - if (IS_ENABLED(CONFIG_PREEMPT)) {
> +
> + if ((IS_ENABLED(CONFIG_PREEMPT_AUTO) && IS_ENABLED(CONFIG_PREEMPTION)) ||
> + (!IS_ENABLED(CONFIG_PREEMPT_AUTO) && IS_ENABLED(CONFIG_PREEMPT))) {
Seems to me, this could just be:
if (IS_ENABLED(CONFIG_PREEMPTION))
But if the intent is to exclude support for, say,
(PREEMPT_DYNAMIC && PREEMPT_NONE), then it is probably best to encode
that explicitly.
Alternatively, a comment on the disparate treatment of PREEMPT_AUTO and
!PREEMPT_AUTO might be helpful.
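To make the difference concrete, here is a small userspace sketch that
models the two conditions as plain booleans. The struct and function
names are made up for illustration and are not kernel code. It assumes
that CONFIG_PREEMPT always implies CONFIG_PREEMPTION, and that a
(PREEMPT_DYNAMIC && PREEMPT_NONE) build has CONFIG_PREEMPTION set but
not CONFIG_PREEMPT:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a .config: one flag per Kconfig symbol. */
struct preempt_cfg {
	bool preempt_auto;	/* CONFIG_PREEMPT_AUTO */
	bool preemption;	/* CONFIG_PREEMPTION */
	bool preempt;		/* CONFIG_PREEMPT */
};

/* The condition as written in the patch. */
bool patch_cond(struct preempt_cfg c)
{
	/* Assumption: PREEMPT selects PREEMPTION, so this always holds. */
	assert(!c.preempt || c.preemption);

	return (c.preempt_auto && c.preemption) ||
	       (!c.preempt_auto && c.preempt);
}

/* The simpler form suggested above. */
bool simple_cond(struct preempt_cfg c)
{
	return c.preemption;
}
```

Under those assumptions the two agree whenever PREEMPT_AUTO is set, and
also for !PREEMPT_AUTO with PREEMPT set. They differ only for
!PREEMPT_AUTO with PREEMPTION set and PREEMPT clear, where patch_cond()
is false and simple_cond() is true.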
Thanks
Ankur
> /* Return to preemptible kernel context */
> if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
> if (preempt_count() == 0)