Message-ID: <01bc9e5c-fd47-4c26-8af3-59f10f3b9c76@csgroup.eu>
Date: Fri, 31 Jan 2025 07:09:02 +0100
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: mpe@...erman.id.au, maddy@...ux.ibm.com, linuxppc-dev@...ts.ozlabs.org,
npiggin@...il.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/1] powerpc: Enable dynamic preemption
On 30/01/2025 at 21:26, Sebastian Andrzej Siewior wrote:
> On 2025-01-30 22:27:07 [+0530], Shrikanth Hegde wrote:
>>> | #define need_irq_preemption() \
>>> | (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
>>> |
>>> | if (need_irq_preemption()) {
>>>
>>> be a bit smaller/quicker? This could be a fast path ;)
>>
>> I am okay with either way. I did try both[1]; there wasn't any significant difference,
>> hence I chose the simpler one. Maybe system size or workload pattern matters.
>>
>> Let me do some more testing to see which one wins.
>> Is there any specific benchmark which might help here?
>
> No idea. As per bean counting: preempt_model_preemptible() should
> resolve to two function calls plus a conditional in the dynamic case. This
> should be more expensive than a NOP/branch ;)
I did a build comparison on pmac32_defconfig + dynamic preemption; the
static branch is clearly the more efficient option:
Without the static branch:
...
294: 7c 08 02 a6 mflr r0
298: 90 01 00 24 stw r0,36(r1)
29c: 48 00 00 01 bl 29c <interrupt_exit_kernel_prepare+0x50>
29c: R_PPC_REL24 preempt_model_full
2a0: 2c 03 00 00 cmpwi r3,0
2a4: 41 82 00 a4 beq 348 <interrupt_exit_kernel_prepare+0xfc>
...
2a8: 81 22 00 4c lwz r9,76(r2)
2ac: 71 29 00 04 andi. r9,r9,4
2b0: 40 82 00 d4 bne 384 <interrupt_exit_kernel_prepare+0x138>
...
2b4: 7d 20 00 a6 mfmsr r9
2b8: 55 29 07 fa rlwinm r9,r9,0,31,29
...
348: 48 00 00 01 bl 348 <interrupt_exit_kernel_prepare+0xfc>
348: R_PPC_REL24 preempt_model_lazy
34c: 2c 03 00 00 cmpwi r3,0
350: 40 a2 ff 58 bne 2a8 <interrupt_exit_kernel_prepare+0x5c>
354: 4b ff ff 60 b 2b4 <interrupt_exit_kernel_prepare+0x68>
With the static branch:
...
294: 48 00 00 58 b 2ec <interrupt_exit_kernel_prepare+0xa0>
...
298: 7d 20 00 a6 mfmsr r9
29c: 55 29 07 fa rlwinm r9,r9,0,31,29
...
2ec: 81 22 00 4c lwz r9,76(r2)
2f0: 71 29 00 04 andi. r9,r9,4
2f4: 41 82 ff a4 beq 298 <interrupt_exit_kernel_prepare+0x4c>
...
With the static branch it is just an unconditional branch, later patched
into a NOP when the key is flipped. It must be more efficient.
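
For reference, a minimal sketch of the static-key variant discussed above
(whether the key is defined true or false by default, and where, is up to
the actual patch; the surrounding function below is purely hypothetical):

#include <linux/jump_label.h>
#include <linux/preempt.h>
#include <linux/sched.h>

/* Key assumed to be defined elsewhere by the patch; it gets flipped at
 * runtime when the preemption model is switched. */
DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);

#define need_irq_preemption() \
	(static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))

/* Hypothetical shape of the exit path: the whole check compiles down to
 * the single 'b'/'nop' seen in the dump above. */
static void example_interrupt_exit(void)
{
	if (need_irq_preemption()) {
		if (need_resched())
			preempt_schedule_irq();
	}
}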
So in the first case we need to read and save LR, call preempt_model_full(),
compare the result with 0, call preempt_model_lazy() and compare again, and
at some point read back and restore LR.
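
That is consistent with what the generic helper expands to on a
PREEMPT_DYNAMIC kernel, roughly (a sketch from memory, not verbatim; with
CONFIG_PREEMPT_RT=n the rt test is compile-time false, leaving exactly the
two out-of-line calls seen above):

static inline bool preempt_model_preemptible(void)
{
	return preempt_model_full() || preempt_model_lazy() ||
	       preempt_model_rt();
}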
And the preempt_model_*() functions are not tiny:
00003620 <preempt_model_full>:
3620: 3d 20 00 00 lis r9,0
3622: R_PPC_ADDR16_HA preempt_dynamic_mode
3624: 94 21 ff f0 stwu r1,-16(r1)
3628: 80 69 00 00 lwz r3,0(r9)
362a: R_PPC_ADDR16_LO preempt_dynamic_mode
362c: 2c 03 ff ff cmpwi r3,-1
3630: 41 82 00 18 beq 3648 <preempt_model_full+0x28>
3634: 68 63 00 02 xori r3,r3,2
3638: 38 21 00 10 addi r1,r1,16
363c: 7c 63 00 34 cntlzw r3,r3
3640: 54 63 d9 7e srwi r3,r3,5
3644: 4e 80 00 20 blr
...
3648: 0f e0 00 00 twui r0,0
364c: 68 63 00 02 xori r3,r3,2
3650: 38 21 00 10 addi r1,r1,16
3654: 7c 63 00 34 cntlzw r3,r3
3658: 54 63 d9 7e srwi r3,r3,5
365c: 4e 80 00 20 blr
00003660 <preempt_model_lazy>:
3660: 3d 20 00 00 lis r9,0
3662: R_PPC_ADDR16_HA preempt_dynamic_mode
3664: 94 21 ff f0 stwu r1,-16(r1)
3668: 80 69 00 00 lwz r3,0(r9)
366a: R_PPC_ADDR16_LO preempt_dynamic_mode
366c: 2c 03 ff ff cmpwi r3,-1
3670: 41 82 00 18 beq 3688 <preempt_model_lazy+0x28>
3674: 68 63 00 03 xori r3,r3,3
3678: 38 21 00 10 addi r1,r1,16
367c: 7c 63 00 34 cntlzw r3,r3
3680: 54 63 d9 7e srwi r3,r3,5
3684: 4e 80 00 20 blr
...
3688: 0f e0 00 00 twui r0,0
368c: 68 63 00 03 xori r3,r3,3
3690: 38 21 00 10 addi r1,r1,16
3694: 7c 63 00 34 cntlzw r3,r3
3698: 54 63 d9 7e srwi r3,r3,5
369c: 4e 80 00 20 blr
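
For context, these accessors are generated from the current preemption mode
in the generic scheduler code, roughly as follows (a sketch, not verbatim;
the cmpwi with -1 followed by the twui trap is the WARN_ON_ONCE() on the
undefined mode):

/* Mode selected at boot/runtime through the preempt= interface. */
enum {
	preempt_dynamic_undefined = -1,
	preempt_dynamic_none,
	preempt_dynamic_voluntary,
	preempt_dynamic_full,
	preempt_dynamic_lazy,
};
int preempt_dynamic_mode = preempt_dynamic_undefined;

#define PREEMPT_MODEL_ACCESSOR(mode) \
	bool preempt_model_##mode(void)						 \
	{									 \
		WARN_ON_ONCE(preempt_dynamic_mode == preempt_dynamic_undefined); \
		return preempt_dynamic_mode == preempt_dynamic_##mode;		 \
	}

PREEMPT_MODEL_ACCESSOR(full);
PREEMPT_MODEL_ACCESSOR(lazy);

The xori/cntlzw/srwi sequence in the dumps is just the compiler's
branch-free way of computing mode == constant (full being 2 and lazy 3
in the enum above).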
> But you would still need preempt_model_preemptible() for the !DYN case.
>
>>>> + preempt_model_voluntary() ? "voluntary" :
>>>> + preempt_model_full() ? "full" :
>>>> + preempt_model_lazy() ? "lazy" :
>>>> + "",
>>>
>>> So I intend to rework this part. I have patches stashed at
>>> https://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git/log/?h=preemption_string
>>>
>>> which I didn't send yet due to the merge window. Just a heads up ;)
>>
>> Makes sense. I had seen at least two places with this code, ftrace and powerpc.
>> There were way more places...
>>
>> You want me to remove this part?
>
> No, just be aware.
> I don't know how this will be routed. I guess we merge the sched pieces
> first and then I submit the other pieces via the relevant maintainer
> tree. In that case please be aware that all parts get removed/replaced
> properly.
>
> Sebastian