Message-ID: <01bc9e5c-fd47-4c26-8af3-59f10f3b9c76@csgroup.eu>
Date: Fri, 31 Jan 2025 07:09:02 +0100
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Shrikanth Hegde <sshegde@...ux.ibm.com>
Cc: mpe@...erman.id.au, maddy@...ux.ibm.com, linuxppc-dev@...ts.ozlabs.org,
 npiggin@...il.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/1] powerpc: Enable dynamic preemption



On 30/01/2025 at 21:26, Sebastian Andrzej Siewior wrote:
> On 2025-01-30 22:27:07 [+0530], Shrikanth Hegde wrote:
>>> | #define need_irq_preemption() \
>>> |          (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
>>> |
>>> | 	if (need_irq_preemption()) {
>>>
>>> be a bit smaller/ quicker? This could be a fast path ;)
>>
>> I am okay either way. I did try both[1]; there wasn't any significant difference,
>> hence I chose the simpler one. Maybe system size or workload pattern matters.
>>
>> Let me do some more testing to see which one wins.
>> Is there any specific benchmark which might help here?
> 
> No idea. As per bean counting: preempt_model_preemptible() should
> resolve in two function calls + conditional in the dynamic case. This
> should be more expensive compared to a nop/ branch ;)

I did a build comparison with pmac32_defconfig plus dynamic preemption;
the static branch clearly generates better code:

Without static branch:

...
  294:	7c 08 02 a6 	mflr    r0
  298:	90 01 00 24 	stw     r0,36(r1)
  29c:	48 00 00 01 	bl      29c <interrupt_exit_kernel_prepare+0x50>
			29c: R_PPC_REL24	preempt_model_full
  2a0:	2c 03 00 00 	cmpwi   r3,0
  2a4:	41 82 00 a4 	beq     348 <interrupt_exit_kernel_prepare+0xfc>
...
  2a8:	81 22 00 4c 	lwz     r9,76(r2)
  2ac:	71 29 00 04 	andi.   r9,r9,4
  2b0:	40 82 00 d4 	bne     384 <interrupt_exit_kernel_prepare+0x138>
...
  2b4:	7d 20 00 a6 	mfmsr   r9
  2b8:	55 29 07 fa 	rlwinm  r9,r9,0,31,29
...
  348:	48 00 00 01 	bl      348 <interrupt_exit_kernel_prepare+0xfc>
			348: R_PPC_REL24	preempt_model_lazy
  34c:	2c 03 00 00 	cmpwi   r3,0
  350:	40 a2 ff 58 	bne     2a8 <interrupt_exit_kernel_prepare+0x5c>
  354:	4b ff ff 60 	b       2b4 <interrupt_exit_kernel_prepare+0x68>

With the static branch:

...
  294:	48 00 00 58 	b       2ec <interrupt_exit_kernel_prepare+0xa0>
...
  298:	7d 20 00 a6 	mfmsr   r9
  29c:	55 29 07 fa 	rlwinm  r9,r9,0,31,29
...
  2ec:	81 22 00 4c 	lwz     r9,76(r2)
  2f0:	71 29 00 04 	andi.   r9,r9,4
  2f4:	41 82 ff a4 	beq     298 <interrupt_exit_kernel_prepare+0x4c>
...

With the static branch there is just an unconditional branch, which is
later patched into a NOP. It must be more efficient.

So in the first case we need to read and save LR, call
preempt_model_full(), compare the result with 0, call
preempt_model_lazy() and compare again, and at some point reload and
restore LR.
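
To make the difference in shape concrete, here is a minimal user-space
model of the two checks. It is only a sketch under assumptions: the enum
names and values are illustrative, irq_preempt_key is a plain bool
standing in for the static key (a real static branch is patched code,
not a memory load), and set_mode() is a hypothetical helper that I
assume keeps the key enabled exactly for the full and lazy models.

```c
#include <stdbool.h>

/* Assumed mode values; the names are illustrative, not the kernel's. */
enum preempt_mode {
	MODE_UNDEFINED = -1,
	MODE_NONE,
	MODE_VOLUNTARY,
	MODE_FULL,
	MODE_LAZY,
};

static enum preempt_mode mode = MODE_NONE;
static bool irq_preempt_key;		/* stand-in for the static key */
static unsigned int accessor_calls;	/* out-of-line calls made by variant A */

/* Stand-ins for the out-of-line preempt_model_*() accessors. */
static bool model_full(void)
{
	accessor_calls++;
	return mode == MODE_FULL;
}

static bool model_lazy(void)
{
	accessor_calls++;
	return mode == MODE_LAZY;
}

/* Variant A: the shape of the "without static branch" listing. */
static bool need_preempt_by_calls(void)
{
	return model_full() || model_lazy();
}

/* Variant B: a single test of a flag kept in sync at mode-change time. */
static bool need_preempt_by_key(void)
{
	return irq_preempt_key;
}

/* Hypothetical helper: update the mode and the key together. */
static void set_mode(enum preempt_mode m)
{
	mode = m;
	irq_preempt_key = (m == MODE_FULL || m == MODE_LAZY);
}
```

Both variants give the same answer for every mode, but variant A pays
for one or two calls on each interrupt exit while variant B moves that
cost to the rare mode change.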

And the preempt_model_*() functions are not tiny:

00003620 <preempt_model_full>:
     3620:	3d 20 00 00 	lis     r9,0
			3622: R_PPC_ADDR16_HA	preempt_dynamic_mode
     3624:	94 21 ff f0 	stwu    r1,-16(r1)
     3628:	80 69 00 00 	lwz     r3,0(r9)
			362a: R_PPC_ADDR16_LO	preempt_dynamic_mode
     362c:	2c 03 ff ff 	cmpwi   r3,-1
     3630:	41 82 00 18 	beq     3648 <preempt_model_full+0x28>
     3634:	68 63 00 02 	xori    r3,r3,2
     3638:	38 21 00 10 	addi    r1,r1,16
     363c:	7c 63 00 34 	cntlzw  r3,r3
     3640:	54 63 d9 7e 	srwi    r3,r3,5
     3644:	4e 80 00 20 	blr
...
     3648:	0f e0 00 00 	twui    r0,0
     364c:	68 63 00 02 	xori    r3,r3,2
     3650:	38 21 00 10 	addi    r1,r1,16
     3654:	7c 63 00 34 	cntlzw  r3,r3
     3658:	54 63 d9 7e 	srwi    r3,r3,5
     365c:	4e 80 00 20 	blr

00003660 <preempt_model_lazy>:
     3660:	3d 20 00 00 	lis     r9,0
			3662: R_PPC_ADDR16_HA	preempt_dynamic_mode
     3664:	94 21 ff f0 	stwu    r1,-16(r1)
     3668:	80 69 00 00 	lwz     r3,0(r9)
			366a: R_PPC_ADDR16_LO	preempt_dynamic_mode
     366c:	2c 03 ff ff 	cmpwi   r3,-1
     3670:	41 82 00 18 	beq     3688 <preempt_model_lazy+0x28>
     3674:	68 63 00 03 	xori    r3,r3,3
     3678:	38 21 00 10 	addi    r1,r1,16
     367c:	7c 63 00 34 	cntlzw  r3,r3
     3680:	54 63 d9 7e 	srwi    r3,r3,5
     3684:	4e 80 00 20 	blr
...
     3688:	0f e0 00 00 	twui    r0,0
     368c:	68 63 00 03 	xori    r3,r3,3
     3690:	38 21 00 10 	addi    r1,r1,16
     3694:	7c 63 00 34 	cntlzw  r3,r3
     3698:	54 63 d9 7e 	srwi    r3,r3,5
     369c:	4e 80 00 20 	blr
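
For readers not used to powerpc assembly, the tail of each accessor
(xori / cntlzw / srwi) is just a branch-free "mode == K" test, with K
being 2 for preempt_model_full() and 3 for preempt_model_lazy(). The
twui r0,0 on the -1 path is, as far as I can tell, the trap behind the
warning for an undefined mode. Below is a small user-space sketch of
that idiom; cntlzw() and mode_equals() are hypothetical helper names,
not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of cntlzw: count leading zero bits of a 32-bit word, 32 for 0. */
static uint32_t cntlzw(uint32_t x)
{
	uint32_t n = 0;

	if (x == 0)
		return 32;
	while (!(x & 0x80000000u)) {
		x <<= 1;
		n++;
	}
	return n;
}

/*
 * Mirrors:  xori r3,r3,K ; cntlzw r3,r3 ; srwi r3,r3,5
 * The XOR is zero only when mode == K, cntlzw then yields 32,
 * and 32 >> 5 is 1. Any other value yields less than 32, hence 0.
 */
static bool mode_equals(int32_t mode, uint32_t k)
{
	return (cntlzw((uint32_t)mode ^ k) >> 5) != 0;
}
```

So each accessor is a load, a compare against -1 and this three
instruction test, plus the stack frame handling and the call/return.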



> But you would still need preempt_model_preemptible() for the !DYN case.
> 
>>>> +	       preempt_model_voluntary() ? "voluntary" :
>>>> +	       preempt_model_full()      ? "full" :
>>>> +	       preempt_model_lazy()      ? "lazy" :
>>>> +	       "",
>>>
>>> So I intend to rework this part. I have patches stashed at
>>> 	https://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git/log/?h=preemption_string
>>>
>>> which I haven't sent yet due to the merge window. Just a heads up ;)
>>
>> Makes sense. I had seen at least two places where this code exists, ftrace/powerpc.
>> There were way more places.
>>
>> You want me to remove this part?
> 
> No, just be aware.
> I don't know how this will be routed. I guess we merge the sched pieces
> first and then I submit the other pieces via the relevant maintainer
> tree. In that case please be aware that all parts get removed/ replaced
> properly.
> 
> Sebastian

