lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 27 Sep 2022 08:47:29 +0000
From:   Christophe Leroy <christophe.leroy@...roup.eu>
To:     Yajun Deng <yajun.deng@...ux.dev>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm/early_ioremap: Combine two loops to improve
 performance



Le 27/09/2022 à 09:52, Yajun Deng a écrit :
> The first loop will waring once if prev_map is init, we can add a
> boolean variable to do that. So those two loops can be combined to
> improve performance.

Do you have evidence of the improved performance ?

Looking at the generated code, I have the fealing that the performance 
is reduced by the !init_prev_map that is checked at every lap.

Before the patch:

00000250 <early_ioremap_setup>:
  250:	3d 20 00 00 	lis     r9,0
			252: R_PPC_ADDR16_HA	.init.data
  254:	39 29 00 00 	addi    r9,r9,0
			256: R_PPC_ADDR16_LO	.init.data
  258:	38 e0 00 10 	li      r7,16
  25c:	39 09 00 04 	addi    r8,r9,4
  260:	39 40 00 00 	li      r10,0
  264:	7c e9 03 a6 	mtctr   r7

---- First loop : ----
  268:	55 47 10 3a 	rlwinm  r7,r10,2,0,29
  26c:	7c e7 40 2e 	lwzx    r7,r7,r8
  270:	2c 07 00 00 	cmpwi   r7,0
  274:	41 a2 00 08 	beq     27c <early_ioremap_setup+0x2c>
  278:	0f e0 00 00 	twui    r0,0
  27c:	39 4a 00 01 	addi    r10,r10,1
  280:	42 00 ff e8 	bdnz    268 <early_ioremap_setup+0x18>

  284:	39 00 00 10 	li      r8,16
  288:	39 29 00 84 	addi    r9,r9,132
  28c:	3d 40 ff b0 	lis     r10,-80
  290:	7d 09 03 a6 	mtctr   r8

---- Second loop : ----
  294:	95 49 00 04 	stwu    r10,4(r9)
  298:	3d 4a 00 04 	addis   r10,r10,4
  29c:	42 00 ff f8 	bdnz    294 <early_ioremap_setup+0x44>

  2a0:	4e 80 00 20 	blr

After the patch:

00000250 <early_ioremap_setup>:
  250:	3d 20 00 00 	lis     r9,0
			252: R_PPC_ADDR16_HA	.init.data
  254:	39 29 00 00 	addi    r9,r9,0
			256: R_PPC_ADDR16_LO	.init.data
  258:	39 00 00 10 	li      r8,16
  25c:	38 c9 00 04 	addi    r6,r9,4
  260:	39 40 00 00 	li      r10,0
  264:	39 29 00 88 	addi    r9,r9,136
  268:	38 e0 00 00 	li      r7,0
  26c:	7d 09 03 a6 	mtctr   r8

--- Loop ---
  270:	70 e8 00 01 	andi.   r8,r7,1
  274:	40 82 00 18 	bne     28c <early_ioremap_setup+0x3c>
  278:	7d 0a 30 2e 	lwzx    r8,r10,r6
  27c:	2c 08 00 00 	cmpwi   r8,0
  280:	41 a2 00 0c 	beq     28c <early_ioremap_setup+0x3c>
  284:	0f e0 00 00 	twui    r0,0
  288:	38 e0 00 01 	li      r7,1
  28c:	55 48 80 1e 	rlwinm  r8,r10,16,0,15
  290:	3d 08 ff b0 	addis   r8,r8,-80
  294:	7d 0a 49 2e 	stwx    r8,r10,r9
  298:	39 4a 00 04 	addi    r10,r10,4
  29c:	42 00 ff d4 	bdnz    270 <early_ioremap_setup+0x20>

  2a0:	4e 80 00 20 	blr


Maybe using WARN_ON_ONCE() could be the solution. But looking at the 
code generated if using it, I still have the feeling that GCC has a 
better code with the original code.


> 
> Signed-off-by: Yajun Deng <yajun.deng@...ux.dev>
> ---
>   mm/early_ioremap.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/early_ioremap.c b/mm/early_ioremap.c
> index 9bc12e526ed0..3076fb47c685 100644
> --- a/mm/early_ioremap.c
> +++ b/mm/early_ioremap.c
> @@ -70,14 +70,15 @@ static unsigned long slot_virt[FIX_BTMAPS_SLOTS] __initdata;
>   
>   void __init early_ioremap_setup(void)
>   {
> +	bool init_prev_map = false;
>   	int i;
>   
> -	for (i = 0; i < FIX_BTMAPS_SLOTS; i++)
> -		if (WARN_ON(prev_map[i]))
> -			break;
> +	for (i = 0; i < FIX_BTMAPS_SLOTS; i++) {
> +		if (!init_prev_map && WARN_ON(prev_map[i]))
> +			init_prev_map = true;
>   
> -	for (i = 0; i < FIX_BTMAPS_SLOTS; i++)
>   		slot_virt[i] = __fix_to_virt(FIX_BTMAP_BEGIN - NR_FIX_BTMAPS*i);
> +	}
>   }
>   
>   static int __init check_early_ioremap_leak(void)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ