Message-ID: <47d45f33-d5aa-b4b5-9b5f-2e86e309a206@rasmusvillemoes.dk>
Date:   Wed, 23 Oct 2019 09:08:28 +0200
From:   Rasmus Villemoes <linux@...musvillemoes.dk>
To:     Christophe Leroy <christophe.leroy@....fr>,
        Qiang Zhao <qiang.zhao@....com>, Li Yang <leoyang.li@....com>
Cc:     linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 3/7] soc: fsl: qe: avoid ppc-specific io accessors

On 22/10/2019 17.01, Christophe Leroy wrote:
> 
> 
> On 10/18/2019 12:52 PM, Rasmus Villemoes wrote:
>> In preparation for allowing to build QE support for architectures
>> other than PPC, replace the ppc-specific io accessors. Done via
>>
> 
> This patch is not transparent in terms of performance, functions get
> changed significantly.
> 
> Before the patch:
> 
> 00000330 <ucc_fast_enable>:
>  330:    81 43 00 04     lwz     r10,4(r3)
>  334:    7c 00 04 ac     hwsync
>  338:    81 2a 00 00     lwz     r9,0(r10)
>  33c:    0c 09 00 00     twi     0,r9,0
>  340:    4c 00 01 2c     isync
>  344:    70 88 00 02     andi.   r8,r4,2
>  348:    41 82 00 10     beq     358 <ucc_fast_enable+0x28>
>  34c:    39 00 00 01     li      r8,1
>  350:    91 03 00 10     stw     r8,16(r3)
>  354:    61 29 00 10     ori     r9,r9,16
>  358:    70 88 00 01     andi.   r8,r4,1
>  35c:    41 82 00 10     beq     36c <ucc_fast_enable+0x3c>
>  360:    39 00 00 01     li      r8,1
>  364:    91 03 00 14     stw     r8,20(r3)
>  368:    61 29 00 20     ori     r9,r9,32
>  36c:    7c 00 04 ac     hwsync
>  370:    91 2a 00 00     stw     r9,0(r10)
>  374:    4e 80 00 20     blr
> 
> After the patch:
> 
> 0000030c <ucc_fast_enable>:
>  30c:    94 21 ff e0     stwu    r1,-32(r1)
>  310:    7c 08 02 a6     mflr    r0
>  314:    bf a1 00 14     stmw    r29,20(r1)
>  318:    7c 9f 23 78     mr      r31,r4
>  31c:    90 01 00 24     stw     r0,36(r1)
>  320:    7c 7e 1b 78     mr      r30,r3
>  324:    83 a3 00 04     lwz     r29,4(r3)
>  328:    7f a3 eb 78     mr      r3,r29
>  32c:    48 00 00 01     bl      32c <ucc_fast_enable+0x20>
>             32c: R_PPC_REL24    ioread32be
>  330:    73 e9 00 02     andi.   r9,r31,2
>  334:    41 82 00 10     beq     344 <ucc_fast_enable+0x38>
>  338:    39 20 00 01     li      r9,1
>  33c:    91 3e 00 10     stw     r9,16(r30)
>  340:    60 63 00 10     ori     r3,r3,16
>  344:    73 e9 00 01     andi.   r9,r31,1
>  348:    41 82 00 10     beq     358 <ucc_fast_enable+0x4c>
>  34c:    39 20 00 01     li      r9,1
>  350:    91 3e 00 14     stw     r9,20(r30)
>  354:    60 63 00 20     ori     r3,r3,32
>  358:    80 01 00 24     lwz     r0,36(r1)
>  35c:    7f a4 eb 78     mr      r4,r29
>  360:    bb a1 00 14     lmw     r29,20(r1)
>  364:    7c 08 03 a6     mtlr    r0
>  368:    38 21 00 20     addi    r1,r1,32
>  36c:    48 00 00 00     b       36c <ucc_fast_enable+0x60>
>             36c: R_PPC_REL24    iowrite32be

True. Do you know why powerpc uses out-of-line versions of these
accessors when !PPC_INDIRECT_PIO, i.e. at least all of PPC32? It's quite
a bit beyond the scope of this series, but I'd expect moving most if not
all of arch/powerpc/kernel/iomap.c into asm/io.h (guarded by
!defined(CONFIG_PPC_INDIRECT_PIO) of course) as static inlines would
benefit all ppc32 users of iowrite32 and friends.
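
To illustrate the shape of the static-inline idea, here is a minimal
userspace sketch. It is not the kernel implementation: the sketch_* names
are hypothetical, the "register" is plain memory, and it omits the
sync/twi/isync ordering that the real ppc accessors emit in the "before"
listing above.

```c
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */

/* Reverse the byte order of a 32-bit value. */
static inline uint32_t sketch_bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Detect host byte order at run time; compilers typically fold this. */
static inline int sketch_host_is_big_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 0;
}

/* Read a big-endian 32-bit value with one 32-bit load, no function call. */
static inline uint32_t sketch_ioread32be(const volatile void *addr)
{
	uint32_t raw = *(const volatile uint32_t *)addr;

	return sketch_host_is_big_endian() ? raw : sketch_bswap32(raw);
}

/* Write a 32-bit value in big-endian order with one 32-bit store. */
static inline void sketch_iowrite32be(uint32_t val, volatile void *addr)
{
	*(volatile uint32_t *)addr =
		sketch_host_is_big_endian() ? val : sketch_bswap32(val);
}
```

Because both helpers are static inlines in a header-style unit, a caller
such as ucc_fast_enable() would not need a stack frame or a bl/b to an
external symbol just to touch one register.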

Is there some other primitive available that (a) is defined on all
architectures (or at least both ppc and arm) and (b) expands to good
code in both/all cases?

Note that a few uses of the iowrite32be accessors have already
appeared in the qe code with the introduction of the qe_clrsetbits()
helpers in bb8b2062af.
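
For readers unfamiliar with that style of helper, the read-modify-write
pattern looks roughly like the sketch below. This is an assumption-laden
userspace model, not the code from that commit: the sketch_* names are
hypothetical, and the register is modelled as four bytes of memory so the
example does not depend on host endianness (real MMIO would use a single
32-bit access).

```c
#include <stdint.h>

/* Hypothetical userspace model; not the helpers from the qe code. */

/* Assemble a 32-bit value from four bytes stored most significant first. */
static inline uint32_t sketch_load_be32(const volatile uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Store a 32-bit value as four bytes, most significant first. */
static inline void sketch_store_be32(uint32_t v, volatile uint8_t *p)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Clear the bits in 'clear', then set the bits in 'set'. */
static inline void sketch_clrsetbits_be32(volatile uint8_t *reg,
					  uint32_t clear, uint32_t set)
{
	uint32_t v = sketch_load_be32(reg);

	v = (v & ~clear) | set;
	sketch_store_be32(v, reg);
}
```

Each call performs one read and one write of the register, so the cost
of the underlying accessors is paid twice per helper invocation.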

Rasmus
