[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <381baa08d36bc8a8f7051ec8bdbe9620e83f925d.camel@au1.ibm.com>
Date: Thu, 22 Aug 2019 15:47:54 +1000
From: "Alastair D'Silva" <alastair@....ibm.com>
To: Christophe Leroy <christophe.leroy@....fr>,
Segher Boessenkool <segher@...nel.crashing.org>,
Michael Ellerman <mpe@...erman.id.au>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: RE: [RFC PATCH] powerpc: Convert ____flush_dcache_icache_phys() to C
On Thu, 2019-08-22 at 07:06 +0200, Christophe Leroy wrote:
>
> Le 22/08/2019 à 02:27, Alastair D'Silva a écrit :
> > On Wed, 2019-08-21 at 22:27 +0200, Christophe Leroy wrote:
> > > Le 20/08/2019 à 06:36, Alastair D'Silva a écrit :
> > > > On Fri, 2019-08-16 at 15:52 +0000, Christophe Leroy wrote:
> > >
> > > [...]
> > >
> > > > Thanks Christophe,
> > > >
> > > > I'm trying a somewhat different approach that requires less
> > > > knowledge
> > > > of assembler. Handling of CPU_FTR_COHERENT_ICACHE is outside
> > > > this
> > > > function. The code below is not a patch as my tree is a bit
> > > > messy,
> > > > sorry:
> > >
> > > Can we be 100% sure that GCC won't add any code accessing some
> > > global
> > > data or stack while the Data MMU is OFF ?
> > >
> > > Christophe
> > >
> >
> > +mpe
> >
> > I'm not sure how we would go about making such a guarantee, but
> > I've
> > tied every variable used to a register and addr is passed in a
> > register, so there is no stack usage, and every call in there only
> > operates on it's operands.
> >
> > The calls to the inline cache helpers (for the PPC32 case) are all
> > constants, so I can't see a reasonable scenario where there would
> > be a
> > function call and reordered to after the DR bit is turned off, but
> > I
> > guess if we want to be paranoid, we could always add an mb() call
> > before the DR bit is manipulated to prevent the compiler from
> > reordering across the section where the data MMU is disabled.
> >
> >
>
> Anyway, I think the benefit of converting that function to C is
> pretty
> small. flush_dcache_range() and friends were converted to C mainly
> in
> order to inline them. But this __flush_dcache_icache_phys() is too
> big
> to be worth inlining, yet small and stable enough to remain in
> assembly
> for the time being.
>
I disagree on this point, after converting it to C, using
44x/currituck.defconfig, the compiler definitely will inline it (noting
that there is only 1 caller of it):
00000134 <flush_dcache_icache_page>:
134: 94 21 ff f0 stwu r1,-16(r1)
138: 3d 20 00 00 lis r9,0
13c: 81 29 00 00 lwz r9,0(r9)
140: 7c 08 02 a6 mflr r0
144: 38 81 00 0c addi r4,r1,12
148: 90 01 00 14 stw r0,20(r1)
14c: 91 21 00 0c stw r9,12(r1)
150: 48 00 00 01 bl 150 <flush_dcache_icache_page+0x1c>
154: 39 00 00 20 li r8,32
158: 39 43 10 00 addi r10,r3,4096
15c: 7c 69 1b 78 mr r9,r3
160: 7d 09 03 a6 mtctr r8
164: 7c 00 48 6c dcbst 0,r9
168: 39 29 00 80 addi r9,r9,128
16c: 42 00 ff f8 bdnz 164 <flush_dcache_icache_page+0x30>
170: 7c 00 04 ac hwsync
174: 7c 69 1b 78 mr r9,r3
178: 7c 00 4f ac icbi 0,r9
17c: 39 29 00 80 addi r9,r9,128
180: 7f 8a 48 40 cmplw cr7,r10,r9
184: 40 9e ff f4 bne cr7,178 <flush_dcache_icache_page+0x44>
188: 7c 00 04 ac hwsync
18c: 4c 00 01 2c isync
190: 80 01 00 14 lwz r0,20(r1)
194: 38 21 00 10 addi r1,r1,16
198: 7c 08 03 a6 mtlr r0
19c: 48 00 00 00 b 19c <flush_dcache_icache_page+0x68>
> So I suggest you keep it aside your series for now, just move
> PURGE_PREFETCHED_INS inside it directly as it will be the only
> remaining
> user of it.
>
> Christophe
--
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
Powered by blists - more mailing lists