[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5601C0CC.6090503@c-s.fr>
Date: Tue, 22 Sep 2015 22:57:48 +0200
From: Christophe Leroy <christophe.leroy@....fr>
To: Joakim Tjernlund <joakim.tjernlund@...nsmode.se>,
"scottwood@...escale.com" <scottwood@...escale.com>
Cc: "paulus@...ba.org" <paulus@...ba.org>,
"mpe@...erman.id.au" <mpe@...erman.id.au>,
"benh@...nel.crashing.org" <benh@...nel.crashing.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [PATCH v2 22/25] powerpc32: move xxxxx_dcache_range() functions
inline
Le 22/09/2015 22:38, Joakim Tjernlund a écrit :
> On Tue, 2015-09-22 at 15:35 -0500, Scott Wood wrote:
>> On Tue, 2015-09-22 at 20:32 +0000, Joakim Tjernlund wrote:
>>> On Tue, 2015-09-22 at 15:14 -0500, Scott Wood wrote:
>>>> On Tue, 2015-09-22 at 19:55 +0000, Joakim Tjernlund wrote:
>>>>> On Tue, 2015-09-22 at 14:42 -0500, Scott Wood wrote:
>>>>>> On Tue, 2015-09-22 at 19:34 +0000, Joakim Tjernlund wrote:
>>>>>>> On Tue, 2015-09-22 at 13:58 -0500, Scott Wood wrote:
>>>>>>>> On Tue, 2015-09-22 at 18:12 +0000, Joakim Tjernlund wrote:
>>>>>>>>> On Tue, 2015-09-22 at 18:51 +0200, Christophe Leroy wrote:
>>>>>>>>>> flush/clean/invalidate _dcache_range() functions are all very
>>>>>>>>>> similar and are quite short. They are mainly used in
>>>>>>>>>> __dma_sync()
>>>>>>>>>> perf_event locate them in the top 3 consumming functions
>>>>>>>>>> during
>>>>>>>>>> heavy ethernet activity
>>>>>>>>>>
>>>>>>>>>> They are good candidate for inlining, as __dma_sync() does
>>>>>>>>>> almost nothing but calling them
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Christophe Leroy <christophe.leroy@....fr>
>>>>>>>>>> ---
>>>>>>>>>> New in v2
>>>>>>>>>>
>>>>>>>>>> arch/powerpc/include/asm/cacheflush.h | 55
>>>>>>>>>> +++++++++++++++++++++++++++--
>>>>>>>>>> arch/powerpc/kernel/misc_32.S | 65 ------------------
>>>>>>>>>> ----
>>>>>>>>>> ----
>>>>>>>>>> ----
>>>>>>>>>> -----
>>>>>>>>>> arch/powerpc/kernel/ppc_ksyms.c | 2 ++
>>>>>>>>>> 3 files changed, 54 insertions(+), 68 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/arch/powerpc/include/asm/cacheflush.h
>>>>>>>>>> b/arch/powerpc/include/asm/cacheflush.h
>>>>>>>>>> index 6229e6b..6169604 100644
>>>>>>>>>> --- a/arch/powerpc/include/asm/cacheflush.h
>>>>>>>>>> +++ b/arch/powerpc/include/asm/cacheflush.h
>>>>>>>>>> @@ -47,12 +47,61 @@ static inline void
>>>>>>>>>> __flush_dcache_icache_phys(unsigned long physaddr)
>>>>>>>>>> }
>>>>>>>>>> #endif
>>>>>>>>>>
>>>>>>>>>> -extern void flush_dcache_range(unsigned long start, unsigned
>>>>>>>>>> long
>>>>>>>>>> stop);
>>>>>>>>>> #ifdef CONFIG_PPC32
>>>>>>>>>> -extern void clean_dcache_range(unsigned long start, unsigned
>>>>>>>>>> long
>>>>>>>>>> stop);
>>>>>>>>>> -extern void invalidate_dcache_range(unsigned long start,
>>>>>>>>>> unsigned
>>>>>>>>>> long
>>>>>>>>>> stop);
>>>>>>>>>> +/*
>>>>>>>>>> + * Write any modified data cache blocks out to memory and
>>>>>>>>>> invalidate
>>>>>>>>>> them.
>>>>>>>>>> + * Does not invalidate the corresponding instruction cache
>>>>>>>>>> blocks.
>>>>>>>>>> + */
>>>>>>>>>> +static inline void flush_dcache_range(unsigned long start,
>>>>>>>>>> unsigned
>>>>>>>>>> long
>>>>>>>>>> stop)
>>>>>>>>>> +{
>>>>>>>>>> + void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
>>>>>>>>>> + unsigned int size = stop - (unsigned long)addr +
>>>>>>>>>> (L1_CACHE_BYTES -
>>>>>>>>>> 1);
>>>>>>>>>> + unsigned int i;
>>>>>>>>>> +
>>>>>>>>>> + for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr +=
>>>>>>>>>> L1_CACHE_BYTES)
>>>>>>>>>> + dcbf(addr);
>>>>>>>>>> + if (i)
>>>>>>>>>> + mb(); /* sync */
>>>>>>>>>> +}
>>>>>>>>> This feels optimized for the uncommon case when there is no
>>>>>>>>> invalidation.
>>>>>>>> If you mean the "if (i)", yes, that looks odd.
>>>>>>> Yes.
>>>>>>>
>>>>>>>>> I THINK it would be better to bail early
>>>>>>>> Bail under what conditions?
>>>>>>> test for "i = 0" and return.
>>>>>> Why bother?
>>>>> I usally find it better to dela with special cases upfront så the rest
>>>>> doesn't need to
>>>>> bother. i=0 is a NOP and it is clearer to show that upfront.
>>>> No, I mean why bother special casing this at all?
>>> I just said why?
>>> You to found the if(i) mb() a bit odd and it took a little time to see why
>>> it was there.
>>> Ahh, you mean just skip the if(i) and have mb() unconditionally?
>> Yes.
>>
>>> That changes the semantics a little from the ASM version but perhaps that
>>> doesn't matter?
>> Adding more barriers than strictly necessary is a performance concern, not a
>> semantic change.
> Semantics :)
>
>> How often are we really calling this function over an empty
>> range?
> Never hopefully, it does not make much sense.
>
>> Not that it matters much one way or another...
> probably not.
>
Here is what I get in asm. First one is with "if (i) mb();". We see gcc
puts a beqlr. This is the form that is closest to what we had in the
former misc_32.S
Second one if with "mb()". Here we get a branch to sync for a useless sync
c000e0ac <my_flush_dcache_range1>:
c000e0ac: 54 63 00 36 rlwinm r3,r3,0,0,27
c000e0b0: 38 84 00 0f addi r4,r4,15
c000e0b4: 7d 23 20 50 subf r9,r3,r4
c000e0b8: 55 29 e1 3f rlwinm. r9,r9,28,4,31
c000e0bc: 4d 82 00 20 beqlr
c000e0c0: 7d 29 03 a6 mtctr r9
c000e0c4: 7c 00 18 6c dcbst 0,r3
c000e0c8: 38 63 00 10 addi r3,r3,16
c000e0cc: 42 00 ff f8 bdnz c000e0c4
<my_flush_dcache_range1+0x18>
c000e0d0: 7c 00 04 ac sync
c000e0d4: 4e 80 00 20 blr
c000e0d8 <my_flush_dcache_range2>:
c000e0d8: 54 63 00 36 rlwinm r3,r3,0,0,27
c000e0dc: 38 84 00 0f addi r4,r4,15
c000e0e0: 7d 23 20 50 subf r9,r3,r4
c000e0e4: 55 29 e1 3f rlwinm. r9,r9,28,4,31
c000e0e8: 41 82 00 14 beq c000e0fc
<my_flush_dcache_range2+0x24>
c000e0ec: 7d 29 03 a6 mtctr r9
c000e0f0: 7c 00 18 6c dcbst 0,r3
c000e0f4: 38 63 00 10 addi r3,r3,16
c000e0f8: 42 00 ff f8 bdnz c000e0f0
<my_flush_dcache_range2+0x18>
c000e0fc: 7c 00 04 ac sync
c000e100: 4e 80 00 20 blr
Christophe
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists