[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4759f5e9-24a6-7710-86a0-c8e45f5decb7@c-s.fr>
Date: Sat, 28 Mar 2020 10:19:52 +0000
From: Christophe Leroy <christophe.leroy@....fr>
To: Leonardo Bras <leonardo@...ux.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Enrico Weigelt <info@...ux.net>,
Allison Randal <allison@...utok.net>,
Thomas Gleixner <tglx@...utronix.de>
Cc: linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash
Hi Leonardo,
On 03/27/2020 03:51 PM, Leonardo Bras wrote:
> Hello Christophe, thanks for the feedback.
>
> I noticed an error in this patch and sent a v2, that can be seen here:
> http://patchwork.ozlabs.org/patch/1262468/
>
> Comments inline::
>
> On Fri, 2020-03-27 at 07:50 +0100, Christophe Leroy wrote:
>>> @@ -142,6 +144,8 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
>>> if (likely(__arch_spin_trylock(lock) == 0))
>>> break;
>>> do {
>>> + if (unlikely(crash_skip_spinlock))
>>> + return;
>
> Complete function for reference:
> static inline void arch_spin_lock(arch_spinlock_t *lock)
> {
> while (1) {
> if (likely(__arch_spin_trylock(lock) == 0))
> break;
> do {
> if (unlikely(crash_skip_spinlock))
> return;
> HMT_low();
> if (is_shared_processor())
> splpar_spin_yield(lock);
> } while (unlikely(lock->slock != 0));
> HMT_medium();
> }
> }
>
>> You are adding a test that reads a global var in the middle of a so hot
>> path ? That must kill performance.
>
> I thought it would, in worst case scenario, increase a maximum delay of
> an arch_spin_lock() call 1 spin cycle. Here is what I thought:
>
> - If the lock is already free, it would change nothing,
> - Otherwise, the lock will wait.
> - Waiting cycle just got bigger.
> - Worst case scenario: running one more cycle, given lock->slock can
> turn to 0 just after checking.
>
> Could you please point where I failed to see the performance penalty?
> (I need to get better at this :) )
You are right that when the lock is free, it changes nothing. However
when it is not, it is not just one cycle.
Here is arch_spin_lock() without your patch:
00000440 <my_lock>:
440: 39 40 00 01 li r10,1
444: 7d 20 18 28 lwarx r9,0,r3
448: 2c 09 00 00 cmpwi r9,0
44c: 40 82 00 10 bne 45c <my_lock+0x1c>
450: 7d 40 19 2d stwcx. r10,0,r3
454: 40 a2 ff f0 bne 444 <my_lock+0x4>
458: 4c 00 01 2c isync
45c: 2f 89 00 00 cmpwi cr7,r9,0
460: 4d be 00 20 bclr+ 12,4*cr7+eq
464: 7c 21 0b 78 mr r1,r1
468: 81 23 00 00 lwz r9,0(r3)
46c: 2f 89 00 00 cmpwi cr7,r9,0
470: 40 be ff f4 bne cr7,464 <my_lock+0x24>
474: 7c 42 13 78 mr r2,r2
478: 7d 20 18 28 lwarx r9,0,r3
47c: 2c 09 00 00 cmpwi r9,0
480: 40 82 00 10 bne 490 <my_lock+0x50>
484: 7d 40 19 2d stwcx. r10,0,r3
488: 40 a2 ff f0 bne 478 <my_lock+0x38>
48c: 4c 00 01 2c isync
490: 2f 89 00 00 cmpwi cr7,r9,0
494: 40 be ff d0 bne cr7,464 <my_lock+0x24>
498: 4e 80 00 20 blr
Here is arch_spin_lock() with your patch. I enclose with === what comes
in addition:
00000440 <my_lock>:
440: 39 40 00 01 li r10,1
444: 7d 20 18 28 lwarx r9,0,r3
448: 2c 09 00 00 cmpwi r9,0
44c: 40 82 00 10 bne 45c <my_lock+0x1c>
450: 7d 40 19 2d stwcx. r10,0,r3
454: 40 a2 ff f0 bne 444 <my_lock+0x4>
458: 4c 00 01 2c isync
45c: 2f 89 00 00 cmpwi cr7,r9,0
460: 4d be 00 20 bclr+ 12,4*cr7+eq
=====================================================
464: 3d 40 00 00 lis r10,0
466: R_PPC_ADDR16_HA crash_skip_spinlock
468: 39 4a 00 00 addi r10,r10,0
46a: R_PPC_ADDR16_LO crash_skip_spinlock
46c: 39 00 00 01 li r8,1
470: 89 2a 00 00 lbz r9,0(r10)
474: 2f 89 00 00 cmpwi cr7,r9,0
478: 4c 9e 00 20 bnelr cr7
=====================================================
47c: 7c 21 0b 78 mr r1,r1
480: 81 23 00 00 lwz r9,0(r3)
484: 2f 89 00 00 cmpwi cr7,r9,0
488: 40 be ff f4 bne cr7,47c <my_lock+0x3c>
48c: 7c 42 13 78 mr r2,r2
490: 7d 20 18 28 lwarx r9,0,r3
494: 2c 09 00 00 cmpwi r9,0
498: 40 82 00 10 bne 4a8 <my_lock+0x68>
49c: 7d 00 19 2d stwcx. r8,0,r3
4a0: 40 a2 ff f0 bne 490 <my_lock+0x50>
4a4: 4c 00 01 2c isync
4a8: 2f 89 00 00 cmpwi cr7,r9,0
4ac: 40 be ff c4 bne cr7,470 <my_lock+0x30>
4b0: 4e 80 00 20 blr
Christophe
Powered by blists - more mailing lists