Message-ID: <a4b3b3a260b94bfdb46a4b5d57b36f01@AcuMS.aculab.com>
Date: Fri, 3 May 2024 16:16:21 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Waiman Long' <longman@...hat.com>, "'linux-kernel@...r.kernel.org'"
<linux-kernel@...r.kernel.org>, "'peterz@...radead.org'"
<peterz@...radead.org>
CC: "'mingo@...hat.com'" <mingo@...hat.com>, "'will@...nel.org'"
<will@...nel.org>, "'boqun.feng@...il.com'" <boqun.feng@...il.com>, "'Linus
Torvalds'" <torvalds@...ux-foundation.org>,
"'virtualization@...ts.linux-foundation.org'"
<virtualization@...ts.linux-foundation.org>, 'Zeng Heng'
<zengheng4@...wei.com>
Subject: RE: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and
per_cpu_ptr().
From: Waiman Long
> Sent: 03 May 2024 17:00
> To: David Laight <David.Laight@...LAB.COM>; 'linux-kernel@...r.kernel.org' <linux-kernel@...r.kernel.org>;
> 'peterz@...radead.org' <peterz@...radead.org>
> Cc: 'mingo@...hat.com' <mingo@...hat.com>; 'will@...nel.org' <will@...nel.org>; 'boqun.feng@...il.com' <boqun.feng@...il.com>;
> 'Linus Torvalds' <torvalds@...ux-foundation.org>;
> 'virtualization@...ts.linux-foundation.org' <virtualization@...ts.linux-foundation.org>; 'Zeng Heng' <zengheng4@...wei.com>
> Subject: Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().
>
>
> On 12/31/23 23:14, Waiman Long wrote:
> >
> > On 12/31/23 16:55, David Laight wrote:
> >> per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
> >> This requires the cpu number be 64bit.
> >> However the value in osq_lock() comes from a 32bit xchg() and there
> >> isn't a way of telling gcc the high bits are zero (they are) so
> >> there will always be an instruction to clear the high bits.
> >>
> >> The cpu number is also offset by one (to make the initialiser 0).
> >> It seems to be impossible to get gcc to convert
> >> __per_cpu_offset[cpu_p1 - 1]
> >> into (__per_cpu_offset - 1)[cpu_p1] (transferring the offset to the
> >> address).
> >>
> >> Converting the cpu number to 32bit unsigned prior to the decrement means
> >> that gcc knows the decrement has set the high bits to zero and doesn't
> >> add a register-register move (or cltq) to zero/sign extend the value.
> >>
> >> Not massive but saves two instructions.
> >>
> >> Signed-off-by: David Laight <david.laight@...lab.com>
> >> ---
> >> kernel/locking/osq_lock.c | 6 ++----
> >> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> >> index 35bb99e96697..37a4fa872989 100644
> >> --- a/kernel/locking/osq_lock.c
> >> +++ b/kernel/locking/osq_lock.c
> >> @@ -29,11 +29,9 @@ static inline int encode_cpu(int cpu_nr)
> >> return cpu_nr + 1;
> >> }
> >> -static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
> >> +static inline struct optimistic_spin_node *decode_cpu(unsigned int encoded_cpu_val)
> >> {
> >> - int cpu_nr = encoded_cpu_val - 1;
> >> -
> >> - return per_cpu_ptr(&osq_node, cpu_nr);
> >> + return per_cpu_ptr(&osq_node, encoded_cpu_val - 1);
> >> }
> >> /*
> >
> > You really like micro-optimization.
> >
> > Anyway,
> >
> > Reviewed-by: Waiman Long <longman@...hat.com>
> >
> David,
>
> Could you respin the series based on the latest upstream code?
Looks like a wet bank holiday weekend.....
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
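
For readers following the quoted patch, here is a minimal user-space sketch of
the two decode_cpu() shapes it compares. It is an illustration only, not the
kernel code: fake_nodes[], fake_per_cpu_offset[], NR_FAKE_CPUS,
decode_signed(), decode_unsigned() and decoders_agree() are hypothetical names
standing in for the per-cpu machinery, and the asserts are added here purely
as bounds checks.

```c
#include <assert.h>

#define NR_FAKE_CPUS 4	/* hypothetical CPU count for this sketch */

struct fake_node {
	int cpu;
};

/* Stand-in for the per-cpu osq_node copies: one node per CPU. */
static struct fake_node fake_nodes[NR_FAKE_CPUS];

/* Stand-in for __per_cpu_offset[]: byte offset of each CPU's copy. */
static unsigned long fake_per_cpu_offset[NR_FAKE_CPUS];

static void init_offsets(void)
{
	for (int i = 0; i < NR_FAKE_CPUS; i++) {
		fake_nodes[i].cpu = i;
		fake_per_cpu_offset[i] = (unsigned long)i * sizeof(struct fake_node);
	}
}

/* Old shape: the signed index must be sign-extended before 64bit indexing. */
static struct fake_node *decode_signed(int encoded)
{
	int cpu_nr = encoded - 1;

	assert(cpu_nr >= 0 && cpu_nr < NR_FAKE_CPUS);
	return (struct fake_node *)((char *)fake_nodes + fake_per_cpu_offset[cpu_nr]);
}

/* New shape: the 32bit unsigned decrement is used directly as the index. */
static struct fake_node *decode_unsigned(unsigned int encoded)
{
	/* encoded == 0 would wrap to UINT_MAX, so it must never be passed in. */
	assert(encoded >= 1 && encoded <= NR_FAKE_CPUS);
	return (struct fake_node *)((char *)fake_nodes + fake_per_cpu_offset[encoded - 1]);
}

/* Returns 1 if both shapes pick the same node for every valid encoded value. */
static int decoders_agree(void)
{
	init_offsets();
	for (int enc = 1; enc <= NR_FAKE_CPUS; enc++) {
		if (decode_signed(enc) != decode_unsigned((unsigned int)enc))
			return 0;
		if (decode_unsigned((unsigned int)enc)->cpu != enc - 1)
			return 0;
	}
	return 1;
}
```

Both functions return the same pointer for every valid encoded value; any
difference is only in the generated instructions. Building with something like
"gcc -O2 -DNDEBUG -S" and comparing the two function bodies is one way to look
for the extra extend instruction, though the exact output depends on the
compiler version and target.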