Message-ID: <20191121185303.GB199273@romley-ivt3.sc.intel.com>
Date: Thu, 21 Nov 2019 10:53:03 -0800
From: Fenghua Yu <fenghua.yu@...el.com>
To: Andy Lutomirski <luto@...nel.org>
Cc: David Laight <David.Laight@...lab.com>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
H Peter Anvin <hpa@...or.com>,
Tony Luck <tony.luck@...el.com>,
Ashok Raj <ashok.raj@...el.com>,
Ravi V Shankar <ravi.v.shankar@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>, x86 <x86@...nel.org>
Subject: Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by
kernel parameter
On Thu, Nov 21, 2019 at 09:51:03AM -0800, Andy Lutomirski wrote:
> On Thu, Nov 21, 2019 at 9:43 AM David Laight <David.Laight@...lab.com> wrote:
> >
> > From: Ingo Molnar
> > > Sent: 21 November 2019 17:12
> > > * Peter Zijlstra <peterz@...radead.org> wrote:
> > ...
> > > > This feature MUST be default enabled, otherwise everything will
> > > > be/remain broken and we'll end up in the situation where you can't use
> > > > it even if you wanted to.
> > >
> > > Agreed.
> >
> > Before it can be enabled by default someone needs to go through the
> > kernel and fix all the code that abuses the 'bit' functions by using them
> > on int[] instead of long[].
> >
> > I've only seen one fix go through for one use case of one piece of code
> > that repeatedly uses potentially misaligned int[] arrays for bitmasks.
> >
>
> Can we really not just change the lock asm to use 32-bit accesses for
> set_bit(), etc? Sure, it will fail if the bit index is greater than
> 2^32, but that seems nuts.
>
> (Why the *hell* do the bitops use long anyway? They're *bit masks*
> for crying out loud. As in, users generally want to operate on fixed
> numbers of bits.)
We are working on a separate patch set to fix all of the split lock issues
in the atomic bitops. Per suggestions from Peter Anvin and Tony Luck:
1. Still keep the byte optimization if nr is a constant. No split lock.
2. If the type of *addr is unsigned long, do a 64-bit (quadword) atomic
   instruction on addr. No split lock.
3. If the type of *addr is unsigned int, do a 32-bit atomic instruction
   on addr. No split lock.
4. Otherwise, re-calculate addr to point to the 32-bit word that contains
   the bit and operate on that bit. No split lock.
Only a small percentage of atomic bitops calls fall into case 4 (e.g. ~3%
for set_bit()); those need a few extra instructions to re-calculate the
address, but they avoid the much larger split lock overhead.
To get the real type of *addr instead of the cast type "unsigned long",
the atomic bitops APIs are changed from functions to macros. This change
needs to touch all architectures.
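
To give a feel for the shape (again only a rough sketch of mine, not the
actual patches; set_bit_long() and set_bit_int() are hypothetical helpers
for cases 2 and 3, and set_bit_unaligned() is the case-4 fallback sketched
above), the macro could dispatch on the real type of *addr like this:

/*
 * Sketch only: compile-time dispatch on the real operand type.  The
 * constant-nr byte optimization (case 1) is left out for brevity.
 */
#define set_bit(nr, addr)						\
do {									\
	if (__builtin_types_compatible_p(typeof(*(addr)),		\
					 unsigned long))		\
		/* case 2: 64-bit atomic op on a naturally aligned long */ \
		set_bit_long((nr), (volatile void *)(addr));		\
	else if (__builtin_types_compatible_p(typeof(*(addr)),		\
					      unsigned int))		\
		/* case 3: 32-bit atomic op on a naturally aligned int */ \
		set_bit_int((nr), (volatile void *)(addr));		\
	else								\
		/* case 4: re-point at the containing 32-bit word */	\
		set_bit_unaligned((nr), (volatile void *)(addr));	\
} while (0)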
Thanks.
-Fenghua