Message-ID: <063D6719AE5E284EB5DD2968C1650D6D1CBA42F5@AcuExch.aculab.com>
Date: Mon, 28 Sep 2015 15:31:17 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'James Bottomley' <James.Bottomley@...senPartnership.com>
CC: "'Rafael J. Wysocki'" <rjw@...ysocki.net>,
Viresh Kumar <viresh.kumar@...aro.org>,
Johannes Berg <johannes@...solutions.net>,
"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,
Linaro Kernel Mailman List <linaro-kernel@...ts.linaro.org>,
QCA ath9k Development <ath9k-devel@....qualcomm.com>,
Intel Linux Wireless <ilw@...ux.intel.com>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Linux ACPI <linux-acpi@...r.kernel.org>,
"open list:BLUETOOTH DRIVERS" <linux-bluetooth@...r.kernel.org>,
"open list:AMD IOMMU (AMD-VI)" <iommu@...ts.linux-foundation.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"open list:NETWORKING DRIVERS (WIRELESS)"
<linux-wireless@...r.kernel.org>,
"open list:TARGET SUBSYSTEM" <linux-scsi@...r.kernel.org>,
"open list:ULTRA-WIDEBAND (UWB) SUBSYSTEM:"
<linux-usb@...r.kernel.org>,
"open list:EDAC-CORE" <linux-edac@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
"moderated list:SOUND - SOC LAYER / DYNAMIC AUDIO POWER MANAGEM..."
<alsa-devel@...a-project.org>
Subject: RE: [PATCH V4 1/2] ACPI / EC: Fix broken 64bit big-endian users of
'global_lock'
From: James Bottomley
> Sent: 28 September 2015 16:12
> > > > The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.
> > >
> > > That's different: it's an atomic RMW operation. The problem with the
> > > alpha was that the operation wasn't atomic (meaning that it can't be
> > > interrupted and no intermediate output states are visible).
> >
> > It is only atomic if prefixed by the 'lock' prefix.
> > Normally the read and write are separate bus cycles.
>
> The essential point is that x86 has atomic bit ops and byte writes.
> Early alpha did not.
Early alpha didn't have any byte accesses.
On x86 if you have the following:
struct {
char a;
volatile char b;
} *foo;
foo->a |= 4;
The compiler is likely to generate an 'orl $4, 0(%rbx)' (or similar)
and the cpu will do two 32bit memory cycles, a read and a write, that
also cover the 'volatile' field 'b'.
(gcc definitely used to do this...)
A lot of fields were made 32bit (and probably not bitfields) in the Linux
kernel tree a year or two ago to avoid this very problem.
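To make the hazard concrete, here is a minimal sketch that emulates the
two access widths in plain C. It is not compiler output. The names
emulate_wide_rmw(), emulate_byte_rmw() and racing_b, and the two bytes of
padding, are assumptions made for the illustration. 'b' is left
non-volatile here only because the emulation copies bytes with memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the struct above, padded to 4 bytes (an assumption)
 * so that one 32-bit access covers the whole thing. */
struct foo {
	char a;
	char b;		/* 'volatile' in the mail, see the note above */
	char pad[2];
};

/* Emulate 'foo->a |= 4' done as a 32-bit read-modify-write.
 * 'racing_b' is stored to 'b' between the read and the write, standing
 * in for an update from an interrupt or another cpu.
 * Returns the final value of 'b'. */
char emulate_wide_rmw(struct foo *f, char racing_b)
{
	unsigned char word[sizeof(*f)];

	assert(f != NULL);
	memcpy(word, f, sizeof(word));		/* wide read: snapshots 'a' and 'b' */
	f->b = racing_b;			/* 'b' changes behind our back */
	word[offsetof(struct foo, a)] |= 4;	/* modify only the 'a' byte */
	memcpy(f, word, sizeof(word));		/* wide write: stale 'b' goes back */
	return f->b;
}

/* The same update done as a byte-wide read-modify-write. */
char emulate_byte_rmw(struct foo *f, char racing_b)
{
	char a;

	assert(f != NULL);
	a = f->a;		/* byte read: touches 'a' only */
	f->b = racing_b;	/* 'b' changes behind our back */
	f->a = a | 4;		/* byte write: 'b' is left alone */
	return f->b;
}
```

With 'b' starting at 10 and the racing store writing 20, the wide version
leaves 'b' at 10 (the update is lost) while the byte version leaves it at
20. In both cases bit 2 of 'a' ends up set.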
> > > > You still have to ensure the compiler doesn't do wider rmw cycles.
> > > > I believe the recent versions of gcc won't do wider accesses for volatile data.
> > >
> > > I don't understand this comment. You seem to be implying gcc would do a
> > > 64 bit RMW for a 32 bit store ... that would be daft when a single
> > > instruction exists to perform the operation on all architectures.
> >
> > Read the object code and weep...
> > It is most likely to happen for operations that are rmw (eg bit set).
> > For instance the arm cpu has limited offsets for 16bit accesses, for
> > normal structures the compiler is likely to use a 32bit rmw sequence
> > for a 16bit field that has a large offset.
> > The C language allows the compiler to do it for any access (IIRC including
> > volatiles).
>
> I think you might be confusing different things. Most RISC CPUs can't
> do 32 bit store immediates because there aren't enough bits in their
> arsenal, so they tend to split 32 bit loads into a left and right part
> (first the top then the offset). This (and other things) are mostly
> what you see in code. However, 32 bit register stores are still atomic,
> which is all we require. It's not really the compiler's fault, it's
> mostly an architectural limitation.
No, I'm not talking about how 32bit constants are generated.
I'm talking about structure offsets.
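A small sketch of the structure-offset case, assuming the classic 32-bit
ARM (A32) encodings, where ldrh/strh take an 8-bit immediate offset and
ldr/str a 12-bit one. The struct, the macros and the helper names are all
hypothetical. The code only shows that the field is out of reach of the
halfword form while its containing word is in reach of the word form; it
does not show what any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed immediate-offset limits for the A32 load/store forms. */
#define A32_HALFWORD_IMM_MAX	255	/* ldrh/strh */
#define A32_WORD_IMM_MAX	4095	/* ldr/str */

/* Hypothetical structure: a 16-bit field pushed past offset 255. */
struct big {
	uint8_t  filler[300];
	uint16_t flags;		/* offset 300 */
	uint16_t other;		/* offset 302, same 32-bit word as 'flags' */
};

/* Can a halfword access reach this offset with an immediate? */
int fits_halfword_imm(size_t offset)
{
	return offset <= A32_HALFWORD_IMM_MAX;
}

/* Can a word access reach this offset with an immediate? */
int fits_word_imm(size_t offset)
{
	return offset <= A32_WORD_IMM_MAX;
}

/* Offset of the aligned 32-bit word that contains 'offset'. */
size_t containing_word(size_t offset)
{
	return offset & ~(size_t)3;
}

/* Sanity check of the layout assumed in the comments above. */
void check_big_layout(void)
{
	assert(offsetof(struct big, flags) == 300);
	assert(offsetof(struct big, other) == 302);
}
```

Here 'flags' cannot be addressed directly by the halfword form, but the
word at offset 300 can be addressed by the word form, and that word also
holds 'other'. A 32bit rmw used to update 'flags' would therefore
rewrite 'other' as well.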
David