linux-kernel - RE: [PATCH V4 1/2] ACPI / EC: Fix broken 64bit big-endian users of 'global

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <063D6719AE5E284EB5DD2968C1650D6D1CBA42F5@AcuExch.aculab.com>
Date:	Mon, 28 Sep 2015 15:31:17 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'James Bottomley' <James.Bottomley@...senPartnership.com>
CC:	"'Rafael J. Wysocki'" <rjw@...ysocki.net>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Johannes Berg <johannes@...solutions.net>,
	"Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,
	Linaro Kernel Mailman List <linaro-kernel@...ts.linaro.org>,
	QCA ath9k Development <ath9k-devel@....qualcomm.com>,
	Intel Linux Wireless <ilw@...ux.intel.com>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	Linux ACPI <linux-acpi@...r.kernel.org>,
	"open list:BLUETOOTH DRIVERS" <linux-bluetooth@...r.kernel.org>,
	"open list:AMD IOMMU (AMD-VI)" <iommu@...ts.linux-foundation.org>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"open list:NETWORKING DRIVERS (WIRELESS)" 
	<linux-wireless@...r.kernel.org>,
	"open list:TARGET SUBSYSTEM" <linux-scsi@...r.kernel.org>,
	"open list:ULTRA-WIDEBAND (UWB) SUBSYSTEM:" 
	<linux-usb@...r.kernel.org>,
	"open list:EDAC-CORE" <linux-edac@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	"moderated list:SOUND - SOC LAYER / DYNAMIC AUDIO POWER MANAGEM..." 
	<alsa-devel@...a-project.org>
Subject: RE: [PATCH V4 1/2] ACPI / EC: Fix broken 64bit big-endian users of
 'global_lock'

From: James Bottomley 
> Sent: 28 September 2015 16:12
> > > > The x86 cpus will also do 32bit wide rmw cycles for the 'bit' operations.
> > >
> > > That's different: it's an atomic RMW operation.  The problem with the
> > > alpha was that the operation wasn't atomic (meaning that it can't be
> > > interrupted and no intermediate output states are visible).
> >
> > It is only atomic if prefixed by the 'lock' prefix.
> > Normally the read and write are separate bus cycles.
> 
> The essential point is that x86 has atomic bit ops and byte writes.
> Early alpha did not.

Early alpha didn't have any byte accesses.

On x86 if you have the following:
	struct {
		char  a;
		volatile char b;
	} *foo;
	foo->a |= 4;

The compiler is likely to generate a 'bis #4, 0(rbx)' (or similar)
and the cpu will do two 32bit memory cycles that read and write
the 'volatile' field 'b'.
(gcc definitely used to do this...)

A lot of fields were made 32bit (and probably not bitfields) in the linux
kernel tree a year or two ago to avoid this very problem.

> > > > You still have to ensure the compiler doesn't do wider rmw cycles.
> > > > I believe the recent versions of gcc won't do wider accesses for volatile data.
> > >
> > > I don't understand this comment.  You seem to be implying gcc would do a
> > > 64 bit RMW for a 32 bit store ... that would be daft when a single
> > > instruction exists to perform the operation on all architectures.
> >
> > Read the object code and weep...
> > It is most likely to happen for operations that are rmw (eg bit set).
> > For instance the arm cpu has limited offsets for 16bit accesses, for
> > normal structures the compiler is likely to use a 32bit rmw sequence
> > for a 16bit field that has a large offset.
> > The C language allows the compiler to do it for any access (IIRC including
> > volatiles).
> 
> I think you might be confusing different things.  Most RISC CPUs can't
> do 32 bit store immediates because there aren't enough bits in their
> arsenal, so they tend to split 32 bit loads into a left and right part
> (first the top then the offset).  This (and other things) are mostly
> what you see in code.  However, 32 bit register stores are still atomic,
> which is all we require.  It's not really the compiler's fault, it's
> mostly an architectural limitation.

No, I'm not talking about how 32bit constants are generated.
I'm talking about structure offsets.

	David