linux-kernel - Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic

Message-ID: <4A173580.9050304@zytor.com>
Date:	Fri, 22 May 2009 16:30:08 -0700
From:	"H. Peter Anvin" <hpa@...or.com>
To:	lkml@...ethan.org
CC:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org
Subject: Re: [BUG FIX] Make x86_32 uni-processor Atomic ops, Atomic

If there is a driver which relies on locked operations to be atomic with
respect to the I/O subsystem, it needs to use true locks, not LOCK_PREFIX.
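
A minimal userspace sketch of that distinction, assuming GCC or Clang
inline asm.  SKETCH_SMP, SKETCH_LOCK_PREFIX and the function names are
hypothetical stand-ins for illustration, not the kernel's definitions;
the only point is that a macro-supplied prefix can vanish on a
uni-processor build while a hard-coded one cannot.

```c
#include <assert.h>

/* Hypothetical stand-in for a CONFIG_SMP-style build switch. */
#ifndef SKETCH_SMP
#define SKETCH_SMP 0
#endif

/* Expands to "lock; " on SMP-style builds and to nothing otherwise. */
#if SKETCH_SMP
#define SKETCH_LOCK_PREFIX "lock; "
#else
#define SKETCH_LOCK_PREFIX ""
#endif

#if defined(__i386__) || defined(__x86_64__)
/* Locked only when SKETCH_SMP is set; a plain "incl" otherwise. */
static inline void sketch_inc_maybe_locked(volatile int *p)
{
    __asm__ __volatile__(SKETCH_LOCK_PREFIX "incl %0"
                         : "+m"(*p) : : "memory", "cc");
}

/* Always carries the lock prefix, whatever the build configuration. */
static inline void sketch_inc_always_locked(volatile int *p)
{
    __asm__ __volatile__("lock; incl %0"
                         : "+m"(*p) : : "memory", "cc");
}
#else
/* Non-x86 fallback so the sketch still builds elsewhere. */
static inline void sketch_inc_maybe_locked(volatile int *p)
{
    *p = *p + 1;                                /* plain read-modify-write */
}

static inline void sketch_inc_always_locked(volatile int *p)
{
    __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST); /* compiler builtin */
}
#endif

/* Single-threaded demo: applies both increments n times, returns the total. */
int sketch_count(int n)
{
    volatile int counter = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        sketch_inc_maybe_locked(&counter);
        sketch_inc_always_locked(&counter);
    }
    return counter;
}
```

Both variants give the same result when nothing else touches the
counter; they differ only in whether the bus lock is asserted, which is
what matters to an agent outside the CPU.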

An interrupt cannot be taken between the two parts (the read and the
write) of a lockable instruction, even if it isn't locked.  (There are
non-atomic instructions in the x86 architecture, but they can never be
locked.)

The other thing that you might be seeing is that a locked operation may
be slow enough to keep an otherwise-present race condition from being
triggered.

> That tells us nothing, since the CPU technical details are under NDA.

Have you considered that you might be running into a CPU bug or design
error?  There was the out-of-order store bug on the Winchip that needed
workarounds (CONFIG_X86_OOSTORE); I don't think those were ever well
tested, and they might very well have bitrotted.

> All that can be done in this case is report behavior differences from
> the closest publicly described processor (Pentium-M).
> 
> For that purpose, I suggest that a single processor box, with other 
> hardware that makes memory access independent of the processor's 
> control using a processor older than P-4 is a potential test bed.
> "Other hardware that makes memory access..." I previously termed:
> "bus master DMA" - which is overly specific.  It misleads people
> into thinking I am seeing hardware control issues rather than
> non-exclusive memory access.
> 
> My earlier comments about taking an interrupt between the memory read
> and the memory write operations are from a different manual than the
> one posted, a manual that only applies to processors older than
> the ones supported by the Linux kernel.
> Sorry, my bad, grabbed the wrong book, posted the correct link (SH).
> 
> Until one or more specific usages of the LOCK_PREFIX macro can be
> demonstrated to be incorrect (at least for some of the processors
> using this code), making the posted change is a single-point change
> that gives a pair of builds (one with, one without) whose behavior
> can be compared on the test bed.
> 
> It is *not* the preferred change for a general release kernel, the
> preferred change would be one that makes a specific rather than
> general correction.  
> Perhaps only for some functions, perhaps only for some of the 
> processors that currently select this code.
> 
> The observation that executing an unnecessary 'lock' opcode in some
> cases slows down the machine is not, in my view, significant to
> reproducing my observations.  Note: I have been wrong before.

What makes you draw that conclusion, in particular?  A lock prefix
typically slows down the following instruction dramatically, on some
processors by many hundreds of cycles.
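
A rough way to see the cost on a given machine is to time the same
increment with and without the prefix.  This is only a sketch, assuming
GCC or Clang inline asm; the function names are hypothetical, clock()
is coarse, and the ratio will differ widely between processors, so no
particular numbers are implied.

```c
#include <assert.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
/* Plain read-modify-write increment, no lock prefix. */
static inline void timing_inc_plain(volatile int *p)
{
    __asm__ __volatile__("incl %0" : "+m"(*p) : : "memory", "cc");
}

/* The same increment with the lock prefix. */
static inline void timing_inc_locked(volatile int *p)
{
    __asm__ __volatile__("lock; incl %0" : "+m"(*p) : : "memory", "cc");
}
#else
/* Non-x86 fallback so the sketch still builds elsewhere. */
static inline void timing_inc_plain(volatile int *p)
{
    *p = *p + 1;
}

static inline void timing_inc_locked(volatile int *p)
{
    __atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);
}
#endif

/*
 * Runs `iters` increments, locked or not, and returns the elapsed CPU
 * time in clock() ticks.  The final counter value is stored in *out so
 * the caller can check that no increment was lost.
 */
clock_t timing_run(int locked, int iters, int *out)
{
    volatile int counter = 0;
    clock_t start, end;

    assert(iters >= 0 && out != 0);
    start = clock();
    for (int i = 0; i < iters; i++) {
        if (locked)
            timing_inc_locked(&counter);
        else
            timing_inc_plain(&counter);
    }
    end = clock();

    *out = counter;
    return end - start;
}
```

Calling timing_run(0, n, &v) and timing_run(1, n, &v) with a large n
and comparing the two return values gives a crude per-machine estimate
of what the prefix costs, which is also how much it could perturb the
timing of a nearby race.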

> This is as informative as I can make the message.
> 
> PS: *not* a single machine failure, tested on five machines, owned
> by four different people, two brands, with different use histories.

What do they have in common?

	-hpa