Date:	Tue, 3 Jun 2008 23:39:06 -0700 (PDT)
From:	Trent Piepho <tpiepho@...escale.com>
To:	Nick Piggin <nickpiggin@...oo.com.au>
cc:	Matthew Wilcox <matthew@....cx>,
	Russell King <rmk+lkml@....linux.org.uk>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	David Miller <davem@...emloft.net>, linux-arch@...r.kernel.org,
	scottwood@...escale.com, linuxppc-dev@...abs.org,
	alan@...rguk.ukuu.org.uk, linux-kernel@...r.kernel.org
Subject: Re: MMIO and gcc re-ordering issue

On Wed, 4 Jun 2008, Nick Piggin wrote:
> On Wednesday 04 June 2008 07:44, Trent Piepho wrote:
>> On Tue, 3 Jun 2008, Matthew Wilcox wrote:
>
>>> I don't understand why you keep talking about DMA.  Are you talking
>>> about ordering between readX() and DMA?  PCI provides those guarantees.
>>
>> I guess you haven't been reading the whole thread.  The reason it started
>> was because gcc can re-order powerpc (and everyone else's too) IO accesses
>> vs accesses to cachable memory (but not spin-locks), which ends up only
>> being a problem with coherent DMA.
>
> I don't think it is only a problem with coherent DMA.
>
> CPU0                         CPU1
> mutex_lock(mutex);
> writel(something, DATA_REG);
> writel(GO, CTRL_REG);
> started = 1;
 	(A)
> mutex_unlock(mutex);
>                             mutex_lock(mutex);
 	(B)
>                             if (started)
>                               /* oops, this can reach device before GO */
>                               writel(STOP, CTRL_REG);

The locks themselves should have (and do have) ordering operations to ensure
gcc and/or the cpu can't move a store or load outside the locked region.
Generally you need that to keep stores/loads to cacheable memory inside the
critical area, let alone I/O operations.  Otherwise all you have to do is
replace writel(something, ...) with shared_data->something = ...  and there's
an obvious problem.  In your example, gcc currently can and will move the GO
operation to point A (if it can figure out that CTRL_REG and started aren't
aliased), but that's not a problem.  If it could move it to B that would be a
problem, but it can't.
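
A minimal user-space sketch of the locking argument above, assuming POSIX
threads.  The "registers" are modelled as an ordinary in-memory log, so this
only illustrates that nothing escapes the locked region; it says nothing about
real MMIO semantics on any particular architecture.  fake_writel(), dev_log
and run_once() are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

enum { DATA_REG, CTRL_REG };   /* hypothetical register indices */
enum { GO = 1, STOP = 2 };     /* hypothetical control values */

struct dev_write { int reg; uint32_t val; };

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int started;                   /* ordinary cacheable memory */
static struct dev_write dev_log[4];   /* writes in the order the "device" saw them */
static int dev_log_len;

/* Stand-in for writel(); only ever called with the mutex held. */
static void fake_writel(uint32_t val, int reg)
{
    dev_log[dev_log_len].reg = reg;
    dev_log[dev_log_len].val = val;
    dev_log_len++;
}

static void *cpu0(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&mutex);
    fake_writel(0x1234, DATA_REG);
    fake_writel(GO, CTRL_REG);
    started = 1;
    /* point (A): moving GO down to here is harmless */
    pthread_mutex_unlock(&mutex);
    return NULL;
}

static void *cpu1(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&mutex);
    /* point (B): nothing from cpu0's critical section may land here */
    if (started)
        fake_writel(STOP, CTRL_REG);
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Returns 1 if STOP, when it was issued at all, came after GO. */
static int run_once(void)
{
    pthread_t t0, t1;
    int rc;

    started = 0;
    dev_log_len = 0;
    rc = pthread_create(&t0, NULL, cpu0, NULL);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, cpu1, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);

    if (dev_log_len == 2)   /* cpu1 ran first and saw started == 0 */
        return dev_log[1].reg == CTRL_REG && dev_log[1].val == GO;
    return dev_log_len == 3 &&
           dev_log[1].reg == CTRL_REG && dev_log[1].val == GO &&
           dev_log[2].reg == CTRL_REG && dev_log[2].val == STOP;
}
```

Calling run_once() repeatedly should never see STOP logged ahead of GO,
whichever thread wins the lock.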

Other than coherent DMA, I don't think there is any reason to care if I/O
accessors are strongly ordered wrt loads/stores to cacheable memory.  Locking
and streaming DMA sync operations already need to have ordering, so they don't
require all I/O to be ordered wrt all cacheable memory.
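
The coherent DMA case can be sketched in user space as well, assuming C11
atomics and POSIX threads.  A thread plays the device, an ordinary struct
plays the coherent DMA descriptor, and an atomic variable plays the GO
register.  All names here (dma_desc, doorbell, driver_submit, and so on) are
hypothetical; a kernel driver would use a write barrier or an I/O accessor
that implies one, not C11 atomics.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct dma_desc { uint32_t addr; uint32_t len; };   /* hypothetical layout */

static struct dma_desc desc;        /* stands in for coherent DMA memory */
static _Atomic uint32_t doorbell;   /* stands in for the device's GO register */
static struct dma_desc seen;        /* what the "device" fetched */

/* The "device": waits for the doorbell, then fetches the descriptor. */
static void *device(void *unused)
{
    (void)unused;
    while (atomic_load_explicit(&doorbell, memory_order_acquire) == 0)
        ;   /* spin until GO */
    seen = desc;
    return NULL;
}

/* The "driver": fill the descriptor, then ring the doorbell. */
static void driver_submit(uint32_t addr, uint32_t len)
{
    desc.addr = addr;
    desc.len = len;
    /*
     * Release ordering keeps the descriptor stores above from being
     * moved below the doorbell store, by the compiler or the cpu.
     * Without it the device could fetch a stale descriptor.
     */
    atomic_store_explicit(&doorbell, 1, memory_order_release);
}

/* Returns 1 if the device saw the descriptor that was submitted. */
static int submit_and_check(uint32_t addr, uint32_t len)
{
    pthread_t dev;
    int rc;

    atomic_store(&doorbell, 0);
    seen.addr = 0;
    seen.len = 0;
    rc = pthread_create(&dev, NULL, device, NULL);
    assert(rc == 0);
    (void)rc;
    driver_submit(addr, len);
    pthread_join(dev, NULL);
    return seen.addr == addr && seen.len == len;
}
```

This is the one place where ordering between a cacheable store and the I/O
write matters by itself, since no lock or streaming DMA sync call sits
between the two to provide it.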
