lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160905083334.GX10153@twins.programming.kicks-ass.net>
Date:   Mon, 5 Sep 2016 10:33:34 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Alan Stern <stern@...land.harvard.edu>
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Ingo Molnar <mingo@...hat.com>,
        Felipe Balbi <felipe.balbi@...ux.intel.com>,
        USB list <linux-usb@...r.kernel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Memory barrier needed with wake_up_process()?

On Sat, Sep 03, 2016 at 10:49:39AM -0400, Alan Stern wrote:
> On Sat, 3 Sep 2016, Alan Stern wrote:
> 
> > In other words, we have:
> > 
> > 	CPU 0				CPU 1
> > 	-----				-----
> > 	Start DMA			Handle DMA-complete irq
> > 	Sleep until bh->state		Set bh->state
> > 					smp_wmb()
> > 					Wake up CPU 0
> > 	smp_rmb()
> > 	Compute rc based on contents
> > 		of the DMA buffer
> > 
> > This was written many years ago, at a time when I did not fully
> > understand all the details of memory ordering.  Do you agree that both
> > of those barriers should really be smp_mb()?  That's what Felipe has
> > been testing.
> 
> Actually, seeing it written out like this, one realizes that it really 
> ought to be:
> 
> 	CPU 0				CPU 1
>         -----				-----
>         Start DMA			Handle DMA-complete irq
>         Sleep until bh->state		smp_mb()
> 					set bh->state
> 					Wake up CPU 0
> 	smp_mb()
> 	Compute rc based on contents of the DMA buffer
> 
> (Bear in mind also that on some platforms, the I/O operation is carried 
> out by PIO rather than DMA.)

I'm sorry, but I still don't follow. This could be because I seldom
interact with DMA agents and therefore am not familiar with that stuff.

Also, smp_mb() doesn't necessarily interact with MMIO/DMA at all IIRC.
Its only defined to do CPU/CPU interactions.

I would very much expect the device IO stuff to order things for us in
this case. "Start DMA" should very much include sufficient fences to
ensure the data under DMA is visible to the DMA engine, this would very
much include things like flushing store buffers and maybe even writeback
caches, depending on platform needs.

At the same time, I would expect "Handle DMA-complete irq", even if it
were done with a PIO polling loop, to guarantee ordering against later
operations such that 'complete' really means that.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ