lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.44L0.1609051105440.25234-100000@netrider.rowland.org>
Date:   Mon, 5 Sep 2016 11:29:26 -0400 (EDT)
From:   Alan Stern <stern@...land.harvard.edu>
To:     Peter Zijlstra <peterz@...radead.org>
cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Ingo Molnar <mingo@...hat.com>,
        Felipe Balbi <felipe.balbi@...ux.intel.com>,
        USB list <linux-usb@...r.kernel.org>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Memory barrier needed with wake_up_process()?

On Mon, 5 Sep 2016, Peter Zijlstra wrote:

> > Actually, seeing it written out like this, one realizes that it really 
> > ought to be:
> > 
> > 	CPU 0				CPU 1
> >         -----				-----
> >         Start DMA			Handle DMA-complete irq
> >         Sleep until bh->state		smp_mb()
> > 					set bh->state
> > 					Wake up CPU 0
> > 	smp_mb()
> > 	Compute rc based on contents of the DMA buffer
> > 
> > (Bear in mind also that on some platforms, the I/O operation is carried 
> > out by PIO rather than DMA.)
> 
> I'm sorry, but I still don't follow. This could be because I seldom
> interact with DMA agents and therefore am not familiar with that stuff.

I haven't seen the details of memory ordering requirements in the
presence of DMA spelled out anywhere.  The documents I have read merely
state that you have to careful to flush caches before doing DMA OUT and
to avoid filling caches before DMA IN is complete.  Neither of those is
an issue here, apparently.

> Also, smp_mb() doesn't necessarily interact with MMIO/DMA at all IIRC.
> Its only defined to do CPU/CPU interactions.

Suppose the DMA master finishes filling up an input buffer and issues a
completion irq to CPU1.  Presumably the data would then be visible to
CPU1 if the interrupt handler looked at it.  So if CPU1 executes smp_mb
before setting bh->state and waking up CPU0, and if CPU0 executes
smp_mb after testing bh->state and before reading the data buffer,
wouldn't CPU0 then see the correct data in the buffer?  Even if CPU0 
never did go to sleep?

Or would something more be needed?

> I would very much expect the device IO stuff to order things for us in
> this case. "Start DMA" should very much include sufficient fences to
> ensure the data under DMA is visible to the DMA engine, this would very
> much include things like flushing store buffers and maybe even writeback
> caches, depending on platform needs.
> 
> At the same time, I would expect "Handle DMA-complete irq", even if it
> were done with a PIO polling loop, to guarantee ordering against later
> operations such that 'complete' really means that.

That's what I would expect too.

Back in the original email thread where the problem was first reported, 
Felipe says that the problem appears to be something else.  Here's what 
it looks like now, in schematic form:

	CPU0
	----
	get_next_command():
	  while (bh->state != BUF_STATE_EMPTY)
		sleep_thread();
	  start an input request for bh
	  while (bh->state != BUF_STATE_FULL)
		sleep_thread();

As mentioned above, the input involves DMA and is terminated by an irq.
The request's completion handler is bulk_out_complete():

	CPU1
	----
	bulk_out_complete():
	  bh->state = BUF_STATE_FULL;
	  wakeup_thread();

According to Felipe, when CPU0 wakes up and checks bh->state, it sees a 
value different from BUF_STATE_FULL.  So it goes back to sleep again 
and doesn't make any forward progress.

It's possible that something else is changing bh->state when it 
shouldn't.  But if this were the explanation, why would Felipe see that 
the problem goes away when he changes the memory barriers in 
sleep_thread() and wakeup_thread() to smp_mb()?

Alan Stern

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ