Message-ID: <20131219181002.GA32508@gmail.com>
Date:	Thu, 19 Dec 2013 19:10:02 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"H. Peter Anvin" <hpa@...or.com>, Len Brown <lenb@...nel.org>,
	x86@...nel.org, linux-pm@...r.kernel.org,
	linux-kernel@...r.kernel.org, Len Brown <len.brown@...el.com>,
	stable@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Mike Galbraith <efault@....de>, Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH] x86 idle: repair large-server 50-watt idle-power
 regression


* Peter Zijlstra <peterz@...radead.org> wrote:

> On Thu, Dec 19, 2013 at 06:07:41PM +0100, Ingo Molnar wrote:
> > 
> > * H. Peter Anvin <hpa@...or.com> wrote:
> > 
> > > On 12/19/2013 08:21 AM, Peter Zijlstra wrote:
> > > > 
> > > > What's that mb for?
> > > > 
> > > 
> > > It already exists in mwait_idle_with_hints(); I just moved it into 
> > > this common function.  It is a bit odd, I have to admit; it seems 
> > > like it should be *before* the monitor (and possibly we should have 
> > > one after the CLFLUSH as well?)
> > 
> > Yes, I think we need a barrier before the CLFLUSH, because according 
> > to my reading of the Intel documentation CLFLUSH has no implicit 
> > ordering so it might get reordered with the store to ->flags in 
> > current_set_polling_and_test(), which might result in spurious wakeup 
> > problems again.
> 
> No it cannot; since current_set_polling_and_test() already has a 
> barrier to prevent that.

See below:

> Also, the location patched by hpa doesn't actually call that at all.
> 
> That said, I would find it very strange indeed if a CLFLUSH doesn't 
> also flush the store buffer.

So, the Intel documentation says (sorry about the lazy-link):

  http://www.jaist.ac.jp/iscenter-new/mpc/altix/altixdata/opt/intel/vtune/doc/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/instruct32_hh/vc31.htm

 "CLFLUSH is only ordered by the MFENCE instruction. It is not 
  guaranteed to be ordered by any other fencing, serializing or other 
  CLFLUSH instruction. For example, software can use an MFENCE 
  instruction to insure that previous stores are included in the 
  write-back."

So a specific MFENCE barrier is needed.

Also note that this wording excludes implicit serialization such as 
the LOCK prefix or XCHG barriers. As it happens, 
current_set_polling_and_test() uses smp_mb(), which maps to 
MFENCE on all CPUs that can do CLFLUSH, but that's really just an 
accident and in no way engineered.

_At minimum_ we need a prominent comment at the clflush usage site 
that we rely on the MFENCE in current_set_polling_and_test() ...
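
For illustration only, the store -> MFENCE -> CLFLUSH ordering argued 
for above can be sketched in user space roughly as follows. This is not 
the kernel code: publish_polling_then_flush() and TS_POLLING_SKETCH are 
hypothetical names, and MONITOR/MWAIT are left out because they are 
normally not usable from user space. It assumes the _mm_mfence() and 
_mm_clflush() intrinsics from <emmintrin.h>, and degrades to a plain 
store on builds without SSE2.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_mfence(), _mm_clflush() */
#endif

/* Hypothetical flag bit, for illustration only. */
#define TS_POLLING_SKETCH 0x4u

/*
 * Set the polling bit, then flush the cache line that holds it.
 * Returns the flags value after the store.
 */
unsigned int publish_polling_then_flush(volatile unsigned int *flags)
{
	assert(flags != NULL);

	/* The store that must not be reordered past the flush. */
	*flags |= TS_POLLING_SKETCH;

#if defined(__SSE2__)
	/*
	 * CLFLUSH is documented as being ordered only by MFENCE, so
	 * an explicit MFENCE sits between the store and the flush
	 * instead of relying on whatever a generic barrier maps to.
	 */
	_mm_mfence();
	_mm_clflush((const void *)flags);
#endif

	/* In the kernel, MONITOR/MWAIT would follow at this point. */
	return *flags;
}
```

A caller would pass the address of its flags word, e.g. 
publish_polling_then_flush(&flags), and read back the updated value. 
The point of the sketch is only the placement of the fence and the 
comment next to the CLFLUSH, not the exact flag handling.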

> > (And CLFLUSH is a store in a sense, so special in that the regular 
> > ordering for stores does not apply.)
> > 
> > Likewise, having a barrier before the MONITOR looks sensible as 
> > well. Having it _after_ monitor looks weird and is probably wrong. 
> > [It might have been the effects of someone seeing the spurious 
> > wakeup problems without realizing the true source, or so.]
> 
> I again have to disagree, one would expect monitor to flush all that 
> is required to start the monitor -- and it actually does so. As is 
> testified by this extra CLFLUSH being called a bug workaround.

This assumption would be safer - although AFAICS the Intel 
MONITOR/MWAIT documentation is quiet about this aspect.

Thanks,

	Ingo