Message-Id: <20181221180007.GQ4170@linux.ibm.com>
Date:   Fri, 21 Dec 2018 10:00:07 -0800
From:   "Paul E. McKenney" <paulmck@...ux.ibm.com>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Arnd Bergmann <arnd@...db.de>, Nicolas Pitre <nico@...aro.org>,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Will Deacon <will.deacon@....com>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>, hubicka@....cz
Subject: Re: [PATCH 0/7] ARM: hacks for link-time optimization

On Fri, Dec 21, 2018 at 09:20:44AM -0800, Andi Kleen wrote:
> > In particular turning an address-dependency into a control-dependency,
> > which is something allowed by the C language, since it doesn't recognise
> > these concepts as such.
> > 
> > The 'optimization' is allowed currently, but LTO will make it much more
> > likely since it will have a much wider view of things. Esp. when combined
> > with PGO.
> > 
> > Specifically; if you have something like:
> > 
> > int idx;
> > struct object objs[2];
> > 
> > the statement:
> > 
> >   val = objs[idx & 1].ponies;
> > 
> > which you 'need' to be translated like:
> > 
> >   struct object *obj = objs;
> >   obj += (idx & 1);
> >   val = obj->ponies;
> > 
> > Such that the load of obj->ponies depends on the load of idx. However
> > our dear compiler is allowed to make it:
> > 
> >   if (idx & 1)
> >     obj = &objs[1];
> >   else
> >     obj = &objs[0];
> > 
> >   val = obj->ponies;
> 
> I don't see why a compiler would do such an optimization. Clearly
> the second variant is worse than the first, bigger and needs
> branch prediction resources.
> 
> In fact compilers usually try hard to go into the other direction
> and apply if conversion.
> 
> Has anyone seen real world examples of such changes being done, or is this
> all language lawyering theory?

I have not seen it myself, but I have heard others claim to.  For example,
if "idx & 1" had to be computed for some other reason, especially if there
was a pre-existing "if" statement with this as its condition.  Or if you
have hardware that has a conditional-move instruction.  And so on.
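
To make the first case concrete, here is a minimal sketch, not taken from
any real kernel code.  The names read_once_int(), ponies_addr_dep() and
ponies_ctrl_dep(), the odd_seen counter, and the values stored in objs[]
are all hypothetical, and the volatile cast is only a stand-in for
READ_ONCE().  The second function is written by hand in the shape that a
compiler might plausibly produce when a branch on "idx & 1" already
exists; whether any given compiler actually does this is exactly the open
question.

```c
struct object {
	int ponies;
};

/* Hypothetical data, only so the example has something to load. */
static struct object objs[2] = { { .ponies = 10 }, { .ponies = 20 } };

/* Stand-in for READ_ONCE(): a single volatile load of *p. */
static int read_once_int(const int *p)
{
	return *(const volatile int *)p;
}

/*
 * Address dependency: the address of the ->ponies load is computed
 * from the value loaded from *idxp.
 */
static int ponies_addr_dep(const int *idxp)
{
	int idx = read_once_int(idxp);
	struct object *obj = objs;

	obj += (idx & 1);
	return obj->ponies;
}

/*
 * Control dependency: a branch on "idx & 1" exists for some other
 * reason (here, counting odd indexes), and the ->ponies load has been
 * folded into each arm with a constant address.  The load address no
 * longer depends on idx, only the choice of arm does.
 */
static int ponies_ctrl_dep(const int *idxp, unsigned int *odd_seen)
{
	int idx = read_once_int(idxp);

	if (idx & 1) {
		(*odd_seen)++;	/* the pre-existing reason to branch */
		return objs[1].ponies;
	}
	return objs[0].ponies;
}
```

Single-threaded, both functions return the same value for every idx,
which is why the C language does not distinguish them.  The difference
only matters on weakly ordered hardware, where a load-to-load control
dependency provides no ordering and the ->ponies load in the second form
can be speculated ahead of the idx load.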

Do you have a way to guarantee that it never happens?

							Thanx, Paul

> -Andi
> 
> > 
> > Because C doesn't recognise this as being different. However this is
> > utterly broken, because in this translation we can speculate the load
> > of obj->ponies such that it no longer depends on the load of idx, which
> > breaks RCU.
> > 
> > Note that further 'optimization' is possible and the compiler could even
> > make it:
> > 
> >   if (idx & 1)
> >     val = objs[1].ponies;
> >   else
> >     val = objs[0].ponies;
> > 
> > Now, granted, this is a fairly artificial example, but it does
> > illustrate the exact problem.
> > 
> > The more the compiler can see of the complete program, the more likely
> > it can make inferences like this, esp. when coupled with PGO.
> > 
> > Now, we're (usually) very careful to wrap things in READ_ONCE() and
> > rcu_dereference() and the like, which makes it harder on the compiler
> > (because 'volatile' is special), but nothing really stops it from doing
> > this.
> > 
> > Paul has been trying to beat clue into the language people, but given
> > he's been at it for 10 years now, and there's no resolution, I figure we
> > ought to get compiler implementations to give us a knob.
> 
