Message-ID: <20160624191734.GE30154@twins.programming.kicks-ass.net>
Date:	Fri, 24 Jun 2016 21:17:34 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
Cc:	Davidlohr Bueso <dave@...olabs.net>, mingo@...nel.org,
	davem@...emloft.net, cw00.choi@...sung.com,
	dougthompson@...ssion.com, bp@...en8.de, mchehab@....samsung.com,
	gregkh@...uxfoundation.org, pfg@....com, jikos@...nel.org,
	hans.verkuil@...co.com, awalls@...metrocast.net,
	dledford@...hat.com, sean.hefty@...el.com, kys@...rosoft.com,
	heiko.carstens@...ibm.com, sumit.semwal@...aro.org,
	schwidefsky@...ibm.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH -tip 00/12] locking/atomics: Add and use inc,dec calls
 for FETCH-OP flavors

On Fri, Jun 24, 2016 at 09:46:05AM -0700, James Bottomley wrote:
> On Mon, 2016-06-20 at 13:05 -0700, Davidlohr Bueso wrote:
> > Hi,
> > 
> > The series is really straightforward and based on Peter's work that
> > introduces[1] the atomic_fetch_$op machinery. Only patch 1 implements
> > the actual atomic_fetch_{inc,dec} calls based on 
> > atomic_fetch_{add,sub}.
> 
> Could I just ask why?  atomic_inc_return(x) - 1 seems a reasonable
> thing to do to me.  Is it because on architectures where atomics are
> implemented in asm, it costs us one more CPU instruction to do the
> extra decrement which gcc can't optimise?   If that's it, I'm not sure
> the added complexity justifies the cycle savings.

That boat has sailed, fetch_$op is implemented (in asm mostly) for _all_
architectures already.

All Davidlohr does here is add fetch_{inc,dec}(v) -> fetch_{add,sub}(1,
v) macros because he's lazy.
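
For illustration only, a minimal userspace sketch of that wrapper relationship, built on C11 <stdatomic.h> rather than the kernel's atomic_t. Every sketch_* name is hypothetical and this is not the code from the series; it only shows the inc/dec flavors reducing to add/sub with a constant operand of 1.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } sketch_atomic_t;

/* fetch_add: add i to *v and return the value seen before the add. */
static inline int sketch_fetch_add(int i, sketch_atomic_t *v)
{
	return atomic_fetch_add(&v->counter, i);
}

/* fetch_sub: subtract i from *v and return the value seen before it. */
static inline int sketch_fetch_sub(int i, sketch_atomic_t *v)
{
	return atomic_fetch_sub(&v->counter, i);
}

/* The inc/dec flavors are thin wrappers with a constant operand of 1. */
#define sketch_fetch_inc(v) sketch_fetch_add(1, (v))
#define sketch_fetch_dec(v) sketch_fetch_sub(1, (v))
```

The argument order (operand first, pointer second) mirrors the fetch_{add,sub}(1, v) form quoted above; C11 itself takes the pointer first.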

In any case, fetch_$op is the natural form of atomics that return a
value; Linux has historically chosen the 'wrong' form. The fetch_$op
form (test-and-modify, load-store, whatever) is what hardware typically
does natively and is what works for irreversible operations.
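
To make the irreversible case concrete, here is a hedged sketch using C11 atomic_fetch_or. The helper name sketch_test_and_set_bit is hypothetical. Once the OR has happened, the new value alone cannot tell you whether the bit was already set, so only the fetch form lets a caller learn that it was the one to set it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Set one bit in *flags and report whether this caller was the one
 * that set it. The decision needs the OLD value: after the OR the bit
 * is set in every case, so the new value carries no such information.
 */
static inline bool sketch_test_and_set_bit(atomic_uint *flags, unsigned int bit)
{
	unsigned int mask = 1u << bit;
	unsigned int old = atomic_fetch_or(flags, mask);

	return (old & mask) == 0;
}
```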

Sure, for reversible operations (add/sub) what you say can be (and is)
done, and then we hope the compiler knows that x-x == 0 (and it
typically does). As you say, that's slightly sub-optimal for archs where
the compiler cannot see into the atomic (typically LL/SC archs).
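
A rough sketch of that emulation, assuming the GCC/Clang __atomic_add_fetch builtin as a stand-in for an op_return-style primitive. The sketch_* names are hypothetical, and unsigned arithmetic is used so the wrap-around is well defined in plain C.

```c
/*
 * "op_return" flavor: add i to *v and return the NEW value.
 * __atomic_add_fetch is a GCC/Clang builtin, used here as a stand-in.
 */
static inline unsigned int sketch_add_return(unsigned int i, unsigned int *v)
{
	return __atomic_add_fetch(v, i, __ATOMIC_SEQ_CST);
}

/*
 * fetch_add emulated on top of add_return: undo the add on the returned
 * copy to recover the old value. This only works because add is
 * reversible, and it costs an extra subtract wherever the compiler
 * cannot fold it away.
 */
static inline unsigned int sketch_fetch_add_via_return(unsigned int i,
						       unsigned int *v)
{
	return sketch_add_return(i, v) - i;
}
```

No such trick exists for or/and/xor, which is why those need a native fetch form.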

But add/sub were _2_ lines extra after I did all the groundwork for
fetch_{or,and,xor}. So we might as well save those few extra add/dec
cycles. Some of them are in fairly hot paths.

Lastly, and this is the weakest argument: fetch_$op is what C11 has,
probably for the reasons above.
