netdev - Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures

Message-ID: <alpine.LFD.0.999.0708241001560.25853@woody.linux-foundation.org>
Date:	Fri, 24 Aug 2007 10:19:50 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Denys Vlasenko <vda.linux@...glemail.com>
cc:	Satyam Sharma <satyam@...radead.org>,
	Christoph Lameter <clameter@....com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Paul Mackerras <paulus@...ba.org>,
	Segher Boessenkool <segher@...nel.crashing.org>,
	heiko.carstens@...ibm.com, horms@...ge.net.au,
	linux-kernel@...r.kernel.org, rpjday@...dspring.com, ak@...e.de,
	netdev@...r.kernel.org, cfriesen@...tel.com,
	akpm@...ux-foundation.org, jesper.juhl@...il.com,
	linux-arch@...r.kernel.org, zlynx@....org, schwidefsky@...ibm.com,
	Chris Snook <csnook@...hat.com>, davem@...emloft.net,
	wensong@...ux-vs.org, wjiang@...ilience.com
Subject: Re: [PATCH 0/24] make atomic_read() behave consistently across all
 architectures

On Fri, 24 Aug 2007, Denys Vlasenko wrote:
>
> > No, you don't use "x.counter++". But you *do* use
> >
> > 	if (atomic_read(&x) <= 1)
> >
> > and loading into a register is stupid and pointless, when you could just
> > do it as a regular memory-operand to the cmp instruction.
> 
> It doesn't mean that (volatile int*) cast is bad, it means that current gcc
> is bad (or "not good enough"). IOW: instead of avoiding volatile cast,
> it's better to fix the compiler.

I would agree that fixing the compiler in this case would be a good thing, 
even quite regardless of any "atomic_read()" discussion.
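
[ Illustration: a minimal user-space sketch of the pattern under 
  discussion, not the kernel's actual definition. The names my_atomic_t, 
  my_atomic_read() and is_last_ref() are hypothetical. The volatile cast 
  forces one real load per call; the complaint quoted above is that gcc 
  then tends to load the value into a register first instead of using it 
  directly as a memory operand of the compare. ]

```c
/* Hypothetical stand-in for the kernel's atomic_t; illustrative only. */
typedef struct {
	int counter;
} my_atomic_t;

/*
 * Read the counter through a volatile-qualified pointer, so the
 * compiler has to perform an actual memory access on every call
 * and may not reuse a previously loaded value.
 */
static inline int my_atomic_read(const my_atomic_t *v)
{
	return *(const volatile int *)&v->counter;
}

/* The kind of test quoted above: if (atomic_read(&x) <= 1) ... */
static int is_last_ref(const my_atomic_t *x)
{
	return my_atomic_read(x) <= 1;
}
```

Compiling this with optimization and inspecting the assembly for 
is_last_ref() is one way to see whether a given compiler version folds 
the load into the compare or not.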

I just have a strong suspicion that "volatile" performance is so low down 
the list of any C compiler person's interest, that it's never going to 
happen. And quite frankly, I cannot blame the gcc guys for it.

That's especially as "volatile" really isn't a very good feature of the C 
language, and is likely to get *less* interesting rather than more (as 
user space starts to be more and more threaded, "volatile" gets less and 
less useful).

[ Ie, currently, I think you can validly use "volatile" in a "sig_atomic_t" 
  kind of way, where there is a single thread, but with asynchronous 
  events. In that kind of situation, I think it's probably useful. But 
  once you get multiple threads, it gets pointless.

  Sure: you could use "volatile" together with something like Dekker's or 
  Peterson's algorithm that doesn't depend on cache coherency (that's 
  basically what the C "volatile" keyword approximates: not atomic 
  accesses, but *uncached* accesses!) But let's face it, that's way past 
  insane. ]
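
[ Illustration: a minimal sketch of the single-thread, asynchronous-event 
  case described above, using only standard C <signal.h>. The names 
  got_signal, on_signal() and demo_signal_flag() are hypothetical. The 
  flag is written by a signal handler and read by the interrupted code in 
  the same thread, which is the situation "volatile sig_atomic_t" was 
  meant for. It says nothing about multiple threads. ]

```c
#include <signal.h>

/* Flag shared between the handler and the code it interrupts. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: only sets the flag, which is safe for a sig_atomic_t. */
static void on_signal(int signo)
{
	(void)signo;
	got_signal = 1;
}

/*
 * Install the handler, deliver SIGINT to ourselves, and report
 * whether the flag was observed. Returns 1 if the handler ran,
 * 0 if it did not, -1 if the handler could not be installed.
 */
static int demo_signal_flag(void)
{
	int seen;

	got_signal = 0;
	if (signal(SIGINT, on_signal) == SIG_ERR)
		return -1;

	/* raise() does not return until the handler has returned. */
	raise(SIGINT);
	seen = got_signal;

	/* Put the default disposition back. */
	signal(SIGINT, SIG_DFL);
	return seen;
}
```

Without "volatile", the compiler would be entitled to assume that 
got_signal cannot change between the store of 0 and the later read.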

So I wouldn't expect "volatile" to ever really generate better code. It 
might happen as a side effect of other improvements (eg, I might hope that 
the SSA work would eventually lead to gcc having a much better defined 
model of valid optimizations, and maybe better code generation for 
volatile accesses falls out cleanly from that), but in the end, it's such 
an ugly special case in C, and so seldom used, that I wouldn't depend on 
it.

> Linus, in all honesty gcc has many more cases of suboptimal code,
> case of "volatile" is just one of many.

Well, the thing is, quite often, many of those "suboptimal code" 
generations fall into two distinct classes:

 - complex C code. I can't really blame the compiler too much for this. 
   Some things are *hard* to optimize, and for various scalability 
   reasons, you often end up having limits in the compiler where it 
   doesn't even _try_ doing certain optimizations if you have excessive 
   complexity.

 - bad register allocation. Register allocation really is hard, and 
   sometimes gcc just does the "obviously wrong" thing, and you end up 
   having totally unnecessary spills.

> Off the top of my head:

Yes, "unsigned long long" with x86 has always generated atrocious code. In 
fact, I would say that historically it was really *really* bad. These 
days, gcc actually does a pretty good job, but I'm not surprised that it's 
still quite possible to find cases where it did some optimization (in this 
case, apparently noticing that "shift by >= 32 bits" causes the low 
register to be pointless) and then missed *another* optimization (better 
register use) because that optimization had been done *before* the first 
optimization was done.
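
[ Illustration: the quoted example is not reproduced in this message, so 
  the following is a hypothetical stand-in for the kind of case meant, 
  with made-up function names. On a 32-bit target a 64-bit value lives in 
  two registers; for a constant right shift by 32 bits or more, the low 
  half cannot affect the result, so only the high half is needed. ]

```c
#include <stdint.h>

/* 64-bit right shift by a constant >= 32, as written in source. */
static uint64_t shift_right_40(uint64_t x)
{
	return x >> 40;
}

/*
 * The same computation spelled out on two 32-bit halves, the way a
 * 32-bit target has to do it. The low word is never used: the
 * result is just the high word shifted by (40 - 32) = 8 bits.
 */
static uint64_t shift_right_40_by_halves(uint32_t lo, uint32_t hi)
{
	(void)lo;	/* contributes nothing for shifts >= 32 */
	return (uint64_t)(hi >> 8);
}
```

A compiler that notices the low register is dead, but has already 
allocated registers as if both halves were live, ends up with the kind 
of missed follow-on optimization described above.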

That's a *classic* example of compiler code generation issues, and quite 
frankly, I think that's very different from the issue of "volatile".

Quite frankly, I'd like there to be more competition in the open source 
compiler game, and that might cause some upheavals, but on the whole, gcc 
actually does a pretty damn good job. 

			Linus