Message-ID: <20091110211259.GD26633@1wt.eu>
Date:	Tue, 10 Nov 2009 22:12:59 +0100
From:	Willy Tarreau <w@....eu>
To:	Pavel Machek <pavel@....cz>
Cc:	"H. Peter Anvin" <hpa@...or.com>, Avi Kivity <avi@...hat.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Matteo Croce <technoboy85@...il.com>,
	Sven-Haegar Koch <haegar@...net.de>,
	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org
Subject: Re: i686 quirk for AMD Geode

On Tue, Nov 10, 2009 at 09:54:45PM +0100, Pavel Machek wrote:
> Hi!
> 
> > Indeed, but there is a difference between [cmpxchg, bswap, cmov, nopl]
> > on one side and [sse*] on the other : distros are built assuming the
> > former are always available while they are not always. And the
> > distro
> 
> Well, fix the distros... 

You know as well as I do that pointing the finger at distros is as
easy as it is useless: people running low-end hardware want something
that works, and people running high-end hardware want something that
runs fast. To satisfy everyone, you would have to build with
optimizations for every CPU around, which does not make sense. Simply
count the number of CPU variants in the kernel, and imagine that many
CDs/DVDs for a single-platform distro.

However, targeting the lowest common denominator of high-end machines
(basically i686) and letting the lower-end systems experience a tiny
slowdown is not stupid at all, since performance is not what matters
most there. The higher-end systems will simply be able to run
CPU-specific optimizations per program, as they already do right now.

(...)
> > CMOV/NOPL are rarely used, thus have no reason to cause a massive
> > performance drop, but are frequent enough (at least cmov) for almost
> 
> *One* CMOV in the inner loop will make your performance go down 20x.

Yes, just like with an emulated FPU or trapped unaligned accesses.
It's like flying fish: they exist, but they aren't the most common
ones. If people hit such a case with a specific program and it is a
problem, they just have to recompile that program. At least they
don't have to rebuild the whole distro. And once again, I've been
using cmpxchg/bswap emulation for years on my i386 without feeling
any need for a rebuild, and CMOV emulation for years now on my
mini-ITX C3 without any problem either. These are real experiences,
not just fears of imaginary problems. Yes, I can design a program to
run 400 times slower on these machines if I want. I just don't feel
the need to do so, and apparently the authors of existing programs
didn't either.

Regards,
Willy

