Message-ID: <AANLkTinTvZZydNgg7CiD-wMjHXhjnK_-m_wdY5O9paW4@mail.gmail.com>
Date:	Sun, 6 Mar 2011 10:01:35 -0800
From:	Hugh Dickins <hughd@...gle.com>
To:	Michał Nazarewicz <mina86@...a86.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"Douglas W. Jones" <jones@...uiowa.edu>,
	linux-kernel@...r.kernel.org,
	Denis Vlasenko <vda.linux@...glemail.com>
Subject: Re: [PATCH mmotm] fix broken bootup on 32-bit

On Sat, Mar 5, 2011 at 2:09 PM, Michał Nazarewicz <mina86@...a86.com> wrote:
> On Mar 5, 2011 8:49 PM, "Hugh Dickins" <hughd@...gle.com> wrote:
>> I realize that zeroes are handled, but I was imagining that one branch
>> taken (for numbers up to 9999) is cheaper than four out-of-line function
>> calls, six divisions-or-modulos by constant 10000, three multiplications
>> by constants; oh, and a lot more once I look inside put_dec_full4().
>>
>> Is that not the case?  Isn't performance the justification for this magic?
>
> It turns out that difference in speed is minimal and inconclusive, as the
> version without cascading ifs seems to perform better on ARM. So because my
> benchmarks didn't show a clear winner, we can go with a shorter version.

At first I was surprised by that, but now I suspect that it points to a
severe flaw in your benchmarking.

Am I right to think that you are measuring the performance of the
algorithms on random unsigned long longs?  Which are very unlikely to
have all the upper 16 bits unset?  Let alone the upper 32 bits or the
upper 48 bits all unset?

Whereas, what would be the distribution of numbers that the kernel is
typically called upon to vsprintf?  I put it to you that numbers with
the upper 48 bits all unset would predominate, followed by those with
just the upper 32 bits unset.
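
A minimal sketch of the sampling concern above, not taken from the thread or
the patch: it counts how often uniformly random 64-bit values have their
upper bits clear. The names upper_bits_clear(), xorshift64() and
count_upper_clear() are hypothetical, and the xorshift64 generator is only a
stand-in for whatever random source a benchmark might use.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Return 1 if the top `bits` bits of v are all zero, else 0.
 * `bits` must be in the range 1..64.
 */
int upper_bits_clear(uint64_t v, unsigned bits)
{
	return (v >> (64 - bits)) == 0;
}

/* Small xorshift64 pseudo-random generator; the state must be non-zero. */
uint64_t xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/*
 * Draw `samples` pseudo-random 64-bit values and count how many have
 * their top `bits` bits clear.
 */
size_t count_upper_clear(size_t samples, unsigned bits, uint64_t seed)
{
	uint64_t state = seed ? seed : 1;	/* avoid the all-zero state */
	size_t i, hits = 0;

	for (i = 0; i < samples; i++)
		hits += upper_bits_clear(xorshift64(&state), bits);
	return hits;
}
```

For uniformly random input the expected fraction with the top k bits clear
is 2^-k: roughly 1 in 65,536 for 16 bits and 1 in 4.3 billion for 32 bits.
A benchmark fed such input would therefore almost never exercise a
small-number path.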

I'm sure there are u64s and s64s and unsigned long longs and long
longs to be found and printed, but the mm statistics I just looked up
appear to be merely unsigned longs, just 32 bits on 32-bit; and even
in the 64-bit case, I'd still expect that lower numbers would
generally predominate.

I suspect that, without more branching than you have at present, your
new algorithms actually slow down the kernel.
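
To illustrate the kind of extra branching meant here, the following is a
hedged sketch only. It is not the kernel's put_dec() code and not the
algorithm from the patch; format_u64_sketch() and its helpers are
hypothetical names, and plain divide-by-10 loops stand in for the real
digit-generation routines. The only point is the single test that routes
values with the upper 32 bits clear to cheaper 32-bit arithmetic.

```c
#include <stdint.h>
#include <stddef.h>

/* Emit the digits of a 32-bit value, least significant first. */
size_t put_dec32_rev(char *buf, uint32_t n)
{
	size_t len = 0;

	do {
		buf[len++] = (char)('0' + n % 10);
		n /= 10;
	} while (n);
	return len;
}

/* Emit the digits of a 64-bit value, least significant first. */
size_t put_dec64_rev(char *buf, uint64_t n)
{
	size_t len = 0;

	do {
		buf[len++] = (char)('0' + n % 10);
		n /= 10;
	} while (n);
	return len;
}

/*
 * Format n as a NUL-terminated decimal string in buf, which must hold
 * at least 21 bytes.  Returns the number of digits written.
 */
size_t format_u64_sketch(char *buf, uint64_t n)
{
	size_t len, i;

	if ((n >> 32) == 0)
		/* Fast path: upper 32 bits clear, 32-bit arithmetic only. */
		len = put_dec32_rev(buf, (uint32_t)n);
	else
		/* Slow path: full 64-bit division, costly on 32-bit CPUs. */
		len = put_dec64_rev(buf, n);

	/* Digits were produced in reverse order; swap them into place. */
	for (i = 0; i < len / 2; i++) {
		char tmp = buf[i];

		buf[i] = buf[len - 1 - i];
		buf[len - 1 - i] = tmp;
	}
	buf[len] = '\0';
	return len;
}
```

Whether such a branch pays off depends on the distribution of values
actually printed, which is why a benchmark would need to include small
numbers as well as random 64-bit ones.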

Hugh