linux-kernel - Re: [PATCH mmotm] fix broken bootup on 32-bit


Message-ID: <AANLkTinTvZZydNgg7CiD-wMjHXhjnK_-m_wdY5O9paW4@mail.gmail.com>
Date:	Sun, 6 Mar 2011 10:01:35 -0800
From:	Hugh Dickins <hughd@...gle.com>
To:	Michał Nazarewicz <mina86@...a86.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	"Douglas W. Jones" <jones@...uiowa.edu>,
	linux-kernel@...r.kernel.org,
	Denis Vlasenko <vda.linux@...glemail.com>
Subject: Re: [PATCH mmotm] fix broken bootup on 32-bit

On Sat, Mar 5, 2011 at 2:09 PM, Michał Nazarewicz <mina86@...a86.com> wrote:
> On Mar 5, 2011 8:49 PM, "Hugh Dickins" <hughd@...gle.com> wrote:
>> I realize that zeroes are handled, but I was imagining that one branch
>> taken (for numbers up to 9999) is cheaper than four out-of-line function
>> calls, six divisions-or-modulos by constant 10000, three multiplications
>> by constants; oh, and a lot more once I look inside put_dec_full4().
>>
>> Is that not the case?  Isn't performance the justification for this magic?
>
> It turns out that difference in speed is minimal and inconclusive, as the
> version without cascading ifs seems to perform better on ARM. So because my
> benchmarks didn't show a clear winner, we can go with a shorter version.

At first I was surprised by that, but now I suspect a severe flaw in
your benchmarking.

Am I right to think that you are measuring the performance of the
algorithms on random unsigned long longs?  Which are very unlikely to
have all the upper 16 bits unset?  Let alone the upper 32 bits or the
upper 48 bits all unset?
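
To put numbers on that: for uniformly random 64-bit values, the fraction
with the upper 16 bits all unset is 2^-16 (about 1 in 65536), with the
upper 32 bits unset 2^-32, and with the upper 48 bits unset 2^-48. Below
is a rough sketch for checking how often a generator produces such
values. The helper names and the xorshift64 generator are my own
assumptions for illustration, not taken from the benchmark under
discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if the top nbits bits of v are all zero, else 0. */
static int upper_bits_clear(uint64_t v, unsigned nbits)
{
	assert(nbits <= 64);
	if (nbits == 0)
		return 1;	/* avoid an undefined shift by 64 */
	return (v >> (64 - nbits)) == 0;
}

/* One xorshift64 step: a stand-in PRNG; the state must be nonzero. */
static uint64_t xorshift64(uint64_t *state)
{
	uint64_t x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	return *state = x;
}

/* Count how many of `samples` pseudo-random values have the top nbits clear. */
static unsigned long count_upper_clear(uint64_t seed, unsigned long samples,
				       unsigned nbits)
{
	unsigned long hits = 0;

	while (samples--)
		hits += upper_bits_clear(xorshift64(&seed), nbits);
	return hits;
}
```

Calling count_upper_clear() with nbits of 16, 32 and 48 over a few
million samples should show how rarely a random-input benchmark ever
exercises the small-number case.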

Whereas, what would be the distribution of numbers that the kernel is
typically called upon to vsprintf?  I put it to you that numbers with
the upper 48 bits all unset would predominate, followed by those with
just the upper 32 bits unset.

I'm sure there are u64s and s64s and unsigned long longs and long
longs to be found and printed, but the mm statistics I just looked up
appear to be merely unsigned longs, just 32 bits on 32-bit; and even
in the 64-bit case, I'd still expect that lower numbers would
generally predominate.

I suspect that, without more branching than you have at present, your
new algorithms actually slow down the kernel.
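
To illustrate the kind of branching I mean, here is a minimal sketch,
not the mmotm code: fmt_u64_sketch() and its helpers are hypothetical
names, and the plain divide-by-10 loops stand in for whatever put_dec
variants would really be used. The point is only that one test of the
upper 32 bits lets the common small values stay on 32-bit arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emit the decimal digits of a 32-bit value, least significant first. */
static size_t rev_digits32(char *buf, uint32_t n)
{
	size_t len = 0;

	do {
		buf[len++] = (char)('0' + n % 10);
		n /= 10;
	} while (n);
	return len;
}

/* Same, but with 64-bit division, which is costly on 32-bit machines. */
static size_t rev_digits64(char *buf, uint64_t n)
{
	size_t len = 0;

	do {
		buf[len++] = (char)('0' + n % 10);
		n /= 10;
	} while (n);
	return len;
}

/*
 * Write n in decimal to out (at least 21 bytes) and return the length.
 * A single branch keeps values that fit in 32 bits off the 64-bit path.
 */
static size_t fmt_u64_sketch(char *out, uint64_t n)
{
	char tmp[20];	/* UINT64_MAX has 20 decimal digits */
	size_t len, i;

	assert(out != NULL);
	if ((n >> 32) == 0)
		len = rev_digits32(tmp, (uint32_t)n);
	else
		len = rev_digits64(tmp, n);

	/* The helpers produce reversed digits; put them in reading order. */
	for (i = 0; i < len; i++)
		out[i] = tmp[len - 1 - i];
	out[len] = '\0';
	return len;
}
```

Further tests (for instance n <= 9999) could be cascaded in front of the
32-bit case in the same way; whether each extra branch pays for itself
is exactly what a benchmark on realistic inputs would need to show.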

Hugh