linux-kernel - Re: VolanoMark regression with 2.6.27-rc1

Message-Id: <200808211611.17889.nickpiggin@yahoo.com.au>
Date:	Thu, 21 Aug 2008 16:11:17 +1000
From:	Nick Piggin <nickpiggin@...oo.com.au>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Ray Lee <ray-lk@...rabbit.org>, adobriyan@...il.com,
	Ingo Molnar <mingo@...e.hu>,
	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Aneesh Kumar KV <aneesh.kumar@...ux.vnet.ibm.com>,
	Balbir Singh <balbir@...ibm.com>,
	Chris Friesen <cfriesen@...tel.com>
Subject: Re: VolanoMark regression with 2.6.27-rc1

On Thursday 21 August 2008 06:56, Peter Zijlstra wrote:
> On Wed, 2008-08-20 at 22:30 +0200, Peter Zijlstra wrote:

> > works for the above example, but when I make it long long, so as to
> > match the longest supported type, it goes boom again - for as of yet
> > unknown reasons.
>
> Ok, people pointed out I got my promotion rules mixed up, I casted the
> result of the division to signed, instead of ending up with a signed
> division.
>
> #define avg(x, y) ({                            \
>         typeof(x) _avg1 = (x);                  \
>         typeof(y) _avg2 = (y);                  \
>         (void) (&_avg1 == &_avg2);              \
>         (typeof(x))(_avg1 + ((long long)_avg2 - _avg1)/2); })
>
> seems to work.

Right, I guess that will work, but unfortunately the code gen on 32-bit
is a monstrosity. If you're going to cast to 64-bit anyway, we might as
well just do the normal add rather than playing games to avoid
overflow.

Secondly, this is operating on the fixed-point scaled load numbers, so in
the case of the scheduler I wouldn't worry too much about rounding. Also,
in most integer operations, rounding down is less surprising than rounding
up, as the last code did.
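
As a rough illustration of the rounding point, the sketch below contrasts a
difference-based average with a plain widened sum. The helper names avg_diff
and avg_sum are hypothetical, and fixed-width types are an assumption made
here so that the subtraction is signed 64-bit on every target. Since C
division truncates toward zero, the difference form rounds toward its first
argument, so its result depends on argument order, while the widened sum
always rounds down.

```c
#include <stdint.h>

/* Hypothetical helper: difference form, similar in spirit to the quoted
 * avg() macro. (int64_t)b - a is a signed 64-bit subtraction, and the
 * signed division by 2 truncates toward zero. */
static inline uint32_t avg_diff(uint32_t a, uint32_t b)
{
	return (uint32_t)(a + ((int64_t)b - a) / 2);
}

/* Hypothetical helper: widen, add, divide. Unsigned division always
 * rounds down, regardless of argument order. */
static inline uint32_t avg_sum(uint32_t a, uint32_t b)
{
	return (uint32_t)(((uint64_t)a + b) / 2);
}
```

With these definitions, avg_diff(1, 4) is 2 but avg_diff(4, 1) is 3, whereas
avg_sum gives 2 for both orderings.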

I still don't know whether it is appropriate to put it into kernel.h
(because of rounding, and variability when it comes to what type size will
hold the sum of parameters), but for the scheduler, I would use this:

	((unsigned long long)a + b) / 2;

Which gives this on 32-bit:
div:
        movl    %edx, %ecx
        xorl    %edx, %edx
        pushl   %ebx
        xorl    %ebx, %ebx
        addl    %ecx, %eax
        adcl    %ebx, %edx
        popl    %ebx
        shrdl   $1, %edx, %eax
        shrl    %edx
        ret

Rather than this:
div:
        subl    $8, %esp
        xorl    %ecx, %ecx
        movl    %ebx, (%esp)
        movl    %edx, %ebx
        movl    %esi, 4(%esp)
        xorl    %esi, %esi
        subl    %eax, %ebx
        sbbl    %ecx, %esi
        movl    %esi, %ecx
        movl    %esi, %edx
        sarl    $31, %ecx
        movl    %ecx, %edx
        xorl    %ecx, %ecx
        shrl    $31, %edx
        addl    %ebx, %edx
        movl    (%esp), %ebx
        adcl    %esi, %ecx
        movl    4(%esp), %esi
        addl    $8, %esp
        shrdl   $1, %ecx, %edx
        addl    %edx, %eax
        sarl    %ecx
        ret

And it's also slightly better on 64-bit.
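
A minimal sketch of the type-size concern mentioned above, with hypothetical
helper names and fixed-width types assumed for determinism: the widened add
is exact when the operands are 32-bit, but the same expression applied to
operands that are already 64-bit has no wider type to widen into, so the sum
can wrap. That is one reason a generic helper is harder to get right than a
scheduler-local expression.

```c
#include <stdint.h>

/* Hypothetical helper: 32-bit operands. The sum fits in 64 bits, so the
 * average is exact (rounded down). */
static inline uint32_t avg_u32(uint32_t a, uint32_t b)
{
	return (uint32_t)(((uint64_t)a + b) / 2);
}

/* Hypothetical helper: the same expression with 64-bit operands. The
 * cast is a no-op here, so a + b wraps modulo 2^64 for large inputs. */
static inline uint64_t avg_u64_same_expr(uint64_t a, uint64_t b)
{
	return ((uint64_t)a + b) / 2;
}
```

For example, avg_u32(0xFFFFFFFF, 0xFFFFFFFF) returns 0xFFFFFFFF as expected,
while avg_u64_same_expr(UINT64_MAX, UINT64_MAX) returns UINT64_MAX / 2
because the sum wrapped before the division.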