Message-Id: <200808211611.17889.nickpiggin@yahoo.com.au>
Date: Thu, 21 Aug 2008 16:11:17 +1000
From: Nick Piggin <nickpiggin@...oo.com.au>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ray Lee <ray-lk@...rabbit.org>, adobriyan@...il.com,
Ingo Molnar <mingo@...e.hu>,
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
Dhaval Giani <dhaval@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
Aneesh Kumar KV <aneesh.kumar@...ux.vnet.ibm.com>,
Balbir Singh <balbir@...ibm.com>,
Chris Friesen <cfriesen@...tel.com>
Subject: Re: VolanoMark regression with 2.6.27-rc1
On Thursday 21 August 2008 06:56, Peter Zijlstra wrote:
> On Wed, 2008-08-20 at 22:30 +0200, Peter Zijlstra wrote:
> > works for the above example, but when I make it long long, so as to
> > match the longest supported type, it goes boom again - for as of yet
> > unknown reasons.
>
> Ok, people pointed out I got my promotion rules mixed up, I casted the
> result of the division to signed, instead of ending up with a signed
> division.
>
> #define avg(x, y) ({ \
> typeof(x) _avg1 = (x); \
> typeof(y) _avg2 = (y); \
> (void) (&_avg1 == &_avg2); \
> (typeof(x))(_avg1 + ((long long)_avg2 - _avg1)/2); })
>
> seems to work.
Right, I guess that will work, but unfortunately the code gen on 32-bit
is a monstrosity. If you're going to cast to 64-bit anyway, we might as
well then just do the normal add rather than playing the game to avoid
overflow.

Secondly, this is operating on the fixed-point scaled load numbers, so in
the case of the scheduler I wouldn't worry too much about rounding... also
in most integer operations, rounding down is less surprising than rounding
up like the last code did.
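To make the rounding point concrete, here is a small sketch comparing the two forms. The names avg_diff, avg_wide and avg_rounding_check are made up for illustration, and unsigned int stands in for a 32-bit unsigned long. avg_diff is the quoted macro written out as a function; it shows one way the difference form can round up, which may or may not be the exact case meant above.

```c
#include <assert.h>

/* Hypothetical helper: the quoted avg() macro written out for unsigned int.
 * The subtraction and division happen in signed 64-bit, and C's signed
 * division truncates toward zero, so the result is pulled toward x. */
static unsigned int avg_diff(unsigned int x, unsigned int y)
{
	return (unsigned int)(x + ((long long)y - x) / 2);
}

/* Hypothetical helper: widen, add, divide. Unsigned division rounds
 * down, whichever order the arguments come in. */
static unsigned int avg_wide(unsigned int a, unsigned int b)
{
	return (unsigned int)(((unsigned long long)a + b) / 2);
}

/* The exact mean of 0 and 3 is 1.5. */
static void avg_rounding_check(void)
{
	assert(avg_diff(0, 3) == 1);	/* 0 + ( 3 / 2) = 0 + 1 */
	assert(avg_diff(3, 0) == 2);	/* 3 + (-3 / 2) = 3 - 1, rounded up */
	assert(avg_wide(0, 3) == 1);
	assert(avg_wide(3, 0) == 1);	/* rounds down either way */
}
```

So the difference form gives a result that depends on argument order, while the widened add always rounds down.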

I still don't know whether it is appropriate to put it into kernel.h
(because of rounding, and variability when it comes to what type size will
hold the sum of parameters), but for the scheduler, I would use this:

	((unsigned long long)a + b) / 2;

Which gives this on 32-bit:
div:
	movl %edx, %ecx
	xorl %edx, %edx
	pushl %ebx
	xorl %ebx, %ebx
	addl %ecx, %eax
	adcl %ebx, %edx
	popl %ebx
	shrdl $1, %edx, %eax
	shrl %edx
	ret
Rather than this:
div:
	subl $8, %esp
	xorl %ecx, %ecx
	movl %ebx, (%esp)
	movl %edx, %ebx
	movl %esi, 4(%esp)
	xorl %esi, %esi
	subl %eax, %ebx
	sbbl %ecx, %esi
	movl %esi, %ecx
	movl %esi, %edx
	sarl $31, %ecx
	movl %ecx, %edx
	xorl %ecx, %ecx
	shrl $31, %edx
	addl %ebx, %edx
	movl (%esp), %ebx
	adcl %esi, %ecx
	movl 4(%esp), %esi
	addl $8, %esp
	shrdl $1, %ecx, %edx
	addl %edx, %eax
	sarl %ecx
	ret
And it's also slightly better on 64-bit.
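
For reference, a minimal sketch of why the add is widened at all. A plain 32-bit add wraps before the division, while the 64-bit add keeps the carry (the addl/adcl pair in the short listing). The names avg_naive, avg_wide and avg_overflow_check are hypothetical, and uint32_t is assumed as a stand-in for unsigned long on a 32-bit build.

```c
#include <assert.h>
#include <stdint.h>

/* Plain 32-bit add: the sum wraps modulo 2^32 before the division. */
static uint32_t avg_naive(uint32_t a, uint32_t b)
{
	return (a + b) / 2;
}

/* Widened add: the 64-bit sum cannot overflow for 32-bit inputs, and
 * the halved result always fits back into 32 bits. */
static uint32_t avg_wide(uint32_t a, uint32_t b)
{
	return (uint32_t)(((uint64_t)a + b) / 2);
}

static void avg_overflow_check(void)
{
	/* 0xffffffff + 0xffffffff wraps to 0xfffffffe in 32 bits. */
	assert(avg_naive(0xffffffffu, 0xffffffffu) == 0x7fffffffu);
	assert(avg_wide(0xffffffffu, 0xffffffffu) == 0xffffffffu);
	assert(avg_wide(0x80000000u, 0x80000002u) == 0x80000001u);
}
```

To look at the generated code, compiling a non-static version with something like gcc -m32 -O2 -S should do. The exact output depends on compiler version and flags; the listings above appear to come from a regparm build, since the arguments arrive in %eax and %edx.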