[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1287683202.16680.44.camel@pip>
Date: Thu, 21 Oct 2010 10:46:42 -0700
From: Brian Behlendorf <behlendorf1@...l.gov>
To: Oleg Nesterov <oleg@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] Make div64_u64() precise on 32bit platforms
> > > + if (divisor >> 32 == 0) {
> > > + if (dividend >> 32 < divisor) {
> > > + return div_u64_rem(dividend, divisor, &rem);
> > > + } else {
> > > + u0 = dividend & 0xFFFFFFFF;
> > > + quot1 = div_u64_rem(dividend >> 32, divisor, &rem);
> > > + u0 += ((u64)rem << 32);
> > > + quot0 = div_u64_rem(u0, divisor, &rem);
> > > + return (quot1 << 32) + quot0;
> > > + }
> >
> > Looks correct... but I can't understand these complications.
> > Looks like we can just do
> >
> > if ((divisor >> 32) == 0) {
> > div_u64(dividend, divisor);
> > } else {
> > ...
> >
> > No?
The idea here, as described in the formal proof, is to cleanly handle
the overflow case. When I implemented this I assumed the overflow case
would in fact be a problem. To my surprise your right it doesn't seem
to be causing any trouble. In practice I can't find any cases where it
is a problem on i386.
> > I can't understand this "dividend >> 1". It seems to me that
> >
> > quot1 = div_u64(dividend, (divisor << n) >> 32);
> > quot0 = (quot1 << n) >> 32;
> >
> > should be equally correct. Or I missed some overflow?
>
> Thinking more about this with a fresh head, we don't event need quot1,
> unless I missed something. We can do
>
> quot0 = div_u64((dividend << n) >> 32, (divisor << n) >> 32);
>
> instead. Or, better,
>
> n = 32 - __builtin_clzll(divisor);
> quot0 = div_u64(dividend >> n, divisor >> n);
>
> And 32 - clzll == fls.
Once again, the extra complexity was only there to handle to overflow
case.
> So, I think it can be really trivial, see the test-case below,
> seems to work (you need 64bit machine to test).
>
> What do you think? I do not trust my math skills.
I think we should use your simpler version. There's no good reason to
make this more complicated than it needs to be. I haven't been able to
find a test case where your changes get the wrong result.
The updated patch is against linux-2.6.35 and passes all the previous
test cases.
Thanks,
Brian
View attachment "0001-Fix-div64_u64-for-32bit-platforms.patch" of type "text/x-patch" (3573 bytes)
Powered by blists - more mailing lists