linux-kernel - Re: [PATCH] Make div64_u64() precise on 32bit platforms

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1287683202.16680.44.camel@pip>
Date:	Thu, 21 Oct 2010 10:46:42 -0700
From:	Brian Behlendorf <behlendorf1@...l.gov>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] Make div64_u64() precise on 32bit platforms

> > > +	if (divisor >> 32 == 0) {
> > > +		if (dividend >> 32 < divisor) {
> > > +			return div_u64_rem(dividend, divisor, &rem);
> > > +		} else {
> > > +			u0 = dividend & 0xFFFFFFFF;
> > > +			quot1 = div_u64_rem(dividend >> 32, divisor, &rem);
> > > +			u0 += ((u64)rem << 32);
> > > +			quot0 = div_u64_rem(u0, divisor, &rem);
> > > +			return (quot1 << 32) + quot0;
> > > +		}
> >
> > Looks correct... but I can't understand these complications.
> > Looks like we can just do
> >
> > 	if ((divisor >> 32) == 0) {
> > 		div_u64(dividend, divisor);
> > 	} else {
> > 	...
> >
> > No?

The idea here, as described in the formal proof, is to cleanly handle
the overflow case.  When I implemented this I assumed the overflow case
would in fact be a problem.  To my surprise your right it doesn't seem
to be causing any trouble.  In practice I can't find any cases where it
is a problem on i386.

> > I can't understand this "dividend >> 1". It seems to me that
> >
> > 		quot1 = div_u64(dividend, (divisor << n) >> 32);
> > 		quot0 = (quot1 << n) >> 32;
> >
> > should be equally correct. Or I missed some overflow?
> 
> Thinking more about this with a fresh head, we don't event need quot1,
> unless I missed something. We can do
> 
> 		quot0 = div_u64((dividend << n) >> 32, (divisor << n) >> 32);
> 
> instead. Or, better,
> 
> 		n = 32 - __builtin_clzll(divisor);
> 		quot0 = div_u64(dividend >> n, divisor >> n);
> 
> And 32 - clzll == fls.

Once again, the extra complexity was only there to handle to overflow
case.

> So, I think it can be really trivial, see the test-case below,
> seems to work (you need 64bit machine to test).
> 
> What do you think? I do not trust my math skills.

I think we should use your simpler version.  There's no good reason to
make this more complicated than it needs to be.  I haven't been able to
find a test case where your changes get the wrong result.

The updated patch is against linux-2.6.35 and passes all the previous
test cases.

Thanks,
Brian

View attachment "0001-Fix-div64_u64-for-32bit-platforms.patch" of type "text/x-patch" (3573 bytes)