linux-kernel - Re: [PATCH v2 1/2] mul_u64_u64_div

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <5078q5p1-84p2-3p40-p739-r26rs40sss1q@syhkavp.arg>
Date: Fri, 5 Jul 2024 08:21:53 -0400 (EDT)
From: Nicolas Pitre <nico@...xnic.net>
To: Uwe Kleine-König <u.kleine-koenig@...libre.com>
cc: Andrew Morton <akpm@...ux-foundation.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] mul_u64_u64_div_u64: make it precise always

On Fri, 5 Jul 2024, Uwe Kleine-König wrote:

> On Thu, Jul 04, 2024 at 05:16:37PM -0400, Nicolas Pitre wrote:
> > Then enter the loop:
> > 
> >                 carry = n_hi >> 63;
> > 
> > Top bit of n_hi is unset so no carry.
> > 
> >                 shift = carry ? 1 : __builtin_clzll(n_hi);
> > 
> > If n'hi's top bit was set we'd have a shift of 1 with a carry. But here 
> > there is no carry and aligning n_hi to the top left bit requires a shift 
> > of 12.
> > 
> >                 n_hi <<= shift;
> >                 n_hi |= n_lo >> (64 - shift);
> >                 n_lo <<= shift;
> > 
> > So n_hi is now 0xffffffff00000000
> > 
> >                 p -= shift;
> > 
> > Shifting left the dividend reduces the divisor's power.
> > So p is now 64 + 8 - 12 = 60
> > 
> > Then, the crux of the operation:
> > 
> >                 if (carry || (n_hi >= c)) {
> >                         n_hi -= c;
> >                         res |= 1ULL << p;
> >                 }
> > 
> > So... if the divisor fits then we add a 1 to the result and subtract it.
> > n_hi = 0xffffffff00000000 - 0xffffff0000000000 = 0x000000ff00000000
> > res |= 1 << 60
> > 
> > Let's loop again:
> > 
> >                 carry = n_hi >> 63;
> >                 shift = carry ? 1 : __builtin_clzll(n_hi);
> >                 ...
> > 
> > No carry, shift becomes 24, p becomes 60 - 24 = 36 and
> > n_hi becomes 0xff00000000000000.
> > 
> >                 if (carry || (n_hi >= c)) { ...
> > 
> > No carry and n_hi is smaller than c so loop again.
> > 
> >                 carry = n_hi >> 63;
> >                 shift = carry ? 1 : __builtin_clzll(n_hi);
> > 
> > This time we have a carry as the top bit of n_hi is set and we're about 
> > to shift it by 1. p becomes 35 and n_hi becomes 0xfe00000000000000. In 
> > reality it is like having 0x1fe00000000000000 (a 65-bits value) which is 
> > obviously bigger than 0xffffff0000000000. So we can augment the result 
> > and subtract. Thanks to two's complement, we have:
> > 
> > n_hi = 0xfe00000000000000 - 0xffffff0000000000 = 0xfe00010000000000
> > 
> > and 
> > 
> > res = 1 << 60 | 1 << 35
> 
> Oh wow, that part is clever. Before your mail I wondered for a while why
> the right thing happens if carry=1 but n_hi < c.
> 
> > And so on until either n_hi becomes 0 or p would go negative, which 
> > might happen quite quickly in some cases.
> 
> OK, so the loop invariant at the end of each iteration is:
> 
> 	final result = res + (n_hi#n_lo << p) / c
> 
> (with n_hi#n_lo = n_hi << 64 | n_lo), right?

Hmmm maybe. I hardly think in such terms so I can't say for sure if this 
is right.


Nicolas