[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5078q5p1-84p2-3p40-p739-r26rs40sss1q@syhkavp.arg>
Date: Fri, 5 Jul 2024 08:21:53 -0400 (EDT)
From: Nicolas Pitre <nico@...xnic.net>
To: Uwe Kleine-König <u.kleine-koenig@...libre.com>
cc: Andrew Morton <akpm@...ux-foundation.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] mul_u64_u64_div_u64: make it precise always
On Fri, 5 Jul 2024, Uwe Kleine-König wrote:
> On Thu, Jul 04, 2024 at 05:16:37PM -0400, Nicolas Pitre wrote:
> > Then enter the loop:
> >
> > carry = n_hi >> 63;
> >
> > Top bit of n_hi is unset so no carry.
> >
> > shift = carry ? 1 : __builtin_clzll(n_hi);
> >
> > If n'hi's top bit was set we'd have a shift of 1 with a carry. But here
> > there is no carry and aligning n_hi to the top left bit requires a shift
> > of 12.
> >
> > n_hi <<= shift;
> > n_hi |= n_lo >> (64 - shift);
> > n_lo <<= shift;
> >
> > So n_hi is now 0xffffffff00000000
> >
> > p -= shift;
> >
> > Shifting left the dividend reduces the divisor's power.
> > So p is now 64 + 8 - 12 = 60
> >
> > Then, the crux of the operation:
> >
> > if (carry || (n_hi >= c)) {
> > n_hi -= c;
> > res |= 1ULL << p;
> > }
> >
> > So... if the divisor fits then we add a 1 to the result and subtract it.
> > n_hi = 0xffffffff00000000 - 0xffffff0000000000 = 0x000000ff00000000
> > res |= 1 << 60
> >
> > Let's loop again:
> >
> > carry = n_hi >> 63;
> > shift = carry ? 1 : __builtin_clzll(n_hi);
> > ...
> >
> > No carry, shift becomes 24, p becomes 60 - 24 = 36 and
> > n_hi becomes 0xff00000000000000.
> >
> > if (carry || (n_hi >= c)) { ...
> >
> > No carry and n_hi is smaller than c so loop again.
> >
> > carry = n_hi >> 63;
> > shift = carry ? 1 : __builtin_clzll(n_hi);
> >
> > This time we have a carry as the top bit of n_hi is set and we're about
> > to shift it by 1. p becomes 35 and n_hi becomes 0xfe00000000000000. In
> > reality it is like having 0x1fe00000000000000 (a 65-bits value) which is
> > obviously bigger than 0xffffff0000000000. So we can augment the result
> > and subtract. Thanks to two's complement, we have:
> >
> > n_hi = 0xfe00000000000000 - 0xffffff0000000000 = 0xfe00010000000000
> >
> > and
> >
> > res = 1 << 60 | 1 << 35
>
> Oh wow, that part is clever. Before your mail I wondered for a while why
> the right thing happens if carry=1 but n_hi < c.
>
> > And so on until either n_hi becomes 0 or p would go negative, which
> > might happen quite quickly in some cases.
>
> OK, so the loop invariant at the end of each iteration is:
>
> final result = res + (n_hi#n_lo << p) / c
>
> (with n_hi#n_lo = n_hi << 64 | n_lo), right?
Hmmm maybe. I hardly think in such terms so I can't say for sure if this
is right.
Nicolas
Powered by blists - more mailing lists