[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BL2PR07MB2306C87FF3E79B221675F5848D460@BL2PR07MB2306.namprd07.prod.outlook.com>
Date: Sun, 12 Feb 2017 11:27:16 +0000
From: "Mintz, Yuval" <Yuval.Mintz@...ium.com>
To: Richard Cochran <richardcochran@...il.com>
CC: "davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"Kalluru, Sudarsana" <Sudarsana.Kalluru@...ium.com>
Subject: RE: [PATCH net-next v4 1/2] qed: Add infrastructure for PTP support.
> On Sat, Feb 11, 2017 at 09:58:10AM +0100, Richard Cochran wrote:
> > If I am not mistaken, then you can skip the cases val==2 and val==3,
> > because they are equivalent to val==4 and 6.
>
> I took a stab at this, and you can see the result, below. My version has
> lower average error than yours in the interval 1 < ppb < 60000, and it uses
> only 8 64-bit divisions. Outside of that interval, your version has lower error.
>
> So, at the very least, you should introduce a threshold and use this
> algorithm for adjustments under 60 ppm. Better yet, find a way to use fewer
> divisions for adjustments greater and 60 ppm...
>
> Thanks,
> Richard
>
> ---
> #include <stdint.h>
>
> #define TEN9 (1000000000UL)
>
> unsigned int calc_min_integer(uint64_t ppb, uint64_t *M) {
> uint64_t err, m, min, n, N, p2, reg;
>
> min = TEN9;
> for (n = 4; n <= 7; n++) {
> m = n * TEN9;
> m = (m + ppb/2) / ppb;
>
> /*truncate to HW resolution*/
> reg = (m + 8) / 16;
> m = reg * 16;
>
> p2 = (n * TEN9 + m/2) / m;
>
> err = ppb > p2 ? ppb - p2 : p2 - ppb;
>
> if (min >= err) {
> min = err;
> N = n;
> *M = m;
> }
> }
> return N;
> }
Richard, there are quite a bit of inaccuracies in the calculation here.
I think it's best I'll clarify some things about the used algorithm -
Given a fraction represented by (ppb / 10^9), the purpose of
the loop is to best approximate it by a (val / ((16 * period) + 8)
fraction, where val belongs to {0,..,7} and period to {0,...,0xffffff}.
We'd need 'val' and 'period' from the calculation to configure the HW.
The '+8' is not some sort of rounding correction, but rather part
of the required configuration.
Notice that if we wouldn't have needed it, we could have
reduced some of the iterations [since (val / 16 * period) can be
reduced for powers of two]. But given that we do, skipping those
iteration would cause us to lose accuracy.
Your suggestion seems to:
a. Assume that the required period should be in ns, not in
16*ns units.
b. mishandles the +8/-8 in the calculation.
c. Doesn't seem to consider the upper bound on period.
In addition, the rounding-up via ppb is likely to harm accuracy for
higher ppb values.
One thing I still don't get is *why* we're trying to optimize this
area of the code -
I ran a variant of the original code in a loop [on requiring 24
divisions per Call] on various PPB values, and managed close
to 1M/s [single CPU, 64bit, 2.50GHz].
Given that the real use-case is somewhere between 1-16/second,
why do we treat it as if its performance is crucial?
Thanks,
Yuval
Powered by blists - more mailing lists