netdev - RE: [PATCH net-next v4 1/2] qed: Add infrastructure for PTP support.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <BL2PR07MB2306C87FF3E79B221675F5848D460@BL2PR07MB2306.namprd07.prod.outlook.com>
Date:   Sun, 12 Feb 2017 11:27:16 +0000
From:   "Mintz, Yuval" <Yuval.Mintz@...ium.com>
To:     Richard Cochran <richardcochran@...il.com>
CC:     "davem@...emloft.net" <davem@...emloft.net>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "Kalluru, Sudarsana" <Sudarsana.Kalluru@...ium.com>
Subject: RE: [PATCH net-next v4 1/2] qed: Add infrastructure for PTP support.

> On Sat, Feb 11, 2017 at 09:58:10AM +0100, Richard Cochran wrote:
> > If I am not mistaken, then you can skip the cases val==2 and val==3,
> > because they are equivalent to val==4 and 6.
> 
> I took a stab at this, and you can see the result, below.  My version has
> lower average error than yours in the interval 1 < ppb < 60000, and it uses
> only 8 64-bit divisions.  Outside of that interval, your version has lower error.
> 
> So, at the very least, you should introduce a threshold and use this
> algorithm for adjustments under 60 ppm.  Better yet, find a way to use fewer
> divisions for adjustments greater and 60 ppm...
> 
> Thanks,
> Richard
> 
> ---
> #include <stdint.h>
> 
> #define TEN9		(1000000000UL)
> 
> unsigned int calc_min_integer(uint64_t ppb, uint64_t *M) {
> 	uint64_t err, m, min, n, N, p2, reg;
> 
> 	min = TEN9;
> 	for (n = 4; n <= 7; n++) {
> 		m = n * TEN9;
> 		m = (m + ppb/2) / ppb;
> 
> 		/*truncate to HW resolution*/
> 		reg = (m + 8) / 16;
> 		m = reg * 16;
> 
> 		p2 = (n * TEN9 + m/2) / m;
> 
> 		err = ppb > p2 ? ppb - p2 : p2 - ppb;
> 
> 		if (min >= err) {
> 			min = err;
> 			N = n;
> 			*M = m;
> 		}
> 	}
> 	return N;
> }

Richard, there are quite a bit of inaccuracies in the calculation here.
I think it's best I'll clarify some things about the used algorithm -

Given a fraction represented by (ppb / 10^9), the purpose of
the loop is to best approximate it by a (val / ((16 * period) + 8)
fraction, where val belongs to {0,..,7} and period to {0,...,0xffffff}.
We'd need 'val' and 'period' from the calculation to configure the HW.

The '+8' is not some sort of rounding correction, but rather part
of the required configuration.
Notice that if we wouldn't have needed it, we could have
reduced some of the iterations [since (val / 16 * period) can be
reduced for powers of two]. But given that we do, skipping those
iteration would cause us to lose accuracy.

Your suggestion seems to:
  a. Assume that the required period should be in ns, not in
      16*ns units.
  b. mishandles the +8/-8 in the calculation.
  c. Doesn't seem to consider the upper bound on period.
In addition, the rounding-up via ppb is likely to harm accuracy for
higher ppb values.

One thing I still don't get is *why* we're trying to optimize this
area of the code -
I ran a variant of the original code in a loop [on requiring 24
divisions per Call] on various PPB values, and managed close
to 1M/s [single CPU, 64bit, 2.50GHz].

Given that the real use-case is somewhere between 1-16/second,
why do we treat it as if its performance is crucial?

Thanks,
Yuval