linux-kernel - Re: Linux 2.6.29-rc6

Date:	Tue, 17 Mar 2009 17:40:51 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Jesper Krogh <jesper@...gh.cc>,
	john stultz <johnstul@...ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Len Brown <len.brown@...el.com>
Subject: Re: Linux 2.6.29-rc6


* Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Tue, 17 Mar 2009, Ingo Molnar wrote:
> > 
> > That's the idea of my patch: to use not two endpoints but thousands 
> > of measurement points.
> 
> Umm. Except you don't.
> 
> > By measuring more we can get a more precise result, and we also do 
> > not assume anything about how much time passes between two 
> > measurement points.
> 
> That's fine, but your actual code doesn't _do_ that.
> 
> > My 'delta' algorithm does not assume anything about how much time 
> > passes between two measurement points - it calculates the slope and 
> > keeps a rolling average of that slope.
> 
> No, you keep a very bad measure of "some kind of random average of the 
> last few points", which - if I read things right:
> 
>  - lacks precision (you really need to use 'double' floating point to do 
>    it well, otherwise the rounding errors will kill you). You seem to be 
>    aiming for a 10-bit fixed point thing, which may or may not work if 
>    done cleverly, but:
> 
>  - seems to be based on a rather weak averaging function which certainly 
>    will lose data over time.
> 
> The thing is, the only _accurate_ average is the one done over 
> long time distances. It's very true that your slope thing works 
> very well over such long times, and you'd get accurate measurement 
> if you did it that way, BUT THAT IS NOT WHAT YOU DO. You have a 
> very tight loop, so you get very bad slopes, and then you use a 
> weak averaging function to try to make them better, but it never 
> does.

Hm, the intention there was to have a memory of ~1000 entries via a 
decaying average of 1:1000.

In parallel to that there's also a noise estimator (which too decays 
over time). So basically when observed noise is very low we 
essentially use the data from the last ~1000 measurements. (well, 
not exactly - as the 'memory' of more recent data will be stronger 
than that of older ones.)
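
To illustrate the kind of decaying average described above, here is a
minimal user-space sketch. It is not the code from the patch: the struct,
the function name, the use of 'double', and the mean-absolute-deviation
noise estimate are all assumptions made for illustration. A weight of
1.0/1000 gives the "memory of ~1000 entries" behaviour, with recent
samples weighted more strongly than older ones.

```c
#include <assert.h>

/* Hypothetical state for an exponentially decaying average (illustrative only). */
struct decay_avg {
	double mean;	/* decaying average of the samples */
	double noise;	/* decaying average of |sample - mean| */
	int primed;	/* set once the first sample has been seen */
};

/*
 * Fold one sample into the average. 'weight' is the share given to the
 * new sample, e.g. 1.0/1000 for a 1:1000 decay.
 */
static void decay_avg_update(struct decay_avg *a, double sample, double weight)
{
	double dev;

	assert(weight > 0.0 && weight <= 1.0);

	if (!a->primed) {
		/* The first sample seeds the average directly. */
		a->mean = sample;
		a->noise = 0.0;
		a->primed = 1;
		return;
	}

	/* Deviation of this sample from the current average. */
	dev = sample - a->mean;
	if (dev < 0.0)
		dev = -dev;

	/* Both the average and the noise estimate decay at the same rate. */
	a->mean += (sample - a->mean) * weight;
	a->noise += (dev - a->noise) * weight;
}
```

With weight 1.0/1000, an average seeded at 0.0 and then fed 1000 samples
of 1.0 has moved roughly 63% of the way to 1.0, which is why the "~1000
entries" figure is only approximate.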

Again ... it's a clearly non-working patch so it's not really a 
defensible concept :-)

> Also, there seems to be a fundamental bug in your PIT reading 
> routine. My fast-TSC calibration only looks at the MSB of the PIT 
> read for a very good reason: if you don't use the explicit LATCH 
> command, you may be getting the MSB of one counter value, and then 
> the LSB of another. So your PIT read can easily be off by ~256 PIT 
> cycles. Only by caring only for the MSB can you do an unlatched 
> read!
> 
> That is why pit_expect_msb() looks for the "edge" where the MSB 
> changes, and never actually looks at the LSB.
> 
> This issue may be an additional reason for your problems, although 
> maybe your noise correction will be able to avoid those cases.
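
The unlatched-read hazard described in the quoted text can be shown with a
small simulation. This is only a sketch: it models the PIT as a plain
16-bit down counter in user space and does no port I/O, and the struct and
function names are hypothetical, not the kernel's. It assumes the LSB is
read first and the counter keeps ticking before the MSB is read.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated 16-bit down counter standing in for the PIT (no real hardware access). */
struct sim_pit {
	uint16_t count;
};

/*
 * Unlatched 16-bit read: the LSB is sampled, the counter ticks down by
 * 'ticks_between', and only then is the MSB sampled. The two bytes can
 * therefore come from different counter values.
 */
static uint16_t sim_unlatched_read16(struct sim_pit *p, uint16_t ticks_between)
{
	uint8_t lsb, msb;

	lsb = (uint8_t)(p->count & 0xff);
	p->count = (uint16_t)(p->count - ticks_between);
	msb = (uint8_t)(p->count >> 8);

	return (uint16_t)((msb << 8) | lsb);
}

/*
 * MSB-only read: the LSB is skipped, so the result is always the high
 * byte of one single counter value and cannot be torn.
 */
static uint8_t sim_read_msb_only(struct sim_pit *p, uint16_t ticks_between)
{
	p->count = (uint16_t)(p->count - ticks_between);
	return (uint8_t)(p->count >> 8);
}
```

Starting from a count of 0x1200 with one tick between the two byte reads,
the combined value comes out as 0x1100 while the counter is really at
0x11ff, an error of 255 PIT cycles. The MSB-only read returns 0x11, which
matches the counter.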

Indeed. I did check the trace results though via gnuplot yesterday 
(suspecting PIT readout outliers) and there were no outliers.

For any final patch it's still a showstopper issue.

But the source of error and miscalibration is elsewhere.

	Ingo