linux-kernel - Re: [PATCH] clocksource, prevent overflow in clocksource

Message-ID: <alpine.LFD.2.02.1204070107370.2542@ionos>
Date:	Sat, 7 Apr 2012 01:29:59 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Prarit Bhargava <prarit@...hat.com>
cc:	John Stultz <johnstul@...ibm.com>, linux-kernel@...r.kernel.org,
	Salman Qazi <sqazi@...gle.com>, stable@...nel.org
Subject: Re: [PATCH] clocksource, prevent overflow in clocksource_cyc2ns

On Thu, 5 Apr 2012, Prarit Bhargava wrote:

> > 
> > So what kernel version are you using?
> 
> I retested using top of the linux.git tree, running
> 
> echo 1 > /proc/sys/kernel/sysrq
> for i in `seq 10000`; do sleep 1000 & done
> echo t > /proc/sysrq-trigger
> 
> and I no longer see a problem.  However, if I increase the number of threads to
> 1000/cpu I get
> 
> Clocksource %s unstable (delta = -429565427)
> Clocksource switching to hpet

You are issuing a command which puts the kernel into a state where it
dumps data for several seconds with interrupts disabled. And you expect that
everything can cope with that?

> If I hack in (sorry for the cut-and-paste)
> ....
> +               cs_nsec = mult_frac((csnow - cs->cs_last), cs->mult,
> +                                   1UL << cs->shift);
> 
> -               cs_nsec = clocksource_cyc2ns((csnow - cs->cs_last) &
> -                                            cs->mask, cs->mult, cs->shift);
> then I don't see unstable messages.

That does not make your approach more correct. The HPET wraparound
time is ~3 seconds, so you have already screwed everything up once your
dump lasts longer than that. And there are clocksources which wrap way
faster.

No, you can't fix that by hacking the timer code. A wraparound CANNOT
be fixed by hacks.
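
To illustrate why a wraparound cannot be recovered after the fact, here is a minimal user-space sketch. It assumes a hypothetical 32-bit free-running counter; COUNTER_MASK, read_counter() and masked_delta() are illustration names, not kernel code. The masked delta is correct as long as less than one full wrap has elapsed, and silently loses every complete wrap beyond that, no matter how the delta is converted to nanoseconds afterwards.

```c
#include <stdint.h>

/* Hypothetical 32-bit free-running counter (assumption for illustration). */
#define COUNTER_MASK 0xffffffffULL

/* Simulate a counter read after `elapsed` true cycles since value `start`. */
static uint64_t read_counter(uint64_t start, uint64_t elapsed)
{
	return (start + elapsed) & COUNTER_MASK;
}

/* Masked delta between two reads, in the style of (now - last) & mask. */
static uint64_t masked_delta(uint64_t now, uint64_t last)
{
	return (now - last) & COUNTER_MASK;
}
```

With last = 0xfffffff0, a true elapsed time of 0x20 cycles and one of 0x100000020 cycles (one full wrap more) produce the same counter value, so both yield a masked delta of 0x20. The information about the extra wrap is gone before any conversion code runs.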

So instead of fiddling with the victims, please fix the root cause,
i.e. that stupid sysrq-t code which should not need to have interrupts
disabled just to dump all that state. If that's not possible, send a
patch to the sysrq documentation and warn about the consequences.

But stay away from code which is correct already. You CANNOT fix a
problem which is caused by abnormal system state by hacking the code
which is exposing the problem.

All you are doing is making hot paths more expensive for very dubious
value. The time related calls are hotpath functions and are optimized.

Aside from that you are breaking all architectures which do not have a
native 64/32 instruction.
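
For readers comparing the two conversions, the following is a user-space sketch, not the kernel source. cyc2ns() is modeled on the multiply-and-shift form of clocksource_cyc2ns(), and mult_frac_u64() is a hypothetical function-style rendering of the divide-based mult_frac() idea used in the quoted hack. The first needs one 64-bit multiply and one shift; the second needs a 64-bit division and a modulo, which is what makes it more expensive in a hot path and a problem where no suitable divide instruction exists.

```c
#include <stdint.h>

/* Multiply-and-shift conversion: cheap, but cycles * mult can overflow 64 bits. */
static inline int64_t cyc2ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (int64_t)((cycles * mult) >> shift);
}

/*
 * Divide-based conversion (hypothetical helper): computes x * numer / denom
 * without overflowing the intermediate product, at the cost of a 64-bit
 * division and a 64-bit modulo.
 */
static inline uint64_t mult_frac_u64(uint64_t x, uint64_t numer, uint64_t denom)
{
	uint64_t quot = x / denom;
	uint64_t rem = x % denom;

	return quot * numer + (rem * numer) / denom;
}
```

For small deltas both forms agree: with mult = 2048 and shift = 10, 1000 cycles convert to 2000 in either case. They diverge only once cycles * mult no longer fits in 64 bits, which is the abnormal situation discussed above.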

This mult_frac stuff is not going to happen, period.

Thanks,

	tglx