linux-kernel - Re: [PATCH 06/10] time: Cap clocksource reads to the clocksource max

Date:	Mon, 12 Jan 2015 10:54:50 -0800
From:	John Stultz <john.stultz@...aro.org>
To:	Richard Cochran <richardcochran@...il.com>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Dave Jones <davej@...emonkey.org.uk>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Prarit Bhargava <prarit@...hat.com>,
	Stephen Boyd <sboyd@...eaurora.org>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 06/10] time: Cap clocksource reads to the clocksource
 max_cycles value

On Sun, Jan 11, 2015 at 4:41 AM, Richard Cochran
<richardcochran@...il.com> wrote:
> On Fri, Jan 09, 2015 at 04:34:24PM -0800, John Stultz wrote:
>> When calculating the current delta since the last tick, we
>> currently have no hard protections to prevent a multiplication
>> overflow from occurring.
>
> This is just papering over the problem. The "hard protection" should
> be having a tick scheduled before the range of the clock source is
> exhausted.

So I disagree that this is papering over the problem.

You say the tick should be scheduled before the clocksource wraps -
but we have logic to do that.

However, there are many ways that can still go wrong.  Virtualization
can delay interrupts for long periods of time, the timer/irq code
isn't the simplest and can have bugs, and the timer hardware itself
can have issues. The difficulty is that when something has gone wrong,
the only thing we have to measure the problem with may itself become
corrupted.  And worse, once the timekeeping code is having problems,
that can result in bugs that manifest in all sorts of strange ways
that are very difficult to debug (you can't trust your log
timestamps, etc).

So I think having some extra measures of protection is useful here.
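
To make the kind of protection under discussion concrete, here is a
minimal userspace sketch of capping the cycle delta before the
mult/shift conversion. It is not the code from the patch: the struct
and the function names (clocksource_sketch, capped_delta, delta_to_ns)
are hypothetical stand-ins, and max_cycles is assumed to be the
largest delta for which delta * mult still fits in 64 bits.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant clocksource fields. */
struct clocksource_sketch {
	uint64_t mask;       /* valid bits of the hardware counter */
	uint32_t mult;       /* cycles -> ns multiplier */
	uint32_t shift;      /* cycles -> ns shift */
	uint64_t max_cycles; /* assumed: largest delta where delta * mult fits in 64 bits */
};

/* Cycles elapsed since the last update, clamped to max_cycles. */
static uint64_t capped_delta(const struct clocksource_sketch *cs,
			     uint64_t now, uint64_t last)
{
	/* Masked subtraction handles a counter that wrapped once. */
	uint64_t delta = (now - last) & cs->mask;

	/* Cap the delta so the multiplication below cannot overflow. */
	if (delta > cs->max_cycles)
		delta = cs->max_cycles;

	return delta;
}

/* Convert the capped delta to nanoseconds. */
static uint64_t delta_to_ns(const struct clocksource_sketch *cs,
			    uint64_t now, uint64_t last)
{
	return (capped_delta(cs, now, last) * cs->mult) >> cs->shift;
}
```

With the cap in place, a late tick still loses time (the delta is
clamped), but the result stays bounded instead of becoming an
overflowed, effectively arbitrary value.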

I'll admit that it's always difficult to manage, since we have to layer
our checks, we have circular dependencies (timer code needs
timekeeping to be correct, timekeeping code needs timer code to be
correct), and hardware problems are rampant. So we get trouble like
the clocksource watchdog, which uses more trustworthy clocksources to
watch less trustworthy ones, but then hardware starts adding bugs to
the trustworthy ones, which causes false positives, etc.  And these
checks add complexity to the code that we'd be happier
without, but we can't throw out support for the majority of hardware
that has some quirk or imperfection, so I'm not sure what the
alternative should be.

thanks
-john