linux-kernel - Re: [PATCH] timekeeping: Change type of nsec variable to unsigned in its calculation.


Date:   Thu, 1 Dec 2016 13:19:00 -0800
From:   John Stultz <john.stultz@...aro.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     David Gibson <david@...son.dropbear.id.au>,
        lkml <linux-kernel@...r.kernel.org>,
        Liav Rehana <liavr@...lanox.com>,
        Chris Metcalf <cmetcalf@...lanox.com>,
        Richard Cochran <richardcochran@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        Prarit Bhargava <prarit@...hat.com>,
        Laurent Vivier <lvivier@...hat.com>,
        "Christopher S . Hall" <christopher.s.hall@...el.com>,
        "4.6+" <stable@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] timekeeping: Change type of nsec variable to unsigned in
 its calculation.

On Thu, Dec 1, 2016 at 12:46 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
> On Thu, 1 Dec 2016, John Stultz wrote:
>> I would also suggest:
>> 3) If the systems are halted for longer than the timekeeping core
>> expects, the system will "miss" or "lose" some portion of that halted
>> time, but otherwise the system will function properly.  Which is the
>> result with this patch.
>
> Wrong. This is not the result with this patch.
>
> If the time advances enough to overflow the unsigned mult, which is
> entirely possible as it takes just twice the time of the negative overflow,
> then time will go backwards again and that's not 'miss' or 'lose', that's
> just broken.

Eh? If you overflow the 64 bits on the mult, the shift (which is likely
large if you're actually hitting the overflow) brings the value back
down to a smaller value. Time doesn't go backwards, it's just smaller
than it ought to be (since the high bits were lost).
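
To make the arithmetic concrete, here is a minimal sketch in Python
that models the 64-bit wraparound of the cycles-to-nanoseconds
conversion, (delta * mult) >> shift. The function names and the
mult/shift values are illustrative assumptions, not taken from the
kernel source or from any real clocksource.

```python
MASK64 = (1 << 64) - 1

def cyc_to_ns_unsigned(delta, mult, shift):
    """Model (u64)(delta * mult) >> shift: high bits are dropped on overflow."""
    return ((delta * mult) & MASK64) >> shift

def cyc_to_ns_signed(delta, mult, shift):
    """Model the same 64-bit product reinterpreted as s64 before shifting."""
    prod = (delta * mult) & MASK64
    if prod >= 1 << 63:      # sign bit set: reinterpret as a negative value
        prod -= 1 << 64
    return prod >> shift     # Python's >> on negative ints is arithmetic

# Assumed values: mult = 2**24 and shift = 24, i.e. one cycle per nanosecond.
MULT, SHIFT = 1 << 24, 24

for delta in (1 << 38, 1 << 39, (1 << 40) + 5):
    print(delta,
          cyc_to_ns_signed(delta, MULT, SHIFT),
          cyc_to_ns_unsigned(delta, MULT, SHIFT))
```

With these assumed values the signed variant turns negative at 2^39
cycles, while the unsigned variant stays correct up to just below 2^40
cycles. Past that point the unsigned result is non-negative but far
too small: a delta of 2^40 + 5 cycles converts to 5 ns.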

> If we want to prevent that, then we either have to clamp the delta value,
> which is the worst choice or use 128bit math to avoid the overflow.

I'm not yet convinced that either of these approaches is really needed.
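
For reference, the two options mentioned in the quoted text (clamping
the delta, or widening the intermediate product to 128 bits) can be
sketched as follows. This is a Python model under assumed names and
values, not kernel code; max_safe_delta(), cyc_to_ns_clamped() and
cyc_to_ns_wide() are hypothetical helpers.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

def max_safe_delta(mult):
    """Largest cycle delta whose product with mult still fits in 64 bits."""
    return MASK64 // mult

def cyc_to_ns_clamped(delta, mult, shift):
    """Clamp delta so the 64-bit product cannot wrap; the excess time is lost."""
    delta = min(delta, max_safe_delta(mult))
    return ((delta * mult) & MASK64) >> shift

def cyc_to_ns_wide(delta, mult, shift):
    """Model a 128-bit intermediate product, so no high bits are dropped."""
    return ((delta * mult) & MASK128) >> shift

# Assumed values: mult = 2**24 and shift = 24, i.e. one cycle per nanosecond.
MULT, SHIFT = 1 << 24, 24
delta = (1 << 40) + 5        # just past the 64-bit overflow point

print(cyc_to_ns_clamped(delta, MULT, SHIFT))  # capped at the largest safe value
print(cyc_to_ns_wide(delta, MULT, SHIFT))     # full value preserved
```

In this model the clamped version reports 2^40 - 1 ns and silently
drops the rest, while the wide version reports the full 2^40 + 5 ns at
the cost of a wider multiply.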

>> I'm not sure if it's really worth trying to recover that time or be
>> perfect in those situations. Especially since on narrow clocksources
>> you'll have the same result.
>
> We can deal with the 64bit overflow at least for wide clocksources which
> all virtualization infected architectures provide in a sane way.

Another approach would be to push back on the virtualization
environments to step in and virtualize a solution if they've idled a
guest for too long. They could do what the old tick-based
virtualization environments used to do: trigger a few timer interrupts
while slowly removing a fake negative clocksource offset, allowing time
to catch up more normally after a long stall.

Or they could require clocksources that have smaller shift values to
allow longer idle periods.
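
A rough sketch of why a smaller shift buys a longer idle period: for a
fixed clock frequency, mult scales with 2^shift, and the 64-bit product
overflows once delta exceeds 2^64 / mult. The helpers below are
hypothetical, and the 1 GHz frequency and the shift values are assumed
purely for illustration.

```python
NSEC_PER_SEC = 1_000_000_000
MASK64 = (1 << 64) - 1

def mult_for(freq_hz, shift):
    """Rounded mult such that (cycles * mult) >> shift approximates ns."""
    return ((NSEC_PER_SEC << shift) + freq_hz // 2) // freq_hz

def max_idle_seconds(freq_hz, shift):
    """Whole seconds of cycles that convert without 64-bit overflow."""
    max_cycles = MASK64 // mult_for(freq_hz, shift)
    return max_cycles // freq_hz

# Assumed 1 GHz clocksource, compared at two shift values.
for shift in (24, 10):
    print(shift, mult_for(NSEC_PER_SEC, shift),
          max_idle_seconds(NSEC_PER_SEC, shift))
```

Under these assumptions, shift = 24 overflows after roughly 1099
seconds of idle time, whereas shift = 10 lasts about 18 million
seconds.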

> For bare metal systems with narrow clocksources the whole issue is
> nonexistent; we can make the 128bit math depend on both a config switch and a
> static key, so bare metal will not have to take the burden.

Bare metal machines also sometimes run virtualization. I'm not sure
the two are usefully exclusive.

thanks
-john