Message-ID: <CALAqxLVYpYUXzP9yFnR0px3LCK9ktWnCtOcquvhwR0r0jo9B6g@mail.gmail.com>
Date: Tue, 15 Nov 2016 17:10:53 -0800
From: John Stultz <john.stultz@...aro.org>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Chris Metcalf <cmetcalf@...lanox.com>,
Laurent Vivier <lvivier@...hat.com>,
David Gibson <david@...son.dropbear.id.au>,
"Christopher S . Hall" <christopher.s.hall@...el.com>,
lkml <linux-kernel@...r.kernel.org>,
Liav Rehana <liavr@...lanox.com>
Subject: Re: [PATCH] time: Avoid signed overflow in timekeeping_delta_to_ns()
On Tue, Nov 15, 2016 at 2:03 PM, John Stultz <john.stultz@...aro.org> wrote:
> On Tue, Nov 15, 2016 at 1:53 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> On Mon, 14 Nov 2016, John Stultz wrote:
>>
>>> On Mon, Nov 14, 2016 at 11:42 AM, Chris Metcalf <cmetcalf@...lanox.com> wrote:
>>> > This bugfix was originally made in commit 35a4933a8959 ("time:
>>> > Avoid signed overflow in timekeeping_get_ns()"). When the code was
>>> > refactored in commit 6bd58f09e1d8 ("time: Add cycles to nanoseconds
>>> > translation") the signed overflow fix was lost. Re-introduce it.
>>> >
>>> > Signed-off-by: Chris Metcalf <cmetcalf@...lanox.com>
>>> > ---
>>> > I happened to be looking for an unrelated fix, found this code,
>>> > realized the tip code didn't match the fixed code, and
>>> > backtracked to where it had gone away.
>>> >
>>> > kernel/time/timekeeping.c | 3 +--
>>> > 1 file changed, 1 insertion(+), 2 deletions(-)
>>> >
>>> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
>>> > index 37dec7e3db43..57926bc7b7f3 100644
>>> > --- a/kernel/time/timekeeping.c
>>> > +++ b/kernel/time/timekeeping.c
>>> > @@ -304,8 +304,7 @@ static inline s64 timekeeping_delta_to_ns(struct tk_read_base *tkr,
>>> > {
>>> > s64 nsec;
>>> >
>>> > - nsec = delta * tkr->mult + tkr->xtime_nsec;
>>> > - nsec >>= tkr->shift;
>>> > + nsec = (delta * tkr->mult + tkr->xtime_nsec) >> tkr->shift;
>>>
>>> Ugh.
>>>
>>> So... I think this proves the original fix was *far* too subtle to
>>> maintain. So I think reintroducing it as-is doesn't protect us from
>>> undoing it. If the problem is really using the signed intermediate
>>> nsec value, we should get rid of that.
>>
>> As I told the other guy who submitted something similar: This is not really
>> helpful. It merely drags the overflow case out by a factor of 2.
>
> Well... So lost time (where a VM/gdb caused stall runs past the
> clocksource or causes a mult overflow) is a bit less problematic than
> getting a huge negative nsec value.
>
>> So we really need to figure out under which circumstances this can happen
>> and fixup either the callsites or detect the condition right there, which
>> I'd like to avoid for the hotpath.
>
> I get catching the (delta > TOOBIG) case, but even then I'm not
> sure how we deal with that condition in a way that results in anything
> meaningfully different from the less-problematic unsigned overflow
> (ie, capping it).

So I think I'm going to queue up Liav's fix here, as it has been in my
TOQUEUE folder for a bit longer.

Thomas: I know you didn't like it when it was originally submitted,
preferring to catch the case when it happens, but the signed shift is
more problematic. Additionally, the CONFIG_DEBUG_TIMEKEEPING checks
should already warn on the next tick when this case triggers (when the
offset is larger than max_cycles).
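
To illustrate why the signed shift is the worse failure mode, here is a
minimal userspace sketch. It is not the kernel code: struct tkr_sketch is a
hypothetical stand-in for the tk_read_base fields involved, delta is assumed
to be an unsigned 64-bit cycle count, and the signed variant relies on the
usual two's-complement conversion and arithmetic right shift that gcc and
clang provide (both are implementation-defined in C).

```c
#include <stdint.h>

/* Hypothetical stand-in for the tk_read_base fields used below. */
struct tkr_sketch {
	uint32_t mult;
	uint32_t shift;
	uint64_t xtime_nsec;
};

/*
 * Two-step form: the unsigned product is stored in a signed variable
 * first, so a product with bit 63 set turns negative and the shift
 * then keeps the sign.
 */
static int64_t delta_to_ns_signed(const struct tkr_sketch *tkr, uint64_t delta)
{
	int64_t nsec;

	nsec = (int64_t)(delta * tkr->mult + tkr->xtime_nsec);
	nsec >>= tkr->shift;
	return nsec;
}

/*
 * One-step form: the shift is applied while the value is still
 * unsigned, so bit 63 is treated as magnitude, not as a sign.
 */
static int64_t delta_to_ns_unsigned(const struct tkr_sketch *tkr, uint64_t delta)
{
	return (int64_t)((delta * tkr->mult + tkr->xtime_nsec) >> tkr->shift);
}
```

With mult = 1 << 23, shift = 23 and delta = 1ULL << 40 the product is
exactly 2^63: the signed variant returns a negative value, the unsigned one
returns 2^40. At delta = 1ULL << 41 the product wraps to 0 in both variants,
which is the factor-of-2 limit mentioned above.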

Sound ok?

thanks
-john