[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150605075734.GA25340@gmail.com>
Date: Fri, 5 Jun 2015 09:57:34 +0200
From: Ingo Molnar <mingo@...nel.org>
To: John Stultz <john.stultz@...aro.org>
Cc: Jeremiah Mahler <jmmahler@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Viresh Kumar <viresh.kumar@...aro.org>,
Marcelo Tosatti <mtosatti@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [BUG, bisect] hrtimer: severe lag after suspend & resume
* John Stultz <john.stultz@...aro.org> wrote:
> So I suspect the problem is the change to clock_was_set_seq in
> timekeeping_update is done prior to mirroring the time state to the
> shadow-timekeeper. Thus the next time we do update_wall_time() the updated
> sequence is overwritten by whats in the shadow copy. The attached patch moving
> the modification up seems to avoid the issue for me.
>
> Thomas: Looking at the problematic change, I'm not a big fan of it. Caching
> timekeeping state here in the hrtimer code has been a source of bugs in the
> past, and I'm not sure I see how avoiding copying 24bytes is that big of a win.
> Especially since it adds more state to the timekeeper and hrtimer base that we
> have to read and mange. Personally I'd prefer a revert to my fix.
So it's not really the copying of the 24 bytes that is the problem with that
pattern.
It's the constant dirtying of a cacheline that would otherwise be read-mostly, and
then the reading of it from the percpu hrtimer interrupts.
That can have negative scalability effects similar to a global lock.
( Now I have not checked whether this cacheline truly becomes read-mostly after
the original commit - I suspect Thomas did. )
> + if (action & TK_CLOCK_WAS_SET)
> + tk->clock_was_set_seq++;
> +
> tk_update_ktime_data(tk);
So I'd also add a comment that this update should be done before:
if (action & TK_MIRROR)
memcpy(&shadow_timekeeper, &tk_core.timekeeper,
sizeof(tk_core.timekeeper));
Also, there appears to be a layering violation here (unless I mis-read the
tk_real/tk logic): this should copy 'tk', not access tk_core directly, correct?
Right now this does not matter, because 'tk == &tk_core' is always supposed to be
true, or at least it should point to an identical shadow copy, right?
But nevertheless tk_core should only ever be directly referenced when 'tk' is
initialized from it.
Also, there's a few more of these apparent layering violations:
git grep 'tk_core' kernel/time/timekeeping.c | grep -v 'struct timekeeper'
I realize that we are transitioning from former global variables based timekeeping
to a more parametric model, so this isn't a complaint, just an observation.
Right now it's a code cleanliness detail, but if/when we introduce clocks that are
updated independently from each other then it will also matter functionally.
Thanks,
Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists