Message-ID: <CALAqxLUAXDHimaqA0mqUNcY0inOZGRf92SxSbX8dDMzJUBRvmQ@mail.gmail.com>
Date:	Thu, 4 Jun 2015 15:54:35 -0700
From:	John Stultz <john.stultz@...aro.org>
To:	Jeremiah Mahler <jmmahler@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	John Stultz <john.stultz@...aro.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: [BUG, bisect] hrtimer: severe lag after suspend & resume

On Wed, Jun 3, 2015 at 5:56 PM, Jeremiah Mahler <jmmahler@...il.com> wrote:
> all,
>
> After a fresh boot, the Chrome web browser behaves normally.  Pages
> load quickly and scroll fast.  Even image heavy sites such as
> images.google.com work fine.  However, after a suspend and resume
> cycle, Chrome becomes very slow.  Pages take ten seconds or more to
> load.  The scroll bars and buttons are almost completely
> unresponsive.  Interestingly, I can run Firefox on the same sites
> and it has no issue whatsoever.
>
> I have bisected the kernel and found that the following commit
> introduced the bug.  It is present in the latest linux-next (20150602).
>
>   From 868a3e915f7f5eba8f8cb4f7da2276760807c51c Mon Sep 17 00:00:00 2001
>   From: Thomas Gleixner <tglx@...utronix.de>
>   Date: Tue, 14 Apr 2015 21:08:37 +0000
>   Subject: [PATCH] hrtimer: Make offset update smarter
>
>   On every tick/hrtimer interrupt we update the offset variables of the
>   clock bases. That's silly because these offsets change very seldom.
>
>   Add a sequence counter to the time keeping code which keeps track of
>   the offset updates (clock_was_set()). Have a sequence cache in the
>   hrtimer cpu bases to evaluate whether the offsets must be updated or
>   not. This allows us later to avoid pointless cacheline pollution.
>
>   Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
>   Reviewed-by: Preeti U Murthy <preeti@...ux.vnet.ibm.com>
>   Acked-by: Peter Zijlstra <peterz@...radead.org>
>   Cc: Viresh Kumar <viresh.kumar@...aro.org>
>   Cc: Marcelo Tosatti <mtosatti@...hat.com>
>   Cc: Frederic Weisbecker <fweisbec@...il.com>
>   Cc: John Stultz <john.stultz@...aro.org>
>   Link: http://lkml.kernel.org/r/20150414203501.132820245@linutronix.de
>   Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
>   Cc: John Stultz <john.stultz@...aro.org>
>   ---
>    include/linux/hrtimer.h             |  4 ++--
>    include/linux/timekeeper_internal.h |  2 ++
>    kernel/time/hrtimer.c               |  3 ++-
>    kernel/time/timekeeping.c           | 23 ++++++++++++++++-------
>    kernel/time/timekeeping.h           |  7 ++++---
>    5 files changed, 26 insertions(+), 13 deletions(-)


So I suspect the problem is that the change to clock_was_set_seq in
timekeeping_update() is done after mirroring the time state to the
shadow-timekeeper. Thus the next time we do update_wall_time(), the
updated sequence is overwritten by what's in the shadow copy. The
attached patch, which moves the modification up, seems to avoid the
issue for me.
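
[Editorial illustration, not the attached fixup.patch.] The ordering
problem can be sketched with a toy model in plain C. All names below
(toy_timekeeper, toy_update_buggy, and so on) are hypothetical
stand-ins, and the update_wall_time() model is reduced to "advance the
shadow copy, then copy it back over the live state". It assumes only
what the message describes: a live timekeeper, a shadow copy, and a
sequence counter bumped when the clock is set.

```c
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the kernel timekeeper. */
struct toy_timekeeper {
	long long wall_ns;              /* some time state */
	unsigned int clock_was_set_seq; /* bumped when the clock is set */
};

static struct toy_timekeeper live_tk;   /* state readers look at */
static struct toy_timekeeper shadow_tk; /* working copy used by the tick */

/* Buggy ordering: mirror to the shadow first, then bump the sequence. */
static void toy_update_buggy(int clock_was_set)
{
	memcpy(&shadow_tk, &live_tk, sizeof(live_tk));
	if (clock_was_set)
		live_tk.clock_was_set_seq++; /* the shadow never sees this */
}

/* Fixed ordering: bump the sequence first, then mirror to the shadow. */
static void toy_update_fixed(int clock_was_set)
{
	if (clock_was_set)
		live_tk.clock_was_set_seq++;
	memcpy(&shadow_tk, &live_tk, sizeof(live_tk));
}

/* Toy model of the periodic update: advance the shadow, copy it back. */
static void toy_update_wall_time(long long delta_ns)
{
	shadow_tk.wall_ns += delta_ns;
	memcpy(&live_tk, &shadow_tk, sizeof(live_tk));
}

/*
 * Simulate one "clock was set" event (as on resume) followed by one
 * periodic update, and return the sequence value left in the live state.
 */
static unsigned int toy_seq_after_resume(int use_fixed_order)
{
	memset(&live_tk, 0, sizeof(live_tk));
	memset(&shadow_tk, 0, sizeof(shadow_tk));

	if (use_fixed_order)
		toy_update_fixed(1);
	else
		toy_update_buggy(1);

	toy_update_wall_time(1000000);
	return live_tk.clock_was_set_seq;
}
```

In this model, toy_seq_after_resume(0) leaves the counter at 0 because
the copy-back from the stale shadow discards the increment, while
toy_seq_after_resume(1) leaves it at 1. A consumer that compares a
cached sequence against the live one would, in the buggy case, never
notice that the offsets changed.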

Thomas: Looking at the problematic change, I'm not a big fan of it.
Caching timekeeping state here in the hrtimer code has been a source
of bugs in the past, and I'm not sure I see how avoiding copying
24 bytes is that big of a win, especially since it adds more state to
the timekeeper and hrtimer base that we have to read and manage.
Personally I'd prefer a revert to my fix.

thanks
-john

View attachment "fixup.patch" of type "text/x-patch" (675 bytes)
