lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 24 May 2017 09:16:28 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Frederic Weisbecker <fweisbec@...il.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Rik van Riel <riel@...hat.com>,
        James Hartsock <hartsjc@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>, stable@...r.kernel.org,
        Tim Wright <tim@...bash.co.uk>, Pavel Machek <pavel@....cz>
Subject: Re: [PATCH 0/2] nohz: Deal with clock reprogram skipping issues v3


* Frederic Weisbecker <fweisbec@...il.com> wrote:

> On Tue, May 23, 2017 at 09:25:08AM +0200, Ingo Molnar wrote:
> > 
> > * Frederic Weisbecker <fweisbec@...il.com> wrote:
> > 
> > > v2 had issues on -tip tree and triggered a warning. It seems to have
> > > disappeared. Perhaps it was due to another timer issue. Anyway this
> > > version brings more debugging informations, with a layout that is more
> > > bisection-friendly and it also handles ticks that fire outside IRQ
> > > context and thus carry NULL irq regs. This happen when
> > > hrtimer_interrupt() is called on hotplug cpu down for example.
> > > 
> > > We'll see if the issue arises again.
> > > 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > > 	nohz/fixes
> > > 
> > > HEAD: cd15f46b284f04dbedd065a9d99a4e0badae379a
> > > 
> > > Thanks,
> > > 	Frederic
> > > ---
> > > 
> > > Frederic Weisbecker (2):
> > >       nohz: Add hrtimer sanity check
> > >       nohz: Fix collision between tick and other hrtimers, again
> > > 
> > > 
> > >  kernel/time/tick-sched.c | 48 +++++++++++++++++++++++++++++++++++++++++++-----
> > >  kernel/time/tick-sched.h |  2 ++
> > >  2 files changed, 45 insertions(+), 5 deletions(-)
> > 
> > So I think the 3 commits queued up right now:
> > 
> >  99fa871820cf: nohz: Reset next_tick cache even when the timer has no regs
> >  411fe24e6b7c: nohz: Fix collision between tick and other hrtimers, again
> >  ce6cf9a15d62: nohz: Add hrtimer sanity check
> > 
> > are OK and I'd not rebase them unless there's some breakage.
> > 
> > One thing I noticed: your second series does appear to have:
> > 
> >  99fa871820cf: nohz: Reset next_tick cache even when the timer has no regs
> > 
> > is that intentional? That is pretty much the only commit I'd love to rebase with a 
> > proper description added.
> 
> Yes in my latest series I melted "nohz: Reset next_tick cache even when the timer has no regs"
> into "nohz: Fix collision between tick and other hrtimers, again" because it's a fixup and
> keeping that patch separate may break bisection.
> 
> So ideally, it would be nice if you could fixup 411fe24e6b7c with 99fa871820cf. That's roughly
> all I did in my latest series.

So the interdiff between your two patches and the 3 commits already queued up is:

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index e3043873fcdc..30253ed0380b 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -150,12 +150,6 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 		touch_softlockup_watchdog_sched();
 		if (is_idle_task(current))
 			ts->idle_jiffies++;
-		/*
-		 * In case the current tick fired too early past its expected
-		 * expiration, make sure we don't bypass the next clock reprogramming
-		 * to the same deadline.
-		 */
-		ts->next_tick = 0;
 	}
 #endif
 	update_process_times(user_mode(regs));
@@ -1103,8 +1097,15 @@ static void tick_nohz_handler(struct clock_event_device *dev)
 	tick_sched_handle(ts, regs);
 
 	/* No need to reprogram if we are running tickless  */
-	if (unlikely(ts->tick_stopped))
+	if (unlikely(ts->tick_stopped)) {
+		/*
+		 * In case the current tick fired too early past its expected
+		 * expiration, make sure we don't bypass the next clock reprogramming
+		 * to the same deadline.
+		 */
+		ts->next_tick = 0;
 		return;
+	}
 
 	hrtimer_forward(&ts->sched_timer, now, tick_period);
 	tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);
@@ -1202,12 +1203,17 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
 	 */
 	if (regs)
 		tick_sched_handle(ts, regs);
-	else
-		ts->next_tick = 0;
 
 	/* No need to reprogram if we are in idle or full dynticks mode */
-	if (unlikely(ts->tick_stopped))
+	if (unlikely(ts->tick_stopped)) {
+		/*
+		 * In case the current tick fired too early past its expected
+		 * expiration, make sure we don't bypass the next clock reprogramming
+		 * to the same deadline.
+		 */
+		ts->next_tick = 0;
 		return HRTIMER_NORESTART;
+	}
 
 	hrtimer_forward(timer, now, tick_period);
 

... so the two are not the same - I'd rather not rebase it, I'd like to keep what 
is working, we had problems with these changes before ...

If you'd like the changes in this interdiff to be applied as well, please add a 
changelog to it and post it as a fourth patch.

Thanks,

	Ingo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ