lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20241113223912.GD2331600@google.com>
Date: Wed, 13 Nov 2024 22:39:12 +0000
From: Joel Fernandes <joel@...lfernandes.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: linux-kernel@...r.kernel.org,
	Anna-Maria Behnsen <anna-maria@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC 3/3] tick-sched: Replace jiffie readout with idle_entrytime

On Mon, Nov 11, 2024 at 11:25:15PM +0100, Frederic Weisbecker wrote:
> Le Fri, Nov 08, 2024 at 05:48:36PM +0000, Joel Fernandes (Google) a écrit :
> > This solves the issue where jiffies can be stale and inaccurate.
> > 
> > Putting some prints, I see that basemono can be quite stale:
> > tick_nohz_next_event: basemono=18692000000 basemono_from_idle_entrytime=18695000000
> 
> That's 3 ms. If HZ < 1000 that's to be expected. But even with HZ = 1000 it can
> happen.

For me HZ=1000 though.

> > Since we have 'now' in ts->idle_entrytime, we can just use that. It is
> > more accurate, cleaner, reduces lines of code and reduces any lock
> > contention with the seq locks.
> 
> Do we need such accuracy? The timers rely on jiffies anyway.
> Also it's a seqcount read. Basically just a pair of smp_rmb().
> Not sure a division would be cheaper.

In low resolution mode, hrtimers are also expired by the tick. It seems to
me, because of the error in jiffies, the clockevent programming for the tick
will also have errors which reflect in hrtimer expiry (granted if one does
not want errors in hrtimer, they shouldn't be using low-res mode, but
still a bounded error is better than an unpredictable unbounded one..).

I agree on the division issue though, I couldn't find an alternative.

Expanding on the previous paragraph, it seems to me that we will end up
programming the clockevent with incorrect values that are subject to the
error. Like if basemono is lagging by several ticks, then 'expires' value in
clock event may also be lagging.  Then after the clock event is programmed
with the erroneous value, the clockevent will wake up the CPU too soon I
think causing power wastage. If the jifies was accurate, the clockevent would
wake up the system more into the future..

Or do you see this scenario not playing out?

> > I was also concerned about issue where jiffies is not updated for a long
> > time, and then we receive a non-tick interrupt in the future. Relying on
> > stale jiffies value and using that as base can be inaccurate to determine
> > whether next event occurs within next tick. Fix that.
> > 
> > XXX: Need to fix issue in idle accounting which does 'jiffies -
> > idle_entrytime'. If idle_entrytime is more current than jiffies, it
> > could cause negative values. I could replace jiffies with idle_exittime
> > in this computation potentially to fix that.
> > 
> > Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> > ---
> >  kernel/time/tick-sched.c | 27 +++++++--------------------
> >  1 file changed, 7 insertions(+), 20 deletions(-)
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 4aa64266f2b0..22a4f96d9585 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -860,24 +860,6 @@ static inline bool local_timer_softirq_pending(void)
> >  	return local_softirq_pending() & BIT(TIMER_SOFTIRQ);
> >  }
> >  
> > -/*
> > - * Read jiffies and the time when jiffies were updated last
> > - */
> > -u64 get_jiffies_update(unsigned long *basej)
> > -{
> > -	unsigned long basejiff;
> > -	unsigned int seq;
> > -	u64 basemono;
> > -
> > -	do {
> > -		seq = read_seqcount_begin(&jiffies_seq);
> > -		basemono = last_jiffies_update;
> > -		basejiff = jiffies;
> > -	} while (read_seqcount_retry(&jiffies_seq, seq));
> > -	*basej = basejiff;
> > -	return basemono;
> > -}
> 
> This is used in tmigr as well.

Yeah, sorry I had fixed the issue in my next revision when I adjusted the
timer migration code:

https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/commit/?h=timers/tick-sched&id=1cf33bddf50341cf9802ed19374dc42d8466868b

I am on work travel this week, apologies for slow replies and thanks so much
for your replies and your discussions!

thanks,

 - Joel


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ