Message-ID: <871pzgo77j.ffs@tglx>
Date: Tue, 12 Nov 2024 15:30:24 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: "Joel Fernandes (Google)" <joel@...lfernandes.org>,
 linux-kernel@...r.kernel.org, Anna-Maria Behnsen
 <anna-maria@...utronix.de>, Frederic Weisbecker <frederic@...nel.org>,
 Ingo Molnar <mingo@...nel.org>
Cc: "Joel Fernandes (Google)" <joel@...lfernandes.org>
Subject: Re: [RFC 3/3] tick-sched: Replace jiffie readout with idle_entrytime

On Fri, Nov 08 2024 at 17:48, Joel Fernandes wrote:
> This solves the issue where jiffies can be stale and inaccurate.

Which issue?

> Putting some prints, I see that basemono can be quite stale:
> tick_nohz_next_event: basemono=18692000000 basemono_from_idle_entrytime=18695000000

What is your definition of stale? 3ms on a system with HZ < 1000 is
completely correct and within the margin of the next tick, no?
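To make that arithmetic concrete, here is a minimal userspace sketch, not kernel code. It assumes HZ=250 purely as an example (the only thing stated here is HZ < 1000), uses a simplified tick period of NSEC_PER_SEC / HZ, and the helper name within_one_tick() is hypothetical. The two timestamps are the ones from the quoted print.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Simplified tick period in nanoseconds for a given HZ (example values only). */
static uint64_t tick_nsec(unsigned int hz)
{
	assert(hz > 0);
	return NSEC_PER_SEC / hz;
}

/* Hypothetical helper: is 'now' less than one tick period past 'base'? */
static bool within_one_tick(uint64_t base, uint64_t now, unsigned int hz)
{
	return now >= base && now - base < tick_nsec(hz);
}

/* The values from the quoted print, checked against an assumed HZ=250. */
static bool quoted_print_is_within_tick(void)
{
	return within_one_tick(18692000000ULL, 18695000000ULL, 250);
}
```

With HZ=250 the tick period is 4ms, so a 3ms difference means the next jiffies update is simply not due yet.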

> Since we have 'now' in ts->idle_entrytime, we can just use that. It is
> more accurate, cleaner, reduces lines of code and reduces any lock
> contention with the seq locks.

What's more accurate and what is the actual problem you are trying to
solve? This handwaving about cleaner, fewer lines of code and contention
on a nonexistent lock is just not helpful.

> I was also concerned about issue where jiffies is not updated for a long
> time, and then we receive a non-tick interrupt in the future. Relying on
> stale jiffies value and using that as base can be inaccurate to determine
> whether next event occurs within next tick. Fix that.

I'm failing to decode this word salad.

> XXX: Need to fix issue in idle accounting which does 'jiffies -
> idle_entrytime'. If idle_entrytime is more current than jiffies, it
> could cause negative values. I could replace jiffies with idle_exittime
> in this computation potentially to fix that.

So you "fix" some yet to be correctly described issue by breaking stuff?
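The breakage described in the XXX note is plain unsigned wraparound: subtracting a newer timestamp from an older one does not yield a negative number, it yields a huge positive one. A minimal userspace sketch, with hypothetical helper names and made-up values; the clamped variant is only an illustration, not a fix proposed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Naive unsigned elapsed time: wraps around when 'entry' is newer than 'now'. */
static uint64_t elapsed_naive(uint64_t now, uint64_t entry)
{
	return now - entry;
}

/* Illustrative guard: report zero instead of a wrapped value. */
static uint64_t elapsed_clamped(uint64_t now, uint64_t entry)
{
	return now >= entry ? now - entry : 0;
}

/* Self-check with made-up tick counts: 'entry' is 3 ticks ahead of 'now'. */
static void elapsed_example(void)
{
	assert(elapsed_naive(100, 103) == UINT64_MAX - 2);
	assert(elapsed_clamped(100, 103) == 0);
}
```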

>  static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
>  {
> -	u64 basemono, next_tick, delta, expires, delta_hr, next_hr_wo;
> +	u64 basemono, next_tick, delta, expires, delta_hr, next_hr_wo, boot_ticks;
>  	unsigned long basejiff;
>  	int tick_cpu;
>  
> -	basemono = get_jiffies_update(&basejiff);
> +	boot_ticks = DIV_ROUND_DOWN_ULL(ts->idle_entrytime, TICK_NSEC);

Again this div/mult is more expensive than the sequence count on 32bit.
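For readers who want to see the operation in question: rounding a nanosecond timestamp down to a tick boundary takes a 64-bit division followed by a 64-bit multiplication, and 32-bit CPUs without a native 64-bit divide typically do that division in software. The sketch below is userspace C with hypothetical names. It assumes the patch multiplies boot_ticks back by TICK_NSEC, which is not visible in the quoted hunk, and it passes the tick period as a parameter with example values.

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole ticks contained in a nanosecond timestamp (64-bit divide). */
static uint64_t ns_to_ticks(uint64_t ns, uint64_t tick_ns)
{
	assert(tick_ns > 0);
	return ns / tick_ns;
}

/* Round a nanosecond timestamp down to the previous tick boundary (div + mult). */
static uint64_t round_down_to_tick(uint64_t ns, uint64_t tick_ns)
{
	return ns_to_ticks(ns, tick_ns) * tick_ns;
}
```

The sequence-count read that the patch removes involves no division at all, which is the cost comparison being made here.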

> -/*
> - * Read jiffies and the time when jiffies were updated last
> - */
> -u64 get_jiffies_update(unsigned long *basej)

How does this even compile? This function is global for a reason.

Thanks,

        tglx
