Message-ID: <20260122144045.38254A3e-hca@linux.ibm.com>
Date: Thu, 22 Jan 2026 15:40:45 +0100
From: Heiko Carstens <hca@...ux.ibm.com>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>,
        "Christophe Leroy (CS GROUP)" <chleroy@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Anna-Maria Behnsen <anna-maria@...utronix.de>,
        Ben Segall <bsegall@...gle.com>, Boqun Feng <boqun.feng@...il.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...hat.com>, Jan Kiszka <jan.kiszka@...mens.com>,
        Joel Fernandes <joelagnelf@...dia.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Kieran Bingham <kbingham@...nel.org>,
        Madhavan Srinivasan <maddy@...ux.ibm.com>,
        Mel Gorman <mgorman@...e.de>, Michael Ellerman <mpe@...erman.id.au>,
        Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
        Nicholas Piggin <npiggin@...il.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Sven Schnelle <svens@...ux.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Uladzislau Rezki <urezki@...il.com>,
        Valentin Schneider <vschneid@...hat.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Xin Zhao <jackzxcui1989@....com>, linux-pm@...r.kernel.org,
        linux-s390@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 05/15] s390/time: Prepare to stop elapsing in
 dynticks-idle

On Wed, Jan 21, 2026 at 07:04:35PM +0100, Frederic Weisbecker wrote:
> BTW here is a question for you: does the timer (as in get_cpu_timer()) still
> decrement while in idle? I would assume not, given how lc->system_timer
> is updated in account_idle_time_irq().

It is not decremented while in idle (or when the hypervisor schedules
the virtual cpu away). We calculate steal time from exactly this: the
cpu timer is not decremented while the virtual cpu is not running,
whereas the real time-of-day clock keeps advancing.
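
To illustrate the idea, here is a minimal sketch. It is not the actual s390
accounting code: the struct and function names are made up, both clocks are
assumed to use the same unit, and the cpu timer is assumed not to be
reloaded and the cpu not to go idle between the two samples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of both clocks, taken at the same instant. */
struct vcpu_sample {
	uint64_t tod;       /* time-of-day clock: always advances */
	uint64_t cpu_timer; /* cpu timer: counts down only while the vcpu runs */
};

/*
 * Steal time between two samples: wall-clock time that elapsed minus
 * the time the virtual cpu was actually running.
 */
static uint64_t steal_between(struct vcpu_sample start, struct vcpu_sample end)
{
	uint64_t wall, ran;

	/* Sketch assumption: the timer was not reloaded in between. */
	assert(start.cpu_timer >= end.cpu_timer);

	wall = end.tod - start.tod;
	ran = start.cpu_timer - end.cpu_timer;

	return wall > ran ? wall - ran : 0;
}
```

With 600 units of wall-clock time and a cpu timer that went down by 400,
this yields 200 units of steal time.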

> And another question in this same function is this :
> 
>     lc->steal_timer += idle->clock_idle_enter - lc->last_update_clock;
> 
> clock_idle_enter is updated right before halting the CPU. But when was
> last_update_clock updated last? Could be either task switch to idle, or
> a previous idle tick interrupt or a previous idle IRQ entry. In any case
> I'm not sure the difference is meaningful as steal time.
> 
> I must be missing something.

"It has been like that forever" :) However I do agree that this doesn't seem
to make any sense. At least with the current implementation I cannot see how
it could, since what gets added is the difference of two time stamps, neither
of which includes any steal time.

Maybe it was broken by one of the many changes over the years, or it was
always wrong, or I am missing something too.

Will investigate and address it if required. Thank you for bringing this up!

> > Not sure what to do with this. I thought about removing those sysfs files
> > already in the past, since they are of very limited use; and most likely
> > nothing in user space would miss them.
> 
> Perhaps but this file is a good comparison point against /proc/stat because
> s390 vtime is much closer to measuring the actual CPU halted time than what
> the generic nohz accounting does (which includes more idle code execution).

Yes, while comparing those files I also see an unexpected difference of
several seconds after two days of uptime; that is before your changes.

In theory the sum of idle and iowait in /proc/stat should be the same as the
value in the per-cpu idle_time_us sysfs file. But there is a difference, which
shouldn't be there as far as I can tell. Yet another thing to look into.
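
For reference, such a comparison could be sketched as below. This is only an
illustration: stat_idle_us() is a made-up helper that takes one per-cpu line
from /proc/stat and the USER_HZ value (as returned by sysconf(_SC_CLK_TCK))
and returns idle + iowait in microseconds, which can then be compared against
the number read from the sysfs file.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Return idle + iowait of a per-cpu "cpuN ..." line from /proc/stat,
 * converted from USER_HZ ticks to microseconds, or -1 if the line is
 * not a per-cpu line or cannot be parsed.
 */
static int64_t stat_idle_us(const char *line, long ticks_per_sec)
{
	unsigned long long user, nice, sys, idle, iowait;

	assert(line != NULL);

	if (ticks_per_sec <= 0)
		return -1;
	/* Reject the aggregate "cpu " line and anything unrelated. */
	if (strncmp(line, "cpu", 3) != 0 || !isdigit((unsigned char)line[3]))
		return -1;
	/* Field order: user nice system idle iowait ... */
	if (sscanf(line, "cpu%*d %llu %llu %llu %llu %llu",
		   &user, &nice, &sys, &idle, &iowait) != 5)
		return -1;

	return (int64_t)((idle + iowait) * 1000000ULL /
			 (unsigned long long)ticks_per_sec);
}
```

Note that /proc/stat only has USER_HZ granularity, so small differences to
the microsecond value from sysfs are expected; a difference of several
seconds is not.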

> > Guess I need to spend some more time on accounting and see what it would take
> > to convert to VIRT_CPU_ACCOUNTING_GEN, while keeping the current precision and
> > functionality.
> 
> I would expect more overhead with VIRT_CPU_ACCOUNTING_GEN, though that has yet
> to be measured. In any case you'll lose some idle cputime precision (but
> you need to read that through s390 sysfs files) if what we want to measure
> here is the actual halted time.
> 
> Perhaps we could enhance VIRT_CPU_ACCOUNTING_GEN and nohz idle cputime
> accounting to match s390 precision. Though I expect some cost
> accessing the clock inevitably more often on some machines.

Let me experiment with that, but first I want to understand the oddities
pointed out above.
