linux-kernel - Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

Message-ID: <CAJvTdK=9VCqqJUE5UQPY3h0E8gjPH-rXaTWxP8FtrWRYAkb7mg@mail.gmail.com>
Date: Tue, 14 Jan 2025 20:08:56 -0600
From: Len Brown <lenb@...nel.org>
To: Doug Smythies <dsmythies@...us.net>
Cc: Peter Zijlstra <peterz@...radead.org>, linux-kernel@...r.kernel.org, 
	vincent.guittot@...aro.org, Ingo Molnar <mingo@...nel.org>, wuyun.abel@...edance.com
Subject: Re: [REGRESSION] Re: [PATCH 00/24] Complete EEVDF

Doug,
Your attention to detail and persistence have once again found a tricky
underlying bug -- kudos!

Re: turbostat behaviour

Yes, TSC_MHz -- "the measured rate of the TSC during an interval", is
printed as a sanity check.  If there are any irregularities in it, as
you noticed, then something very strange in the hardware or software
is going wrong (and the actual turbostat results will likely not be
reliable).
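
A minimal sketch of that sanity check, assuming the measured rate is
simply the TSC delta divided by the elapsed wall-clock time.  This is
not turbostat's code; the function names, the nominal rate, and the 1%
tolerance are all hypothetical and chosen only for illustration.

```python
def measured_tsc_mhz(tsc_delta: int, elapsed_usec: float) -> float:
    """Return the TSC rate over one interval; ticks per microsecond equals MHz."""
    if elapsed_usec <= 0:
        raise ValueError("elapsed_usec must be positive")
    return tsc_delta / elapsed_usec


def tsc_looks_sane(tsc_mhz: float, nominal_mhz: float,
                   tolerance: float = 0.01) -> bool:
    """True if the measured rate is within `tolerance` of the nominal rate.

    The 1% default is an assumption for this sketch, not a turbostat value.
    """
    return abs(tsc_mhz - nominal_mhz) <= tolerance * nominal_mhz
```

An interval whose measured rate falls well outside the nominal rate
suggests that either the TSC delta or the interval timing was
disturbed, so the other columns for that interval deserve suspicion.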

Yes, the "usec" column measures how long it takes to migrate to a CPU
and collect stats there.  So if you are hunting down a glitch in
migration all you need is this column to see it.  "usec" on the
summary row is the difference between the 1st migration and after the
last -- excluding the sysfs/procfs time that is consumed on the last
CPU.  So migration delays will also be reflected there.
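
The measurement described above can be approximated from user space.
The following is a rough, Linux-only sketch, not turbostat's
implementation: `timed_migrations` and the `collect` callback are
hypothetical names, and it assumes that `os.sched_setaffinity()`
returns with the calling thread already running on the target CPU.

```python
import os
import time


def timed_migrations(collect=lambda cpu: None, cpus=None):
    """Migrate to each CPU in turn and time migrate-plus-collect.

    Returns (per_cpu_usec, summary_usec).  summary_usec spans from just
    before the first migration to just after the last collection.
    """
    original = os.sched_getaffinity(0)
    if cpus is None:
        cpus = sorted(original)
    per_cpu_usec = {}
    first_start = time.monotonic_ns()
    last_end = first_start
    try:
        for cpu in cpus:
            start = time.monotonic_ns()
            os.sched_setaffinity(0, {cpu})  # pin the calling thread to `cpu`
            collect(cpu)                    # placeholder for reading stats
            last_end = time.monotonic_ns()
            per_cpu_usec[cpu] = (last_end - start) / 1000
    finally:
        os.sched_setaffinity(0, original)   # restore the original mask
    return per_cpu_usec, (last_end - first_start) / 1000
```

A single CPU with a "usec" value far above its neighbours points at a
delay in migrating to (or collecting on) that CPU, and the same delay
shows up in the summary value.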

Note: we have a patch queued which changes the "usec" on the Summary
row to *include* the sysfs/procfs time on the last CPU.  (The per-cpu
"usec" values are unchanged.)  This is because we've noticed some
really weird delays in doing things like reading /proc/interrupts and
we want to be able to easily do A/B comparisons by simply including or
excluding counters.

Also FYI, the scheme of migrating to each CPU so that collecting stats
there will be "local" isn't scaling so well on very large systems, and
I'm about to take a close look at it.  In yogini we used a different
scheme, where a thread is bound to each CPU, so they can collect in
parallel; and we may be moving to something like that.
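
For illustration only, a thread-per-CPU scheme might look like the
sketch below.  It is not yogini's or turbostat's code:
`collect_in_parallel` and `collect` are hypothetical names, it is
Linux-only, and it relies on `os.sched_setaffinity(0, ...)` applying to
the calling thread.  Python's global interpreter lock also limits real
parallelism here, so this shows the structure rather than the speedup.

```python
import os
import threading


def collect_in_parallel(collect, cpus=None):
    """Run collect(cpu) on one thread bound to each CPU; return {cpu: result}."""
    if cpus is None:
        cpus = sorted(os.sched_getaffinity(0))
    results = {}
    lock = threading.Lock()

    def worker(cpu):
        # pid 0 means "the calling thread" on Linux, so only this
        # worker is pinned; the main thread keeps its own mask.
        os.sched_setaffinity(0, {cpu})
        value = collect(cpu)
        with lock:
            results[cpu] = value

    threads = [threading.Thread(target=worker, args=(cpu,)) for cpu in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each worker is already bound to its CPU, no thread has to
migrate between CPUs during collection, which is the step that scales
poorly in the serial scheme.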

cheers,
Len Brown, Intel Open Source Technology Center