linux-kernel - Re: [clocksource] 388450c708: netperf.Throughput

Message-ID: <20210514174908.GI975577@paulmck-ThinkPad-P17-Gen-1>
Date:   Fri, 14 May 2021 10:49:08 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Feng Tang <feng.tang@...el.com>
Cc:     kernel test robot <oliver.sang@...el.com>,
        0day robot <lkp@...el.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        John Stultz <john.stultz@...aro.org>,
        Stephen Boyd <sboyd@...nel.org>,
        Jonathan Corbet <corbet@....net>,
        Mark Rutland <Mark.Rutland@....com>,
        Marc Zyngier <maz@...nel.org>, Andi Kleen <ak@...ux.intel.com>,
        Xing Zhengjun <zhengjun.xing@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        ying.huang@...el.com, zhengjun.xing@...el.com, kernel-team@...com,
        neeraju@...eaurora.org
Subject: Re: [clocksource]  388450c708:  netperf.Throughput_tps -65.1%
 regression

On Fri, May 14, 2021 at 03:43:14PM +0800, Feng Tang wrote:
> Hi Paul,
> 
> On Thu, May 13, 2021 at 10:07:07AM -0700, Paul E. McKenney wrote:
> > On Thu, May 13, 2021 at 11:55:15PM +0800, kernel test robot wrote:
> > > 
> > > 
> > > Greeting,
> > > 
> > > FYI, we noticed a -65.1% regression of netperf.Throughput_tps due to commit:
> > > 
> > > 
> > > commit: 388450c7081ded73432e2b7148c1bb9a0b039963 ("[PATCH v12 clocksource 4/5] clocksource: Reduce clocksource-skew threshold for TSC")
> > > url: https://github.com/0day-ci/linux/commits/Paul-E-McKenney/Do-not-mark-clocks-unstable-due-to-delays-for-v5-13/20210501-083404
> > > base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2d036dfa5f10df9782f5278fc591d79d283c1fad
> > > 
> > > in testcase: netperf
> > > on test machine: 96 threads 2 sockets Ice Lake with 256G memory
> > > with following parameters:
> > > 
> > > 	ip: ipv4
> > > 	runtime: 300s
> > > 	nr_threads: 25%
> > > 	cluster: cs-localhost
> > > 	test: UDP_RR
> > > 	cpufreq_governor: performance
> > > 	ucode: 0xb000280
> > > 
> > > test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
> > > test-url: http://www.netperf.org/netperf/
> > > 
> > > 
> > > 
> > > If you fix the issue, kindly add following tag
> > > Reported-by: kernel test robot <oliver.sang@...el.com>
> > > 
> > > 
> > > also as Feng Tang checked, this is a "unstable clocksource" case.
> > > attached dmesg FYI.
> > 
> > Agreed, given the clock-skew event and the resulting switch to HPET,
> > performance regressions are expected behavior.
> > 
> > That dmesg output does demonstrate the value of Feng Tang's patch!
> > 
> > I don't see how to obtain the values of ->mult and ->shift that would
> > allow me to compute the delta.  So if you don't tell me otherwise, I
> > will assume that the skew itself was expected on this hardware, perhaps
> > somehow due to the tpm_tis_status warning immediately preceding the
> > clock-skew event.  If my assumption is incorrect, please let me know.
> 
> I ran the case with the debug patch applied; the output is:
> 
> [   13.796429] clocksource: timekeeping watchdog on CPU19: Marking clocksource 'tsc' as unstable because the skew is too large:
> [   13.797413] clocksource:                       'hpet' wd_nesc: 505192062 wd_now: 10657158 wd_last: fac6f97 mask: ffffffff
> [   13.797413] clocksource:                       'tsc' cs_nsec: 504008008 cs_now: 3445570292aa5 cs_last: 344551f0cad6f mask: ffffffffffffffff
> [   13.797413] clocksource:                       'tsc' is current clocksource.
> [   13.797413] tsc: Marking TSC unstable due to clocksource watchdog
> [   13.844513] clocksource: Checking clocksource tsc synchronization from CPU 50 to CPUs 0-1,12,22,32-33,60,65.
> [   13.855080] clocksource: Switched to clocksource hpet
> 
> So the delta is 1184 us (505192062 ns - 504008008 ns), and I agree with
> you that it is probably related to the tpm_tis_status warning.
> 
> But this re-triggers my old concern: are the margins calculated
> for tsc and hpet too small?

If the error really did disturb either tsc or hpet, then we really
do not have a false positive, and nothing should change (aside from
perhaps documenting that TPM issues can disturb the clocks, or better
yet treating that perturbation as a separate bug that should be fixed).
But if this is yet another way to get a confused measurement, then it
would be better to work out a way to reject the confusion and keep the
tighter margins.  I cannot think right off of a way that this could
cause measurement confusion, but you never know.

So any thoughts on exactly how the tpm_tis_status warning might have
resulted in the skew?
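
For reference, the delta quoted above can be rechecked directly from the
debug output. The following is a minimal sketch in Python, not kernel code:
the helper name skew_ns is hypothetical, and the values are copied from the
quoted log lines (the field printed as "wd_nesc" is taken to be the watchdog
interval in nanoseconds).

```python
# Values copied from the quoted clocksource watchdog log lines.
WD_NSEC = 505_192_062            # 'hpet' interval in ns ("wd_nesc" in the log)
CS_NSEC = 504_008_008            # 'tsc' interval in ns

# Raw counter readings and masks from the same log lines.
WD_NOW, WD_LAST, WD_MASK = 0x10657158, 0x0FAC6F97, 0xFFFFFFFF
CS_NOW, CS_LAST, CS_MASK = 0x3445570292AA5, 0x344551F0CAD6F, 0xFFFFFFFFFFFFFFFF


def skew_ns(cs_nsec: int, wd_nsec: int) -> int:
    """Absolute disagreement between the two measured intervals (hypothetical helper)."""
    return abs(cs_nsec - wd_nsec)


# Masked counter deltas, i.e. how many cycles each clock advanced.
wd_cycles = (WD_NOW - WD_LAST) & WD_MASK
cs_cycles = (CS_NOW - CS_LAST) & CS_MASK

delta = skew_ns(CS_NSEC, WD_NSEC)
print(f"skew: {delta} ns (~{delta // 1000} us)")
print(f"hpet advanced {wd_cycles} cycles, tsc advanced {cs_cycles} cycles")
```

The skew comes out at 1184054 ns, matching the 1184 us figure in the quoted
message.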

> With the current algorithm, the 'uncertainty_margin' is
> calculated from the frequency, and the tsc/hpet/acpi_pm
> timers run at MHz or GHz rates, which gives them a margin of
> 100 us. That works on normal systems. But in the wild, there
> could be some glitches due to immature HW components, their
> firmware or drivers, etc., just like in this case.

Isn't diagnosing issues from immature hardware, firmware, and drivers
actually a benefit?  It would after all be quite unfortunate if some issue
that was visible only due to clock skew were to escape into production.

							Thanx, Paul
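
The margin arithmetic discussed in this thread can be illustrated with a
small sketch. This is Python written for illustration only, not the kernel
implementation: the function name, the 100 us floor (taken from the figure
quoted in the thread), the example frequencies, and the idea that the
threshold is the sum of the two clocks' margins are all assumptions.

```python
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_USEC = 1_000

# Assumed floor: the 100 us per-clocksource margin mentioned in the thread.
MARGIN_FLOOR_NS = 100 * NSEC_PER_USEC


def uncertainty_margin_ns(freq_hz: int, floor_ns: int = MARGIN_FLOOR_NS) -> int:
    """Hypothetical margin: one clock period in ns, but never below the floor."""
    period_ns = NSEC_PER_SEC // freq_hz
    return max(period_ns, floor_ns)


# Example frequencies (assumptions, not taken from the log).
tsc_margin = uncertainty_margin_ns(2_700_000_000)   # GHz-class clock
hpet_margin = uncertainty_margin_ns(24_000_000)     # MHz-class clock

# Assumed threshold: the sum of both margins.
threshold_ns = tsc_margin + hpet_margin

# Skew from the quoted log: 505192062 ns - 504008008 ns.
observed_skew_ns = 505_192_062 - 504_008_008

print(f"threshold: {threshold_ns} ns, observed skew: {observed_skew_ns} ns")
print("marked unstable" if observed_skew_ns > threshold_ns else "within margin")
```

Under these assumptions any MHz- or GHz-rate clock lands on the 100 us
floor, so the 1184 us skew in the quoted log exceeds the combined threshold
by a wide margin.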