lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <DDD67E2B-F721-49AF-8889-8AD57ACDA190@akamai.com>
Date:   Sat, 19 Oct 2019 00:44:42 +0000
From:   "Zhivich, Michael" <mzhivich@...mai.com>
To:     Thomas Gleixner <tglx@...utronix.de>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>, "hpa@...or.com" <hpa@...or.com>,
        "rafael.j.wysocki@...el.com" <rafael.j.wysocki@...el.com>
Subject: Re: [PATCH] clocksource: tsc: respect tsc bootparam for
 clocksource_tsc_early

On 10/18/19, 2:41 PM, "Thomas Gleixner" <tglx@...utronix.de> wrote:

> On Wed, 2 Oct 2019, Michael Zhivich wrote:
> > Introduction of clocksource_tsc_early broke functionality of "tsc=reliable"
> > and "tsc=nowatchdog" boot params, since clocksource_tsc_early is *always*
> > registered with CLOCK_SOURCE_MUST_VERIFY and thus put on the watchdog list.
> > 
> > If CPU is very busy during boot, the watchdog clocksource may not be
> > read frequently enough, resulting in a large difference that is treated as
> > "negative" by clocksource_delta() and incorrectly truncated to 0.
> 
> What? 
> 
> >   clocksource: timekeeping watchdog on CPU1: Marking clocksource
> >                'tsc-early' as unstable because the skew is too large:
> >   clocksource: 'refined-jiffies' wd_now: fffb7019 wd_last: fffb6e28
> 
>     0xfffb7019 - 0xfffb6e28 = 497
> 
> What's treated negative there? And what would truncate that to 0?
> 
> >                 mask: ffffffff
> 
> A 'negative delta' value can only happen when the clocksource is not read
> before it advanced more than mask/2. For jiffies this means 2^31
> ticks. That would be ~248 days for HZ=100 or ~24 days for HZ=1000.
> 
> >   clocksource: 'tsc-early' cs_now: 52c3918b89d6 cs_last: 52c31d736d2e
> 
>   0x52c3918b89d6 - 0x52c31d736d2e = 1.94774e+09
> 
> Again nothing is treated negative here, but the point is that the TSC
> advanced by ~2e9 cycles while jiffies advanced by 497.
> 
> How that's related, I can't tell because I don't know the TSC frequency of
> your machine. HZ is probably 1000 as the watchdog period is HZ/2.
> 
> >                 mask: ffffffffffffffff
> >   tsc: Marking TSC unstable due to clocksource watchdog
> 
> Even if the watchdog is not read out for a quite some time, the math in
> there and in clocksource_delta() can handle pretty large deltas.
> 
> The watchdog triggers when
> 
>     abs(delta_watchdog - delta_tsc) > 0.0625 seconds
> 
> So that has absolutely nothing to do with CPU being busy and watchdog not
> being serviced. jiffies and TSC drift apart for some reason, and that
> reason wants to be documented in the changelog.
> 
> That said, I have no objections against the patch itself, but I'm not going
> to apply a patch with a fairy tale changelog.
> 
> Thanks,
> 
> 	tglx
>     

Thanks for taking a look.  I agree that the commit message explanation is bogus; I had
incorrectly assumed that the issue was similar to what I've seen with acpi_pm based
watchdog before and didn't check the numbers carefully.

I'll rework the commit message and resubmit.

Thanks,
~ Michael

Download attachment "smime.p7s" of type "application/pkcs7-signature" (3491 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ