linux-kernel - Re: [RESEND PATCH v6] x86/hpet: Reduce HPET counter read contention

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <57CEE4A8.4050907@hpe.com>
Date:   Tue, 6 Sep 2016 11:45:44 -0400
From:   Waiman Long <waiman.long@....com>
To:     Waiman Long <Waiman.Long@....com>
CC:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, <linux-kernel@...r.kernel.org>,
        <x86@...nel.org>, Borislav Petkov <bp@...e.de>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Scott J Norton <scott.norton@....com>,
        Douglas Hatch <doug.hatch@....com>,
        Randy Wright <rwright@....com>
Subject: Re: [RESEND PATCH v6] x86/hpet: Reduce HPET counter read contention

On 09/06/2016 11:27 AM, Waiman Long wrote:
> On a large system with many CPUs, using HPET as the clock source can
> have a significant impact on the overall system performance because
> of the following reasons:
>   1) There is a single HPET counter shared by all the CPUs.
>   2) HPET counter reading is a very slow operation.
>
> Using HPET as the default clock source may happen when, for example,
> the TSC clock calibration exceeds the allowable tolerance. Something
> the performance slowdown can be so severe that the system may crash
> because of a NMI watchdog soft lockup, for example.
>
> During the TSC clock calibration process, the default clock source
> will be set temporarily to HPET. For systems with many CPUs, it is
> possible that NMI watchdog soft lockup may occur occasionally during
> that short time period where HPET clocking is active as is shown in
> the kernel log below:
>
> [   71.618132] NetLabel: Initializing
> [   71.621967] NetLabel:  domain hash size = 128
> [   71.626848] NetLabel:  protocols = UNLABELED CIPSOv4
> [   71.632418] NetLabel:  unlabeled traffic allowed by default
> [   71.638679] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0
> [   71.646504] hpet0: 8 comparators, 64-bit 14.318180 MHz counter
> [   71.655313] Switching to clocksource hpet
> [   95.679135] BUG: soft lockup - CPU#144 stuck for 23s! [swapper/144:0]
> [   95.693363] BUG: soft lockup - CPU#145 stuck for 23s! [swapper/145:0]
> [   95.694203] Modules linked in:
> [   95.694697] CPU: 145 PID: 0 Comm: swapper/145 Not tainted 3.10.0-327.el7.x86_64 #1
> [   95.695580] BUG: soft lockup - CPU#582 stuck for 23s! [swapper/582:0]
> [   95.696145] Hardware name: HP Superdome2 16s x86, BIOS Bundle: 008.001.006 SFW: 041.063.152 01/16/2016
> [   95.698128] BUG: soft lockup - CPU#357 stuck for 23s! [swapper/357:0]
>
> This patch attempts to address the above issues by reducing HPET read
> contention using the fact that if more than one CPUs are trying to
> access HPET at the same time, it will be more efficient when only
> one CPU in the group reads the HPET counter and shares it with the
> rest of the group instead of each group member trying to read the
> HPET counter individually.
>
> This is done by using a combination word with a sequence number and
> a bit lock. The CPU that gets the bit lock will be responsible for
> reading the HPET counter and update the sequence number. The others
> will monitor the change in sequence number and grab the HPET counter
> value accordingly. This change is only enabled on SMP configuration.
>
> On a 4-socket Haswell-EX box with 144 threads (HT on), running the
> AIM7 compute workload (1500 users) on a 4.8-rc1 kernel (HZ=1000)
> with and without the patch has the following performance numbers
> (with HPET or TSC as clock source):
>
> TSC		= 1042431 jobs/min
> HPET w/o patch	=  798068 jobs/min
> HPET with patch	= 1029445 jobs/min
>
> The perf profile showed a reduction of the %CPU time consumed by
> read_hpet from 11.19% without patch to 1.24% with patch.
>
> Signed-off-by: Waiman Long<Waiman.Long@....com>

Will this patch be good enough to get merged for the next kernel 
version, probably 4.9? We have problem with our latest large SMP systems 
because of this issue. Even some new 4-socket systems had explicited 
this problem as noted by Prarit. We really need to get this fix upstream 
to have it merged into the distros.

Cheers,
Longman