Message-ID: <119b669e-aafb-4d73-e94e-ef119f909cfa@huawei.com>
Date: Fri, 30 Sep 2022 17:45:29 +0800
From: Yu Liao <liaoyu15@...wei.com>
To: Feng Tang <feng.tang@...el.com>,
Xiongfeng Wang <wangxiongfeng2@...wei.com>
CC: Zhang Rui <rui.zhang@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Bjorn Helgaas <helgaas@...nel.org>,
Ingo Molnar <mingo@...hat.com>,
"Borislav Petkov" <bp@...en8.de>, <x86@...nel.org>,
<linux-kernel@...r.kernel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Kai-Heng Feng <kai.heng.feng@...onical.com>,
<len.brown@...el.com>, Xie XiuQi <xiexiuqi@...wei.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>
Subject: Re: [PATCH] x86/PCI: Convert force_disable_hpet() to standard quirk
On 2022/9/30 9:15, Feng Tang wrote:
> On Fri, Sep 30, 2022 at 09:05:24AM +0800, Xiongfeng Wang wrote:
>>
>>
>> On 2022/9/30 8:38, Feng Tang wrote:
>>> On Thu, Sep 29, 2022 at 11:52:28PM +0800, Yu Liao wrote:
>>>> On 2020/12/2 15:28, Zhang Rui wrote:
>>>>> On Mon, 2020-11-30 at 20:21 +0100, Thomas Gleixner wrote:
>>>>>> Feng,
>>>>>>
>>>>>> On Fri, Nov 27 2020 at 14:11, Feng Tang wrote:
>>>>>>> On Fri, Nov 27, 2020 at 12:27:34AM +0100, Thomas Gleixner wrote:
>>>>>>>> On Thu, Nov 26 2020 at 09:24, Feng Tang wrote:
>>>>>>>> Yes, that can happen. But OTOH, we should start to think about the
>>>>>>>> requirements for using the TSC watchdog.
>>>>>
>>>>> My original proposal is to disable jiffies and refined-jiffies as the
>>>>> clocksource watchdog, because they are not reliable and it's better to
>>>>> use clocksource that has a hardware counter as watchdog, like the patch
>>>>> below, which I didn't send upstream.
>>>>>
>>>>> From cf9ce0ecab8851a3745edcad92e072022af3dbd9 Mon Sep 17 00:00:00 2001
>>>>> From: Zhang Rui <rui.zhang@...el.com>
>>>>> Date: Fri, 19 Jun 2020 22:03:23 +0800
>>>>> Subject: [RFC PATCH] time/clocksource: do not use refined-jiffies as watchdog
>>>>>
>>>>> On IA platforms, if HPET is disabled, either via x86 early-quirks or
>>>>> via the kernel command line, refined-jiffies will be used as the clocksource
>>>>> watchdog in the early boot phase, before the acpi_pm timer is registered.
>>>>>
>>>>> This is not a problem if jiffies are accurate.
>>>>> But in some cases, for example, when serial console is enabled, it may
>>>>> take several milliseconds to write to the console, with irq disabled,
>>>>> frequently. Thus many ticks may become longer than they should be.
>>>>>
>>>>> Using refined-jiffies as watchdog in this case breaks the system because
>>>>> a) duration calculated by refined-jiffies watchdog is always consistent
>>>>> with the watchdog timeout issued using add_timer(), say, around 500ms.
>>>>> b) duration calculated by the running clocksource, usually TSC on IA
>>>>> platforms, reflects the real time cost, which may be much larger.
>>>>> This results in the running clocksource being disabled erroneously.
>>>>>
>>>>> This is reproduced on ICL because HPET is disabled in x86 early-quirks,
>>>>> and also reproduced on a KBL and a WHL platform when HPET is disabled
>>>>> via command line.
>>>>>
>>>>> BTW, commit fd329f276eca
>>>>> ("x86/mtrr: Skip cache flushes on CPUs with cache self-snooping") is
>>>>> another example that refined-jiffies causes the same problem when ticks
>>>>> become slow for some other reason.
>>>>
>>>> Hi Zhang Rui, we have hit the same problem you describe above. I have
>>>> tested the following modification, and it solves the problem. Do you plan
>>>> to push it upstream?
>>>
>>> Hi Liao Yu,
>>>
>>> Could you provide more details? For example, what ARCH is the platform (x86
>>> or others), client or server, and if server, how many sockets (2S/4S/8S)?
>>>
>>> The error kernel log will also be helpful.
>>
>> Hi, Feng Tang,
>>
>> It's an x86 server. lscpu prints the following information:
>>
>> Architecture: x86_64
>> CPU op-mode(s): 32-bit, 64-bit
>> Byte Order: Little Endian
>> Address sizes: 46 bits physical, 48 bits virtual
>> CPU(s): 224
>> On-line CPU(s) list: 0-223
>> Thread(s) per core: 2
>> Core(s) per socket: 28
>> Socket(s): 4
>> NUMA node(s): 4
>> Vendor ID: GenuineIntel
>> CPU family: 6
>> Model: 85
>> Model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
>> Stepping: 4
>> CPU MHz: 3199.379
>> CPU max MHz: 3800.0000
>> CPU min MHz: 1000.0000
>> BogoMIPS: 5000.00
>> Virtualization: VT-x
>> L1d cache: 3.5 MiB
>> L1i cache: 3.5 MiB
>> L2 cache: 112 MiB
>> L3 cache: 154 MiB
>> NUMA node0 CPU(s): 0-27,112-139
>> NUMA node1 CPU(s): 28-55,140-167
>> NUMA node2 CPU(s): 56-83,168-195
>> NUMA node3 CPU(s): 84-111,196-223
>>
>> Part of the kernel log is as follows.
>>
>> [ 1.144402] smp: Brought up 4 nodes, 224 CPUs
>> [ 1.144402] smpboot: Max logical packages: 4
>> [ 1.144402] smpboot: Total of 224 processors activated (1121097.93 BogoMIPS)
>> [ 1.520003] clocksource: timekeeping watchdog on CPU2: Marking clocksource 'tsc-early' as unstable because the skew is too large:
>> [ 1.520010] clocksource: 'refined-jiffies' wd_now: fffb7210 wd_last: fffb7018 mask: ffffffff
>> [ 1.520013] clocksource: 'tsc-early' cs_now: 6606717afddd0 cs_last: 66065eff88ad4 mask: ffffffffffffffff
>> [ 1.520015] tsc: Marking TSC unstable due to clocksource watchdog
>> [ 5.164635] node 0 initialised, 98233092 pages in 4013ms
>> [ 5.209294] node 3 initialised, 98923232 pages in 4057ms
>> [ 5.220001] node 2 initialised, 99054870 pages in 4068ms
>> [ 5.222282] node 1 initialised, 99054870 pages in 4070ms
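For reference, the two intervals the watchdog compared can be recovered from the wd_now/wd_last and cs_now/cs_last values in the log above. The following is only a rough sketch: the helper names are made up for illustration, and the tick rate (HZ=1000) and TSC frequency (the nominal 2.50GHz from the lscpu model name) are assumptions, since neither appears in the log.

```c
#include <stdint.h>

/* Assumed values, NOT taken from the log. */
#define ASSUMED_HZ      1000ULL        /* hypothetical CONFIG_HZ */
#define ASSUMED_TSC_HZ  2500000000ULL  /* nominal 2.50GHz from the lscpu model name */

/* Difference between two counter reads, wrapped to the counter's mask. */
static uint64_t counter_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	return (now - last) & mask;
}

/*
 * Convert a tick/cycle delta to nanoseconds at freq_hz.
 * Whole seconds and the remainder are handled separately so the
 * intermediate product stays well inside 64 bits.
 */
static uint64_t delta_to_ns(uint64_t delta, uint64_t freq_hz)
{
	uint64_t secs = delta / freq_hz;
	uint64_t rem = delta % freq_hz;

	return secs * 1000000000ULL + rem * 1000000000ULL / freq_hz;
}
```

With the logged values, counter_delta(0xfffb7210, 0xfffb7018, 0xffffffff) gives 504 ticks for refined-jiffies, and the tsc-early delta is 0x127b752fc cycles. Under the assumed frequencies that is about 0.5 s by the watchdog versus nearly 2 s by the TSC, which matches the case described in the RFC commit message: the jiffies-based interval tracks the timer timeout while the TSC reflects the real elapsed time.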
>
> Thanks Xiongfeng for the info.
>
> Could you try the patch below? It is kind of an extension of
>
> b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC on qualified platorms")
>
> which I have tested in a limited way on some 4-socket Haswell and
> Cascade Lake AP x86 servers.
>
>
> Thanks,
> Feng
> ---
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index cafacb2e58cc..b4ea79cb1d1a 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1217,7 +1217,7 @@ static void __init check_system_tsc_reliable(void)
>  	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
>  	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
>  	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
> -	    nr_online_nodes <= 2)
> +	    nr_online_nodes <= 8)
>  		tsc_disable_clocksource_watchdog();
>  }
>
>
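To spell out what the one-line change does for this machine, the condition can be modelled in plain userspace C as below. This is a sketch only, not kernel code: struct tsc_platform and watchdog_can_be_disabled() are hypothetical stand-ins for the boot_cpu_has() checks and nr_online_nodes, and it assumes the server has all three TSC feature flags, which the lscpu output above does not show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU feature bits and topology the kernel checks. */
struct tsc_platform {
	bool constant_tsc;          /* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;           /* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;            /* models X86_FEATURE_TSC_ADJUST */
	unsigned int online_nodes;  /* models nr_online_nodes */
};

/*
 * Mirrors the shape of the condition in check_system_tsc_reliable():
 * all three features present and the node count within the limit.
 */
static bool watchdog_can_be_disabled(const struct tsc_platform *p,
				     unsigned int max_nodes)
{
	return p->constant_tsc && p->nonstop_tsc && p->tsc_adjust &&
	       p->online_nodes <= max_nodes;
}
```

For a 4-node system with all three flags set, the check fails with a limit of 2 and passes with a limit of 8, so with the patch the clocksource watchdog is no longer run against the TSC on such a machine.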
Hi Feng,
I tested this patch on the previously mentioned server and it fixes the issue.
Thanks,
Yu