[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8ffea578-5eb6-f479-7bd4-668df84d930f@oracle.com>
Date: Mon, 18 Feb 2019 11:48:37 +0800
From: Zhenzhong Duan <zhenzhong.duan@...cle.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Waiman Long <longman@...hat.com>,
Srinivas Eeda <srinivas.eeda@...cle.com>, kin.cho@...cle.com
Subject: Re: [PATCH] acpi_pm: Reduce PMTMR counter read contention
On 2019/2/11 5:08, Thomas Gleixner wrote:
> On Sat, 2 Feb 2019, Zhenzhong Duan wrote:
>> On 2019/1/31 22:26, Thomas Gleixner wrote:
>
>>>>> I'm not against the change per se, but I really want to understand
>>>>> why we need all the complexity for something which should never be
>>>>> used in a real world deployment.
>>>>>
>>>> Hmm, it's a strong word of "never be used". Customers may happen to
>>>> use nohpet(sanity test?) and report bug to us. Sometimes they does
>>>> report a bug that reproduce with their customed config. There may
>>>> also be BIOS setting HPET disabled.
>>>
>>> And because the customer MAY do completely nonsensical things (and there
>>> are a lot more than the HPET) the kernel has to handle all of them?
>>
>> Ok, then. I don't have more suggestion to convince you.
>
> You give up too fast :)
Ah, because I thought of a simple fix.
>
> The point is, that we really want proper justifications for changes like
> this. Some 'may, could and more handwaving' simply does not cut it.
>
> So if you can just describe a realistic scenario, which does not involve
> thoughtless flipping of BIOS options, then this becomes way more palatable.
I indeed don't see a realistic scenario in a product env needing to use
nohpet. My only justification is now that we have nohpet as kernel
parameter, we should fix the softlockup in large machines for enterprise
use.
>
>> I just think of a simple fix as below. I think it will work for both hpet
>> and pmtmr. We will test it when the env is available.
>
>> --- a/kernel/time/timekeeping.c
>> +++ b/kernel/time/timekeeping.c
>> @@ -1353,6 +1353,7 @@ static int change_clocksource(void *data)
>>
>> write_seqcount_end(&tk_core.seq);
>> raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
>> + tick_clock_notify();
>>
>> return 0;
>> }
>> @@ -1371,7 +1372,6 @@ int timekeeping_notify(struct clocksource *clock)
>> if (tk->tkr_mono.clock == clock)
>> return 0;
>> stop_machine(change_clocksource, clock, NULL);
>> - tick_clock_notify();
>> return tk->tkr_mono.clock == clock ? 0 : -1;
>> }
>
> This won't resolve the concurrency issues of HPET or PMTIMER in any
> way.
Just got chance to test and Kin confirmed it fix the softlockup of
PMTMR(with nohpet) and HPET(without nohpet, revert previous hpet commit)
at bootup stage.
My understandig is, at bootup stage tick device is firstly initialized
in periodic mode and then switch to one-shot mode. In periodic mode
clock event interrupt is triggered every 1ms(HZ=1000), contention in
HPET or PMTIMER exceeds 1ms and delayed the clock interrupt. Then CPUs
continue to process interrupt one by one without a break,
tick_clock_notify() have no chance to be called and we never switch to
one-shot mode.
In one-shot mode, the contention is still there but next event is always
set with a future value. We may missed some ticks, but the timer code is
smart enough to pick up those missed ticks.
By moving tick_clock_notify() in stop_machine, kernel changes to
one-shot mode early before the contention accumulate and lockup system.
> Instead it breaks the careful orchestrated mechanism of clocksource
> change.
Sorry, I didn't get a idea how it breaks, tick_clock_notify() is a
simple function setting bitmask in percpu variable. Could you explain a bit?
Thanks
Zhenzhong
Powered by blists - more mailing lists