[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b225470d-6c32-0697-6a39-6b20246d4f90@redhat.com>
Date: Mon, 3 Dec 2018 16:05:04 -0500
From: Waiman Long <longman@...hat.com>
To: Qian Cai <cai@....us>, mark.rutland@....com, marc.zyngier@....com
Cc: daniel.lezcano@...aro.org, tglx@...utronix.de,
peterz@...radead.org, mingo@...nel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] clocksource/arm_arch_timer: fix a lockdep warning
On 12/03/2018 03:31 PM, Qian Cai wrote:
> On Mon, 2018-12-03 at 15:07 -0500, Waiman Long wrote:
>> On 12/03/2018 02:33 PM, Qian Cai wrote:
>>> Booting this Huawei TaiShan 2280 arm64 server generated this lockdep
>>> warning.
>>>
>>> [ 0.000000] lockdep_assert_cpus_held+0x50/0x60
>>> [ 0.000000] static_key_enable_cpuslocked+0x30/0xe8
>>> [ 0.000000] arch_timer_check_ool_workaround+0x128/0x2d0
>>> [ 0.000000] arch_timer_acpi_init+0x274/0x6ac
>>> [ 0.000000] acpi_table_parse+0x1ac/0x218
>>> [ 0.000000] __acpi_probe_device_table+0x164/0x1ec
>>> [ 0.000000] timer_probe+0x1bc/0x254
>>> [ 0.000000] time_init+0x44/0x98
>>> [ 0.000000] start_kernel+0x4ec/0x7d4
>>>
>>> This is due to the commit cb538267ea1e ("jump_label/lockdep: Assert we hold
>>> the hotplug lock for _cpuslocked() operations"). Therefore, it will check
>>> if it is really in the CPU hotplug path or not, and work around this
>>> problem by using cpus_read_trylock(). The chance of not getting the read
>>> lock is very small. If that happens, it will report a lockdep warning at
>>> most.
>>>
>>> Signed-off-by: Qian Cai <cai@....us>
>>> ---
>>> drivers/clocksource/arm_arch_timer.c | 9 +++++++++
>>> 1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/clocksource/arm_arch_timer.c
>>> b/drivers/clocksource/arm_arch_timer.c
>>> index 9a7d4dc..5c9acbd 100644
>>> --- a/drivers/clocksource/arm_arch_timer.c
>>> +++ b/drivers/clocksource/arm_arch_timer.c
>>> @@ -497,11 +497,20 @@ void arch_timer_enable_workaround(const struct
>>> arch_timer_erratum_workaround *wa
>>> per_cpu(timer_unstable_counter_workaround, i) = wa;
>>> }
>>>
>>> +#ifdef CONFIG_HOTPLUG_CPU
>> If HOTPLUG_CPU isn't defined, all the cpus_lock() and related functions
>> are just no-op. You don't need to use conditional compilation directive
>> here.
> Make sense.
>
>>> + i = 0;
>>> +
>>> /*
>>> * Use the locked version, as we're called from the CPU
>>> * hotplug framework. Otherwise, we end-up in deadlock-land.
>>> */
>> I think the main problem is the above comment may not be true anymore or
>> is only occasionally true. We need to audit the code to find the root cause.
> This was a commit introduced in Aug. 2017, 450f9689f294
> (clocksource/arm_arch_timer: Use static_branch_enable_cpuslocked()) which
> basically drop the cpus_read_lock(). May I ask what changes made you think the
> above comment incorrect now?
>
If the above comment is true, you won't have the lockdep splat in the
first place. I think the most likely case is that the function can be
called from both a hotplug path and a non-hotplug path. So the comment
needs to be updated to reflect that.
>>> + i = cpus_read_trylock();
>>> static_branch_enable_cpuslocked(&arch_timer_read_ool_enabled);
>>> + if (i)
>>> + cpus_read_unlock();
>> This is not the right way of fixing the lockdep splash.
>>
> I should had said it is a workaround. I am all-ears for a proper way to fix
> this. When the above commit 450f9689f294 was merged, there was no cb538267ea1e
> so no lockdep warning.
As a workaround, you should better document that in the code. Thinking
about it some more, your workaround seems to be valid. Recursive
cpus_read_lock() is allowed. If it is called from a hotplug path,
cpus_write_lock() should have been taken and cpus_read_trylock() will
failed. Of course, we will have to assume that the current CPU is the
one that has taken the write lock in this case.
Cheers,
Longman
Powered by blists - more mailing lists