[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LRH.2.21.1710162214420.15706@linux.fjfi.cvut.cz>
Date: Mon, 16 Oct 2017 22:41:26 +0200 (CEST)
From: David Kozub <zub@...ux.fjfi.cvut.cz>
To: Thomas Gleixner <tglx@...utronix.de>
cc: Daniel Lezcano <daniel.lezcano@...aro.org>,
linux-kernel@...r.kernel.org
Subject: Re: PROBLEM: Kernel BUG in mfgpt_tick (cs5535-clockevt.c) on ALIX
2c3 - null call
On Mon, 16 Oct 2017, Thomas Gleixner wrote:
> On Thu, 12 Oct 2017, David Kozub wrote:
>> On Thu, 12 Oct 2017, Thomas Gleixner wrote:
>>> On Thu, 12 Oct 2017, Daniel Lezcano wrote:
>>> The real question is why
>>>
>>> /* Set the clock scale and enable the event mode for CMP2 */
>>> val = MFGPT_SCALE | (3 << 8);
>>>
>>> cs5535_mfgpt_write(cs5535_event_clock, MFGPT_REG_SETUP, val);
>>>
>>> is in the setup code at all.
>>>
>>> The obvious place for this is in mfgpt_set_periodic() which gets called
>>> when the clock event and the handler is set up in the core code. Up to that
>>> point the interrupt handler is protected against shared interrupts via the
>>> is_shutdown() check.
>>
>> It is, but only after clockevents_config_and_register. But mfgpt_tick is
>> called before that (just after setup_irq).
>
> Right, but the protection against the spurious interrupt there is not
> sufficient. See patch below.
>
> Thanks,
>
> tglx
>
> 8<-------------------
>
> --- a/drivers/clocksource/cs5535-clockevt.c
> +++ b/drivers/clocksource/cs5535-clockevt.c
> @@ -117,7 +117,8 @@ static irqreturn_t mfgpt_tick(int irq, void *dev_id)
> /* Turn off the clock (and clear the event) */
> disable_timer(cs5535_event_clock);
>
> - if (clockevent_state_shutdown(&cs5535_clockevent))
> + if (clockevent_state_detached(&cs5535_clockevent) ||
> + clockevent_state_shutdown(&cs5535_clockevent))
> return IRQ_HANDLED;
>
> /* Clear the counter */
Hi Thomas,
Thanks for the patch. I can confirm that this resolves the issue.
I still think the fact that the handler is triggered at all is a bit
strange but maybe ensuring the handler doesn't do anything bad is
sufficient to declare this issue fixed. After all I'm probbly the last
person in the known universe using this driver. :)
Would it not be nicer to also explicitly set cs5535_clockevent's
state_use_accessors to CLOCK_EVT_STATE_DETACHED (in the initialization of
cs5535_clockevent)? Maybe that field is not to be touched directly from
outside, on the other hand in the current code the check in the handler
relies on CLOCK_EVT_STATE_DETACHED being 0 (and so cs5535_clockevent's
state_use_accessors being CLOCK_EVT_STATE_DETACHED by default).
I was also wondering how is it ensured that the handler can never see an
inconsistent state of state_use_accessors but after going through the
code I think the key is clockevents_register_device's
raw_spin_lock_irqsave(&clockevents_lock, flags); (even though the set to
DETACHED happens outside, but we're already starting in DETACHED state). I
also wonder if this would still be safe if there were multiple CPUs - but
this is not happening on the ALIX as it has just one CPU.
Best regards,
David
Powered by blists - more mailing lists