Message-ID: <20140820120744.11b2ab1d@bbrezillon>
Date: Wed, 20 Aug 2014 12:07:44 +0200
From: Boris BREZILLON <boris.brezillon@...e-electrons.com>
To: Thierry Reding <thierry.reding@...il.com>
Cc: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@...osoft.com>,
Gaël PORTAY <gael.portay@...il.com>,
Arnd Bergmann <arnd@...db.de>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-pwm@...r.kernel.org, Nicolas FERRE <nicolas.ferre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>,
Alexandre Belloni <alexandre.belloni@...e-electrons.com>
Subject: Re: [PATCH 3/3] ARM: at91/tclib: mask interruptions at shutdown and
probe
On Wed, 20 Aug 2014 11:48:08 +0200
Thierry Reding <thierry.reding@...il.com> wrote:
> On Wed, Aug 20, 2014 at 11:06:25AM +0200, Boris BREZILLON wrote:
> > On Wed, 20 Aug 2014 10:28:20 +0200
> > Thierry Reding <thierry.reding@...il.com> wrote:
> >
> > > On Wed, Aug 20, 2014 at 10:14:22AM +0200, Boris BREZILLON wrote:
> > > > Hi Thierry,
> > > >
> > > > On Wed, 20 Aug 2014 09:31:13 +0200
> > > > Thierry Reding <thierry.reding@...il.com> wrote:
> > > >
> > > > > On Wed, Aug 20, 2014 at 01:01:30AM +0200, Boris BREZILLON wrote:
> > > > > > Hi Jean-Christophe,
> > > > > >
> > > > > > On Wed, 20 Aug 2014 06:11:17 +0800
> > > > > > Jean-Christophe PLAGNIOL-VILLARD <plagnioj@...osoft.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > > This is a bit weird, as the clock of the TC should be off and the irq freed,
> > > > > > > >
> > > > > > > > so this should never happen; we need to investigate further why it does
> > > > > >
> > > > > > I may have found the source of this bug.
> > > > > >
> > > > > > As Gael stated, when you're kexec-ing a new kernel your previous kernel
> > > > > > > could be using the tcb_clksrc driver (and especially the clkevent
> > > > > > > device). Thus the kernel might have planned a timer event and then been
> > > > > > > asked to shut down the machine (requested by the kexec code).
> > > > > > In this case the AIC interrupt connected to the TC Block is disabled
> > > > > > but not the interrupts within the TCB IP (IDR registers), possibly
> > > > > > leaving a pending interrupt before booting the new kernel.
> > > > > >
> > > > > > When the tcb_clksrc driver is loaded by the new kernel it enables the
> > > > > > interrupt line by calling setup_irq [1] while the clockevent device is
> > > > > > not registered yet [2]. Thus the event_handler is still NULL when the
> > > > > > AIC line connected to the TCB is unmasked. Remember that an interrupt
> > > > > > is still pending on this HW block, which will lead to an immediate call
> > > > > > > to the ch2_irq handler, which tries to call the event_handler, which in
> > > > > > > turn is NULL because clkevent device registration has not taken place
> > > > > > > at this moment => Kernel panic.
> > > > > > > OTOH, we can't register the clkevent device before the irq handler is
> > > > > > > set up, because we should be ready to handle clkevent requests at the
> > > > > > > time clockevents_config_and_register is called.
> > > > > >
> > > > > > > This leaves two solutions:
> > > > > > 1) disable the TCB irqs (using TCB IDR registers) before calling
> > > > > > setup_irq in the tcb_clksrc driver
> > > > > > 2) disable the TCB irqs at the tclib level (as proposed by Gael)
> > > > > >
> > > > > > I prefer solution #2 because it fixes the bug for all TCB users (not
> > > > > > just the tcb_clksrc driver).
> > > > >
> > > > > Wouldn't a more proper fix be to only enable the IRQ (setup_irq()) once
> > > > > everything has properly been set up? That's certainly how all other
> > > > > drivers are doing this. Generally I think it's best to assume that an
> > > > > interrupt can fire at any point after it's been enabled, so everything
> > > > > should be set up prior to enabling it.
> > > >
> > > > Sure. And, AFAIK, another common practice is to disable all interrupts
> > > > and acknowledge all pending interrupts before registering a new irq
> > > > handler to avoid inheriting peripheral dirty state from previous usage
> > > > (either the bootloader, or the previous kernel when using kexec).
> > >
> > > Discarding all pending interrupts may not always be what we want. And
> > > masking interrupts prior to registering the handler isn't always going
> > > to work either (shared interrupts), so device drivers should always set
> > > things up in the correct order.
> > >
> >
> > I meant disabling/acknowledging interrupts within the HW block not
> > the interrupt line connected to the interrupt controller (which indeed
> > can be shared among several peripherals).
> > The TCB IP provides SR (Status Register) to acknowledge interrupts at
> > the TCB level and IER/IDR/IMR (Interrupt Enable/Disable/Mask
> > Register) to manipulate TCB interrupts.
>
> But when you share interrupts, an incoming interrupt will cause all
> handlers to be called, so you still need to set it up properly.
Right, I forgot about that one (even if we could mask the status
register with the interrupt mask register to avoid calling the
event_handler when the interrupt is not enabled).
Anyway I agree with you on this point: everything should be ready when
calling request_irq.
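
To make the two points above concrete, here is a rough sketch in plain
C that models a single TC channel in memory instead of touching real
MMIO. It is only an illustration under assumptions, not the actual
tcb_clksrc or tclib code: the struct layout, the tc_* helper names, the
count_event callback and the TC_CPCS bit position are all hypothetical
stand-ins. It shows (a) quiescing the block (disable every source, then
read SR to drop stale status) before a handler is installed, and (b) a
handler that masks SR with the enabled-sources mask and tolerates a
NULL event_handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* In-memory model of one TC channel (hypothetical, not the real layout). */
struct tc_chan {
	uint32_t sr;	/* latched status bits; real HW clears them on read */
	uint32_t imr;	/* sources currently enabled (set by IER, cleared by IDR) */
};

#define TC_CPCS	(1u << 4)	/* assumed bit for "RC compare" status */

typedef void (*event_handler_t)(void *data);

struct tc_clkevt {
	struct tc_chan *regs;
	event_handler_t event_handler;	/* NULL until the device is registered */
	void *data;
};

/* Model a write to IER: enable the given interrupt sources. */
static void tc_write_ier(struct tc_chan *c, uint32_t bits)
{
	c->imr |= bits;
}

/* Model a write to IDR: disable the given interrupt sources. */
static void tc_write_idr(struct tc_chan *c, uint32_t bits)
{
	c->imr &= ~bits;
}

/* Model a read of SR: return the latched status and clear it. */
static uint32_t tc_read_sr(struct tc_chan *c)
{
	uint32_t status = c->sr;

	c->sr = 0;
	return status;
}

/* Drop state inherited from a bootloader or a kexec'ed kernel. */
static void tc_quiesce(struct tc_chan *c)
{
	assert(c != NULL);
	tc_write_idr(c, 0xffffffffu);	/* disable every source in the block */
	(void)tc_read_sr(c);		/* acknowledge anything still latched */
}

/* Example callback used as the event_handler: counts delivered events. */
static void count_event(void *data)
{
	++*(int *)data;
}

/*
 * Handler sketch: only status bits that are also enabled count, and a
 * missing event_handler is treated as "not ours" instead of being called.
 * Returns 1 when an event was delivered, 0 otherwise.
 */
static int ch2_irq(struct tc_clkevt *dev)
{
	uint32_t pending = tc_read_sr(dev->regs) & dev->regs->imr;

	if (!(pending & TC_CPCS) || !dev->event_handler)
		return 0;

	dev->event_handler(dev->data);
	return 1;
}
```

With this model, calling tc_quiesce() on a channel that still has
TC_CPCS latched and enabled leaves nothing for ch2_irq() to deliver,
and once event_handler is set and the source re-enabled through
tc_write_ier(), the next compare event reaches the callback. None of
this replaces setting things up in the right order; it only narrows
the window in which stale hardware state can bite.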
--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com