Message-ID: <alpine.DEB.2.11.1507031041570.3916@nanos>
Date: Fri, 3 Jul 2015 11:23:12 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Geert Uytterhoeven <geert@...ux-m68k.org>
cc: Simon Horman <horms@...ge.net.au>,
Kevin Hilman <khilman@...nel.org>,
Tyler Baker <tyler.baker@...aro.org>,
Borislav Petkov <bp@...en8.de>,
Geert Uytterhoeven <geert+renesas@...der.be>,
Magnus Damm <magnus.damm@...il.com>,
Linux-sh list <linux-sh@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Possible regression due to "tick: broadcast: Prevent livelock
from event handler"
On Fri, 3 Jul 2015, Geert Uytterhoeven wrote:
> Hi Simon,
>
> On Fri, Jul 3, 2015 at 4:40 AM, Simon Horman <horms@...ge.net.au> wrote:
> > I have observed what appears to be a regression while testing next-20150702
> > which seems to be caused by 2951d5c031a3 ("tick: broadcast: Prevent
> > livelock from event handler").
> >
> > The problem manifests on the emev2/kzm9d board as per the boot log below.
> >
> > The problem manifests when booting using the shmobile_defconfig,
> > which uses multiplatform and enables all devices using DT.
> >
> > The problem does not appear to always manifest but anecdotally it
> > seems to manifest more often of late (yes, I know that is vague).
>
> > hctosys: unable to open rtc device (rtc0)
> >
> > The boot hangs here.
> > The next line should be:
> >
> > smsc911x 20000000.ethernet eth0: SMSC911x/921x identified at 0xc8880000, IRQ: 33
>
> As you can reproduce it, can you please try enabling lockdep debugging?

Just looking at the em_sti driver. It calls clk_prepare/unprepare from
interrupt disabled regions ...

But I don't think that's the problem at hand. The above commit moves
the call to the event handler on the local cpu out of the broadcast
lock region to prevent a livelock. The only real change is the
timing.

Before:

	bc_handler()
	   lock(bc_lock);
	   call_local_handler();
	   send_ipis();
	   reprogram_bc_device();
	   unlock(bc_lock);

After:

	bc_handler()
	   lock(bc_lock);
	   send_ipis();
	   reprogram_bc_device();
	   unlock(bc_lock);
	   call_local_handler();

As this runs in hard interrupt context with interrupts disabled, I
really cannot figure out how that makes a difference.
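
To make the ordering change concrete, here is a minimal userspace sketch of the two sequences above. It is not the kernel code: struct trace, record(), lock_bc(), unlock_bc() and ran_locked() are hypothetical helpers made up for illustration, the lock is a plain flag, and each step only logs its name. It merely shows that both variants run the same three steps and differ only in where call_local_handler sits relative to bc_lock.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_STEPS 8

/* Hypothetical trace: records each step and whether bc_lock was held. */
struct trace {
	const char *steps[MAX_STEPS];
	int locked_at[MAX_STEPS];	/* 1 if the toy bc_lock was held */
	int n;
	int bc_locked;			/* toy stand-in for bc_lock */
};

static void record(struct trace *t, const char *name)
{
	assert(t->n < MAX_STEPS);
	t->steps[t->n] = name;
	t->locked_at[t->n] = t->bc_locked;
	t->n++;
}

static void lock_bc(struct trace *t)   { t->bc_locked = 1; }
static void unlock_bc(struct trace *t) { t->bc_locked = 0; }

/* Ordering before the commit: local handler runs under the lock. */
void bc_handler_before(struct trace *t)
{
	lock_bc(t);
	record(t, "call_local_handler");
	record(t, "send_ipis");
	record(t, "reprogram_bc_device");
	unlock_bc(t);
}

/* Ordering after the commit: local handler runs after the unlock. */
void bc_handler_after(struct trace *t)
{
	lock_bc(t);
	record(t, "send_ipis");
	record(t, "reprogram_bc_device");
	unlock_bc(t);
	record(t, "call_local_handler");
}

/* Returns 1 if the step ran with the lock held, 0 if not, -1 if absent. */
int ran_locked(const struct trace *t, const char *name)
{
	for (int i = 0; i < t->n; i++)
		if (strcmp(t->steps[i], name) == 0)
			return t->locked_at[i];
	return -1;
}

/* Prints one line per recorded step, in execution order. */
void print_trace(const char *label, const struct trace *t)
{
	printf("%s:\n", label);
	for (int i = 0; i < t->n; i++)
		printf("  %-20s %s\n", t->steps[i],
		       t->locked_at[i] ? "(bc_lock held)" : "(bc_lock dropped)");
}
```

Calling bc_handler_before() and bc_handler_after() on two zero-initialized traces and passing each to print_trace() lists the same three steps in both cases, with only the position and lock state of call_local_handler differing.
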
Can you add some debugging to figure out whether the broadcast timer
interrupt still fires?
Thanks,
tglx