Message-ID: <1421878320.4903.17.camel@stgolabs.net>
Date: Wed, 21 Jan 2015 14:12:00 -0800
From: Davidlohr Bueso <dave@...olabs.net>
To: Bruno Prémont <bonbons@...ux-vserver.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
tglx@...utronix.de, ilya.dryomov@...tank.com,
umgwanakikbuti@...il.com, oleg@...hat.com
Subject: Re: Linux 3.19-rc5

On Wed, 2015-01-21 at 22:37 +0100, Bruno Prémont wrote:
> On Wed, 21 January 2015 Bruno Prémont wrote:
> > On Tue, 20 January 2015 Linus Torvalds wrote:
> > > On Tue, Jan 20, 2015 at 6:02 AM, Bruno Prémont wrote:
> > > >
> > > > No idea yet which rc is the offender (nor exact patch), but on my not
> > > > so recent UP laptop with a pccard slot I have 2 pccardd kernel threads
> > > > converting my laptop into a heater.
> > > >
> > > > lspci for affected nodes:
> > > > 02:06.0 CardBus bridge [0607]: O2 Micro, Inc. OZ711EC1 SmartCardBus Controller [1217:7113] (rev 20)
> > > > 02:06.1 CardBus bridge [0607]: O2 Micro, Inc. OZ711EC1 SmartCardBus Controller [1217:7113] (rev 20)
> > > >
> > > > Very basics I have, before I attempt any bisection:
> > >
> > > Hmm. I'm not seeing anything recent changing anything in this area, so
> > > I suspect that unless somebody else steps up and says "Ahh, that
> > > sounds like xyz", your bisection is the best option.
>
> Bisecting to the end did point me at (the warning traces produced in great
> quantities might not be the very same issue as the abusive CPU usage, but
> certainly look very related):
> [CCing people on CC for the patch]
>
> commit 8eb23b9f35aae413140d3fda766a98092c21e9b0
> Author: Peter Zijlstra <peterz@...radead.org>
> Date: Wed Sep 24 10:18:55 2014 +0200
>
> sched: Debug nested sleeps
>
> Validate we call might_sleep() with TASK_RUNNING, which catches places
> where we nest blocking primitives, eg. mutex usage in a wait loop.
>
> Since all blocking is arranged through task_struct::state, nesting
> this will cause the inner primitive to set TASK_RUNNING and the outer
> will thus not block.
>
> Another observed problem is calling a blocking function from
> schedule()->sched_submit_work()->blk_schedule_flush_plug() which will
> then destroy the task state for the actual __schedule() call that
> comes after it.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Cc: tglx@...utronix.de
> Cc: ilya.dryomov@...tank.com
> Cc: umgwanakikbuti@...il.com
> Cc: oleg@...hat.com
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Link: http://lkml.kernel.org/r/20140924082242.591637616@infradead.org
> Signed-off-by: Ingo Molnar <mingo@...nel.org>
>
> Which does produce the following trace (hand-copied most important parts of it):
> Warning: CPU 0 PID: 68 at kernel/sched/core.c:7311 __might_sleep+0x143/0x170
> do not call blocking ops when !TASK_RUNNING; state=1 set at [<c1436390>] pccardd+0xa0/0x3e0
> ...
> Call trace:
> ...
> __might_sleep+0x143/0x170
> ? pccardd+0xa0/0x3e0
> ? pccardd+0xa0/0x3e0
> mutex_lock+0x17/0x2a
> pccardd+0xe9/0x3e0
> ? pcmcia_socket_uevent+0x30/0x30
>
> pccardd() is located in drivers/pcmcia/cs.c and seems to be of the structure
> Peter's patch wants to warn about.
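
Yeah, setting current to TASK_INTERRUPTIBLE so early in the game is bogus:
pccardd() then calls mutex_lock() while !TASK_RUNNING, which is exactly
what the new check warns about. The state should be set after unlocking
the skt_mutex.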
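
To make the ordering problem concrete, here is a rough userspace toy model.
It is not kernel code and not the actual pccardd() loop; every toy_* name is
made up for illustration. It assumes the worst case described in the commit
message, where the inner mutex_lock() has to block and therefore leaves the
task in TASK_RUNNING, and it ignores wakeup races entirely.

```c
#include <assert.h>

/* Toy userspace model only; all toy_* names are hypothetical. */
enum toy_state { TOY_TASK_RUNNING = 0, TOY_TASK_INTERRUPTIBLE = 1 };

struct toy_task {
	enum toy_state state;
	int nested_sleep_warnings;	/* what __might_sleep() would flag */
	int times_slept;		/* how often toy_schedule() blocked */
};

/* Models a contended blocking primitive: it warns when entered with
 * state != RUNNING and always returns with state == RUNNING. */
static void toy_mutex_lock(struct toy_task *t)
{
	if (t->state != TOY_TASK_RUNNING)
		t->nested_sleep_warnings++;
	t->state = TOY_TASK_RUNNING;
}

static void toy_mutex_unlock(struct toy_task *t)
{
	(void)t;	/* unlocking does not touch the task state */
}

/* Models schedule(): the task only sleeps if its state is not RUNNING. */
static void toy_schedule(struct toy_task *t)
{
	if (t->state != TOY_TASK_RUNNING) {
		t->times_slept++;
		t->state = TOY_TASK_RUNNING;	/* pretend we were woken */
	}
}

/* State set before taking the mutex: the ordering the warning points at. */
static void toy_loop_state_set_early(struct toy_task *t, int iterations)
{
	assert(iterations >= 0);
	for (int i = 0; i < iterations; i++) {
		t->state = TOY_TASK_INTERRUPTIBLE;
		toy_mutex_lock(t);	/* warns, resets state to RUNNING */
		toy_mutex_unlock(t);
		toy_schedule(t);	/* never sleeps: the loop spins */
	}
}

/* State set after unlocking, as suggested above. */
static void toy_loop_state_set_late(struct toy_task *t, int iterations)
{
	assert(iterations >= 0);
	for (int i = 0; i < iterations; i++) {
		toy_mutex_lock(t);
		toy_mutex_unlock(t);
		t->state = TOY_TASK_INTERRUPTIBLE;
		toy_schedule(t);	/* sleeps as intended */
	}
}
```

In this model the early variant produces one warning per iteration and never
sleeps, which would be consistent with both the traces and the CPU burn
reported above. The late variant sleeps on every pass with no warnings.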