Message-ID: <20131112143147.GB6049@dhcp22.suse.cz>
Date: Tue, 12 Nov 2013 15:31:47 +0100
From: Michal Hocko <mhocko@...e.cz>
To: Shawn Bohrer <shawn.bohrer@...il.com>
Cc: Li Zefan <lizefan@...wei.com>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, tj@...nel.org,
Hugh Dickins <hughd@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Markus Blank-Burian <burian@...nster.de>
Subject: Re: 3.10.16 cgroup_mutex deadlock

On Tue 12-11-13 18:17:20, Li Zefan wrote:
> Cc more people
>
> On 2013/11/12 6:06, Shawn Bohrer wrote:
> > Hello,
> >
> > This morning I had a machine running 3.10.16 go unresponsive but
> > before we killed it we were able to get the information below. I'm
> > not an expert here but it looks like most of the tasks below are
> > blocking waiting on the cgroup_mutex. You can see that the
> > resource_alloca:16502 task is holding the cgroup_mutex and that task
> > appears to be waiting on a lru_add_drain_all() to complete.

Do you have sysrq+l output as well, by any chance? That would tell
us what the CPUs are currently doing. Dumping all kworker stacks
might be helpful as well. We know that lru_add_drain_all waits for
schedule_on_each_cpu to return, so it is waiting for workers to finish.
I would be really curious why some of the lru_add_drain_cpu calls cannot
finish properly. The only reason would be that some work item(s) do not
get the CPU or that somebody is holding lru_lock.
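
For anyone who wants to collect the data requested above, here is a
minimal sketch of one way to do it from userspace. It assumes a kernel
that exposes /proc/<pid>/stack (CONFIG_STACKTRACE) and has sysrq
enabled, and it needs root. The function names (kworker_pids,
read_stack, trigger_sysrq, dump_kworker_stacks) are made up for this
illustration and are not part of any existing tool.

```python
import os
import sys


def kworker_pids(proc_root="/proc"):
    """Return the sorted PIDs of tasks whose comm starts with 'kworker'."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-task entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm.startswith("kworker"):
            pids.append(int(entry))
    return sorted(pids)


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack of a task, or a note if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError as err:
        return "<unreadable: %s>\n" % err.strerror


def trigger_sysrq(key, path="/proc/sysrq-trigger"):
    """Write one sysrq key; 'l' requests backtraces of all active CPUs."""
    with open(path, "w") as f:
        f.write(key)


def dump_kworker_stacks(proc_root="/proc", out=sys.stdout):
    """Print the kernel stack of every kworker found under proc_root."""
    for pid in kworker_pids(proc_root):
        out.write("== kworker pid %d ==\n" % pid)
        out.write(read_stack(pid, proc_root))
```

Calling trigger_sysrq("l") followed by dump_kworker_stacks() as root
should put the per-CPU backtraces into the kernel log (dmesg) and print
the kworker stacks to stdout. On a machine that is already mostly
unresponsive this may not be able to run at all, in which case the
sysrq key combination on the console is the fallback.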
--
Michal Hocko
SUSE Labs