Message-Id: <20110610113311.409bb423.kamezawa.hiroyu@jp.fujitsu.com>
Date: Fri, 10 Jun 2011 11:33:11 +0900
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: Ying Han <yinghan@...gle.com>, Dave Jones <davej@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: 3.0rc2 oops in mem_cgroup_from_task
On Thu, 9 Jun 2011 18:30:49 -0700 (PDT)
Hugh Dickins <hughd@...gle.com> wrote:
> On Fri, 10 Jun 2011, KAMEZAWA Hiroyuki wrote:
> > On Thu, 9 Jun 2011 16:42:09 -0700
> > Ying Han <yinghan@...gle.com> wrote:
> >
> > > ++cc Hugh who might have seen similar crashes on his machine.
>
> Yes, I was testing my tmpfs changes, and saw it on i386 yesterday
> morning. Same trace as Dave's (including khugepaged, which may or
> may not be relevant), aside from the i386/x86_64 differences.
>
> BUG: unable to handle kernel paging request at 6b6b6b87
>
> I needed to move forward with other work on that laptop, so just
> jotted down the details to come back to later. It came after one
> hour of building swapping load in memcg, I've not tried again since.
>
> >
> > Thank you for forwarding. Hmm. It seems the panic happens at khugepaged's
> > page collapse_huge_page().
>
> Yes, the inlining in my kernel was different,
> so collapse_huge_page() showed up in my backtrace.
>
> >
> > ==
> > count_vm_event(THP_COLLAPSE_ALLOC);
> > if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
> > ==
> > It passes target mm to memcg and memcg gets a cgroup by
> > ==
> > mem = mem_cgroup_from_task(rcu_dereference(mm->owner));
> > ==
> > Panic here means....mm->owner's task_subsys_state contains bad pointer ?
>
> 781cc621 <mem_cgroup_from_task>:
> 781cc621: 55 push %ebp
> 781cc622: 31 c0 xor %eax,%eax
> 781cc624: 89 e5 mov %esp,%ebp
> 781cc626: 8b 55 08 mov 0x8(%ebp),%edx
> 781cc629: 85 d2 test %edx,%edx
> 781cc62b: 74 09 je 781cc636 <mem_cgroup_from_task+0x15>
> 781cc62d: 8b 82 fc 08 00 00 mov 0x8fc(%edx),%eax
> 781cc633: 8b 40 1c mov 0x1c(%eax),%eax <==========
> 781cc636: c9 leave
> 781cc637: c3 ret
>
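So the access to task->cgroups->subsys[?] faults at address 6b6b6b87.
That means task->cgroups or task->cgroups->subsys contains a bad pointer.
Since 6b6b6b87 = 6b6b6b6b + 0x1c, the value loaded from task->cgroups looks
like slab poison (0x6b), so the task_struct itself may already have been freed.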
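The arithmetic behind that faulting address can be checked with a small
user-space sketch. This is only an illustration, not kernel code: the helper
names are hypothetical, the offsets 0x8fc and 0x1c are copied from the i386
disassembly quoted above (they depend on that particular build and config),
and 0x6b is the slab free-poison byte (POISON_FREE in include/linux/poison.h).

```c
#include <assert.h>
#include <stdint.h>

/* Byte the slab allocator writes over freed objects when poisoning is on. */
#define POISON_FREE_BYTE 0x6bu

/* Offsets read off the quoted disassembly; specific to that kernel build. */
#define TASK_CGROUPS_OFFSET 0x8fcu /* mov 0x8fc(%edx),%eax : task->cgroups */
#define CSS_SLOT_OFFSET     0x1cu  /* mov 0x1c(%eax),%eax  : ->subsys[?]   */

/* Hypothetical helper: a 32-bit word whose four bytes are all `byte`. */
static inline uint32_t poison_word(uint8_t byte)
{
    return (uint32_t)byte * 0x01010101u;
}

/* Hypothetical helper: address the second load touches, given the
 * pointer value that the first load (task->cgroups) returned. */
static inline uint32_t second_load_address(uint32_t cgroups_ptr)
{
    return cgroups_ptr + CSS_SLOT_OFFSET;
}

/* Hypothetical helper: nonzero if fault_addr is exactly
 * "all-poison pointer + CSS_SLOT_OFFSET". */
static inline int fault_matches_poisoned_cgroups(uint32_t fault_addr)
{
    return fault_addr == second_load_address(poison_word(POISON_FREE_BYTE));
}
```

Calling fault_matches_poisoned_cgroups(0x6b6b6b87u) returns nonzero, which is
consistent with the first load having read poison out of a freed task_struct.
It does not prove that by itself; it only shows the numbers line up.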
Consider khugepaged: it grabs an mm_struct, and memcg then accesses
(mm->owner)->cgroups->subsys.
So, from memcg's point of view, we need to question whether mm->owner is
still valid for this kind of task.
Thank you for the input.
-Kame