Message-ID: <507FD8AA.50500@canonical.com>
Date: Thu, 18 Oct 2012 12:23:38 +0200
From: Stefan Bader <stefan.bader@...onical.com>
To: cwillu <cwillu@...llu.com>
CC: mingo@...nel.org, hpa@...or.com, linux-kernel@...r.kernel.org,
a.p.zijlstra@...llo.nl, peterz@...radead.org, tglx@...utronix.de
Subject: Re: [tip:sched/core] sched: Fix race in task_group()
On 18.10.2012 10:27, cwillu wrote:
> On Tue, Jul 24, 2012 at 8:21 AM, tip-bot for Peter Zijlstra
> <peterz@...radead.org> wrote:
>> Commit-ID: 8323f26ce3425460769605a6aece7a174edaa7d1
>> Gitweb: http://git.kernel.org/tip/8323f26ce3425460769605a6aece7a174edaa7d1
>> Author: Peter Zijlstra <peterz@...radead.org>
>> AuthorDate: Fri, 22 Jun 2012 13:36:05 +0200
>> Committer: Ingo Molnar <mingo@...nel.org>
>> CommitDate: Tue, 24 Jul 2012 13:58:20 +0200
>>
>> sched: Fix race in task_group()
>>
>> Stefan reported a crash on a kernel before a3e5d1091c1 ("sched:
>> Don't call task_group() too many times in set_task_rq()"), he
>> found the reason to be that the multiple task_group()
>> invocations in set_task_rq() returned different values.
>>
>> Looking at all that I found a lack of serialization and plain
>> wrong comments.
>>
>> The below tries to fix it using an extra pointer which is
>> updated under the appropriate scheduler locks. It's not pretty,
>> but I can't really see another way given how all the cgroup
>> stuff works.
>>
>> Reported-and-tested-by: Stefan Bader <stefan.bader@...onical.com>
>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
>> Link: http://lkml.kernel.org/r/1340364965.18025.71.camel@twins
>> Signed-off-by: Ingo Molnar <mingo@...nel.org>
>
> I just finished bisecting a crash on boot to this commit; booting with
> "noautogroup" brings it back.
>
> 3.5.4 is the latest -stable that still boots, and none of the 3.6 rc's
> boot at all.
>
> Photo of the bug (3.6.0next is 3.6 + btrfs's for-linus):
> https://lh5.googleusercontent.com/-0DY-YYhgvzs/UHdB-BQdzMI/AAAAAAAAAEg/QhY9rgxnv98/s811/2012-10-11
>
On a very quick glance, I wonder whether there might be a case where sched_fork()
calls set_task_cpu() with a CPU other than the current one while the task's
sched_task_group has not yet been set to something valid...
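
To make the suspected ordering concrete, here is a small user-space C model
of it. This is not kernel code: the struct layouts, the two-CPU setup, the
model_* function names and the inherit_group flag are all hypothetical, and
the -1 return merely stands in for what would be a bad pointer dereference
in the kernel. It only illustrates the guess above and does not confirm that
this is the actual cause of the crash.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2

/* Hypothetical, heavily simplified stand-in for a task group:
 * one fake per-CPU runqueue id per CPU. */
struct task_group {
    int cfs_rq[MODEL_NR_CPUS];
};

/* Hypothetical stand-in for a task. */
struct task {
    int cpu;                              /* CPU the task is assigned to */
    struct task_group *sched_task_group;  /* cached group pointer */
    int cfs_rq;                           /* runqueue id the task is attached to */
};

static struct task_group root_group = { { 100, 101 } };

/* Model of task_group(): simply hand back the cached pointer. */
static struct task_group *model_task_group(const struct task *p)
{
    return p->sched_task_group;
}

/* Model of set_task_rq(): attach the task to the group's runqueue for
 * the given CPU. Returns 0 on success, or -1 where the real code would
 * dereference an unset pointer. */
static int model_set_task_rq(struct task *p, int cpu)
{
    struct task_group *tg;

    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    tg = model_task_group(p);
    if (tg == NULL)
        return -1;
    p->cfs_rq = tg->cfs_rq[cpu];
    p->cpu = cpu;
    return 0;
}

/* Model of the fork path. inherit_group == 0 models the suspected case
 * in which the child's cached pointer has not been set up yet. */
static int model_fork(struct task *child, const struct task *parent,
                      int target_cpu, int inherit_group)
{
    child->cpu = parent->cpu;
    child->cfs_rq = -1;
    child->sched_task_group = inherit_group ? parent->sched_task_group : NULL;

    /* Only a CPU change goes through the set_task_rq() step. */
    if (target_cpu != child->cpu)
        return model_set_task_rq(child, target_cpu);
    return 0;
}
```

In this model the problem only shows up when both conditions hold: the child
is placed on a different CPU, and its cached group pointer is still unset.
Forking onto the same CPU never reaches the dereference.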