linux-kernel - Re: [PATCH] sched: fix another race when reading /proc/sched

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20081216122350.GB25019@elte.hu>
Date:	Tue, 16 Dec 2008 13:23:51 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Li Zefan <lizf@...fujitsu.com>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Paul Menage <menage@...gle.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: fix another race when reading /proc/sched_debug


* Li Zefan <lizf@...fujitsu.com> wrote:

> Li Zefan wrote:
> >> i merged it up in tip/master, could you please check whether it's ok?
> >>
> > 
> > Sorry, though this patch avoids accessing a half-created cgroup, but I found
> > current code may access a cgroup which has been destroyed.
> > 
> > The simplest fix is to take cgroup_lock() before for_each_leaf_cfs_rq.
> > 
> > Could you revert this patch and apply the following new one? My box has
> > survived for 16 hours with it applied.
> > 
> 
> Hi, Ingo
> 
> Can we have this bug fixed for 2.6.28 using this patch ? This patch is 
> the simplest fix and has been fully tested.

the mutex used by cgroup_lock() is pretty crappy to nest inside the 
runqueue lock:

 BUG: sleeping function called from invalid context at kernel/mutex7
 in_atomic(): 0, irqs_disabled(): 1, pid: 1790, name: cat
 2 locks held by cat/1790:
  #0:  (&p->lock){--..}, at: [<c02a6328>] seq_read+0x25/0x2b8
  #1:  (tasklist_lock){..--}, at: [<c021f5bc>] 
sched_debug_show+0xaa7/0xe90
 Call Trace:
  [<c0222612>] __might_sleep+0xd6/0xdd
  [<c05ff1eb>] mutex_lock_nested+0x1d/0x245
  [<c021f88b>] ? sched_debug_show+0xd76/0xe90
  [<c0256223>] cgroup_lock+0xf/0x11
  [<c021f893>] sched_debug_show+0xd7e/0xe90
  [<c0248696>] ? __lock_acquire+0x637/0x69d
  [<c028e7cd>] ? check_object+0x111/0x18c
  [<c028f435>] ? kmem_cache_alloc+0x70/0xa5
  [<c02a6355>] ? seq_read+0x52/0x2b8
  [<c0246ee2>] ? trace_hardirqs_on_caller+0x105/0x13d
  [<c0246f25>] ? trace_hardirqs_on+0xb/0xd
  [<c02a6355>] ? seq_read+0x52/0x2b8
  [<c02a63f7>] seq_read+0xf4/0x2b8
  [<c02a6303>] ? seq_read+0x0/0x2b8
  [<c02c411a>] proc_reg_read+0x60/0x74

so i had to remove your patch.

also, while looking at what cgroup_lock does - it's a trivial wrapper:

 void cgroup_lock(void)
 {
         mutex_lock(&cgroup_mutex);
 }

why isnt that done explicitly in all usage sites? If cgroup-unaware code 
ever has to take the cgroup lock outside of CONFIG_CGROUPS, that's a code 
structure problem.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/