linux-kernel - Re: 2.6.30-rc1 regression? -- epoll: BUG: sleeping function called from invalid context


Message-ID: <alpine.DEB.1.10.0906151621480.21001@makko.or.mcafeemobile.com>
Date:	Mon, 15 Jun 2009 16:32:20 -0700 (PDT)
From:	Davide Libenzi <davidel@...ilserver.org>
To:	Stefan Richter <stefanr@...6.in-berlin.de>
cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: 2.6.30-rc1 regression? -- epoll: BUG: sleeping function called
 from invalid context

On Tue, 16 Jun 2009, Stefan Richter wrote:

> Looks like a regression after 2.6.29, before 2.6.30-rc1, caused by
> commit 5071f97ec6d74f006072de0ce89b67c8792fe5a1, "epoll: fix epoll's own
> poll" (since this introduced ep_scan_ready_list), but I haven't fully
> investigated yet whether this is really the cause.
> 
> Test case:  Run any libraw1394 or libdc1394 based program on
> firewire-core on a kernel with the usual selection of debugging options
> configured in.  I didn't have these options enabled for a while, hence
> noticed only now.
> 
> BUG: sleeping function called from invalid context at kernel/mutex.c:278
> in_atomic(): 1, irqs_disabled(): 0, pid: 8301, name: dvgrab
> no locks held by dvgrab/8301.
> Pid: 8301, comm: dvgrab Tainted: G	  W  2.6.30 #2
> Call Trace:
> [<ffffffff80250bb2>] ? __debug_show_held_locks+0x22/0x24
> [<ffffffff8022a91f>] __might_sleep+0x120/0x122
> [<ffffffff8045192b>] mutex_lock_nested+0x25/0x2eb
> [<ffffffff80253c25>] ? __lock_acquire+0x705/0x793
> [<ffffffff802b8f5a>] ep_scan_ready_list+0x3c/0x185
> [<ffffffff802b997f>] ? ep_read_events_proc+0x0/0x6c
> [<ffffffff802b90b5>] ep_poll_readyevents_proc+0x12/0x14
> [<ffffffff802b8b4f>] ep_call_nested+0x9f/0xfa
> [<ffffffff802b90a3>] ? ep_poll_readyevents_proc+0x0/0x14
> [<ffffffff802b8bf7>] ep_eventpoll_poll+0x4d/0x5b
> [<ffffffff8029d525>] do_sys_poll+0x1b4/0x3b5
> [<ffffffff8029e17a>] ? __pollwait+0x0/0xce
> [<ffffffff8029e248>] ? pollwake+0x0/0x52
> [<ffffffff8025162b>] ? mark_held_locks+0x4d/0x6a
> [<ffffffff8020b900>] ? restore_args+0x0/0x30
> [<ffffffff80251753>] ? trace_hardirqs_on_caller+0x10b/0x12f
> [<ffffffff8025162b>] ? mark_held_locks+0x4d/0x6a
> [<ffffffff8020b900>] ? restore_args+0x0/0x30
> [<ffffffff80253c25>] ? __lock_acquire+0x705/0x793
> [<ffffffff80251753>] ? trace_hardirqs_on_caller+0x10b/0x12f
> [<ffffffff80251784>] ? trace_hardirqs_on+0xd/0xf
> [<ffffffff80235a69>] ? timespec_add_safe+0x34/0x61
> [<ffffffff802b9070>] ? ep_scan_ready_list+0x152/0x185
> [<ffffffff802483f9>] ? ktime_get_ts+0x49/0x4e
> [<ffffffff8029d26b>] ? poll_select_set_timeout+0x5c/0x7f
> [<ffffffff8029d778>] sys_poll+0x52/0xb2
> [<ffffffff8020aeab>] system_call_fastpath+0x16/0x1b
> 
> Any idea how to approach this?

That commit is not the problem. The problem is the patch that changed the
cookie from "current" to the current CPU, which bumps the preempt count
through get_cpu(), so the mutex is then taken in atomic context.
This needs a fix. Working on it ...
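
To illustrate the mechanism described above, here is a minimal userspace
sketch. It is not the kernel code: every model_* name is hypothetical, and
the functions only imitate the relevant behaviour, namely that get_cpu()
raises the preempt count until put_cpu(), and that taking a mutex runs a
might_sleep() style check which complains when the preempt count is
non-zero. The call chain mirrors the trace (ep_call_nested ->
ep_scan_ready_list -> mutex_lock_nested), under the assumption that the
nested-call helper holds the CPU cookie across the callback.

```c
/* Userspace model only. None of these are the kernel's implementations. */
#include <assert.h>
#include <stdbool.h>

int model_preempt_count; /* stands in for the task's preempt count */
int model_bug_reports;   /* times the "sleeping function" check fired */

/* Like get_cpu(): disable preemption, return a CPU id (always 0 here). */
int model_get_cpu(void)
{
    model_preempt_count++;
    return 0;
}

/* Like put_cpu(): re-enable preemption. */
void model_put_cpu(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

bool model_in_atomic(void)
{
    return model_preempt_count != 0;
}

/* Like might_sleep(): record a report if called in atomic context. */
void model_might_sleep(void)
{
    if (model_in_atomic())
        model_bug_reports++;
}

/* A mutex may sleep, so locking it performs the check first. */
void model_mutex_lock(void)
{
    model_might_sleep();
}

typedef void (*model_proc_t)(void);

/* Stand-in for the callback that ends up taking the epoll mutex. */
void model_scan_ready_list(void)
{
    model_mutex_lock();
}

/* Cookie is the CPU: the callback runs with preemption disabled. */
void model_call_nested_cpu_cookie(model_proc_t proc)
{
    int cpu = model_get_cpu();
    (void)cpu; /* the real code would use this as the nesting key */
    proc();
    model_put_cpu();
}

/* Cookie is the task ("current"): the preempt count is left alone. */
void model_call_nested_task_cookie(model_proc_t proc)
{
    proc();
}
```

In this model, only the CPU-cookie variant trips the check, because the
callback runs between model_get_cpu() and model_put_cpu(). That matches
"in_atomic(): 1" together with "no locks held" in the report.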



- Davide

