Message-ID: <1311965462.2351.28.camel@t41.thuisdomein>
Date:	Fri, 29 Jul 2011 20:50:55 +0200
From:	Paul Bolle <pebolle@...cali.nl>
To:	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel@...r.kernel.org,
	Davide Libenzi <davidel@...ilserver.org>,
	Nelson Elhage <nelhage@...lice.com>
Cc:	Linux Kernel <linux-kernel@...r.kernel.org>,
	davidel@...ilserver.org, Dave Jones <davej@...hat.com>
Subject: Re: recursive locking: epoll.
(Sent to the addresses get_maintainer.pl suggested and to Davide and
Nelson, because this is about code they cared about half a year ago.
CC'ed to the addresses involved until now.)
On Thu, 2011-07-21 at 13:55 +0200, Paul Bolle wrote:
> That number turned out to be 722472
> ( https://bugzilla.redhat.com/show_bug.cgi?id=722472 ).
0) This seems to be a lockdep false alarm. The cause is an epoll
instance added to another epoll instance (i.e., nested epoll instances).
Apparently lockdep isn't supplied enough information to determine what's
going on here. There might be a number of ways to fix this, but
after looking at it for quite some time and updating the above
bug report a number of times, I figured that involving people outside
those tracking that report might move things towards a solution.
At least, I wasn't able to find a, well, clean solution.
1) The call chain triggering the warning with the nice
    *** DEADLOCK ***
line can be summarized like this:
sys_epoll_ctl
    mutex_lock                          epmutex
    ep_call_nested
        ep_loop_check_proc
            mutex_lock                      ep->mtx
            mutex_unlock                    ep->mtx
    mutex_lock                              ep->mtx
    ep_eventpoll_poll
        ep_ptable_queue_proc
        ep_call_nested
            ep_poll_readyevents_proc
                ep_scan_ready_list
                    mutex_lock                  ep->mtx
                    ep_read_events_proc
                    mutex_unlock                ep->mtx
    mutex_unlock                            ep->mtx
    mutex_unlock                        epmutex
2) When ep_scan_ready_list() calls mutex_lock(), lockdep notices
recursive locking on ep->mtx. It is not supplied enough information to
determine that the two locks belong to two separate epoll instances (the
outer instance and the nested instance). The solution appears to involve
supplying lockdep that information (i.e., a lockdep annotation).
3) Please see the bugzilla.redhat.com report for further background.
Paul Bolle