Message-ID: <1311965462.2351.28.camel@t41.thuisdomein>
Date:	Fri, 29 Jul 2011 20:50:55 +0200
From:	Paul Bolle <pebolle@...cali.nl>
To:	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel@...r.kernel.org,
	Davide Libenzi <davidel@...ilserver.org>,
	Nelson Elhage <nelhage@...lice.com>
Cc:	Linux Kernel <linux-kernel@...r.kernel.org>,
	davidel@...ilserver.org, Dave Jones <davej@...hat.com>
Subject: Re: recursive locking: epoll.

(Sent to the addresses get_maintainer.pl suggested and to Davide and
Nelson, because this is about code they cared about half a year ago.
CC'ed to the addresses involved until now.)

On Thu, 2011-07-21 at 13:55 +0200, Paul Bolle wrote:
> That number turned out to be 722472
> ( https://bugzilla.redhat.com/show_bug.cgi?id=722472 ).

0) This seems to be a lockdep false alarm. The cause is an epoll
instance added to another epoll instance (i.e., nesting epoll instances).
Apparently lockdep isn't supplied enough information to determine what's
going on here. Now there might be a number of ways to fix this. But
after having looked at this for quite some time and updating the above
bug report a number of times, I guessed that involving people outside
those tracking that report might move things forward towards a solution.
At least, I wasn't able to find a, well, clean solution.
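
The nesting described in 0) can be set up from userspace with ordinary epoll calls. What follows is a minimal sketch, assuming a Linux system. The helper name nest_epoll() is hypothetical and does not come from the bug report. The sketch only shows how one epoll instance ends up inside another; it does not establish whether this alone reproduces the warning, which in any case needs a kernel built with lockdep enabled.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the bug report): create two epoll
 * instances and add the "inner" one to the "outer" one.
 * On success returns 0 and hands both descriptors to the caller,
 * who must close them. On failure returns -1 and closes whatever
 * was opened.
 */
int nest_epoll(int *outer_out, int *inner_out)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int outer, inner;

    assert(outer_out != NULL && inner_out != NULL);

    outer = epoll_create1(0);
    inner = epoll_create1(0);
    if (outer < 0 || inner < 0)
        goto fail;

    /* The inner epoll fd is watched like any other file descriptor. */
    ev.data.fd = inner;
    if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) < 0)
        goto fail;

    *outer_out = outer;
    *inner_out = inner;
    return 0;

fail:
    if (outer >= 0)
        close(outer);
    if (inner >= 0)
        close(inner);
    return -1;
}
```

The EPOLL_CTL_ADD call is the one that enters sys_epoll_ctl with an epoll file as its target, which is the entry point of the call chain summarized in 1).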

1) The call chain triggering the warning with the nice
    *** DEADLOCK ***

line can be summarized like this:

sys_epoll_ctl
    mutex_lock                          epmutex
    ep_call_nested
        ep_loop_check_proc
            mutex_lock                      ep->mtx
            mutex_unlock                    ep->mtx
    mutex_lock                              ep->mtx
    ep_eventpoll_poll
        ep_ptable_queue_proc
        ep_call_nested
            ep_poll_readyevents_proc
                ep_scan_ready_list
                    mutex_lock                  ep->mtx
                    ep_read_events_proc
                    mutex_unlock                ep->mtx
    mutex_unlock                            ep->mtx
    mutex_unlock                        epmutex

2) When ep_scan_ready_list() calls mutex_lock(), lockdep notices
recursive locking on ep->mtx. It is not supplied enough information to
determine that the lock is related to two separate epoll instances (the
outer instance and the nested instance). The solution appears to involve
supplying lockdep that information (i.e., "lockdep annotation").
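
To illustrate why extra information helps, here is a toy userspace model of the check described in 2). It is NOT kernel code and not lockdep's implementation. All names in it (toy_acquire, toy_release, toy_reset, EP_MTX_CLASS) are made up for this sketch. It assumes the simplified rule that a lock is identified only by its class and a subclass number, so two different ep->mtx instances look identical unless the caller passes different subclasses. In the kernel, the primitive for passing a subclass is mutex_lock_nested(lock, subclass); which subclass values would be right for the epoll code is not something this sketch settles.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every ep->mtx belongs to this one (made-up) class. */
#define EP_MTX_CLASS 1

enum { MAX_HELD = 8 };

struct held_lock {
    int class_id;   /* which kind of lock, not which instance */
    int subclass;   /* nesting level supplied by the caller */
};

static struct held_lock held[MAX_HELD];
static size_t nheld;

/*
 * Returns true if the acquisition is accepted, false if the toy
 * checker would report recursive locking, i.e. a lock with the same
 * class and subclass is already held.
 */
bool toy_acquire(int class_id, int subclass)
{
    for (size_t i = 0; i < nheld; i++) {
        if (held[i].class_id == class_id && held[i].subclass == subclass)
            return false;
    }
    assert(nheld < MAX_HELD);
    held[nheld].class_id = class_id;
    held[nheld].subclass = subclass;
    nheld++;
    return true;
}

/* Drop the most recently acquired lock. */
void toy_release(void)
{
    assert(nheld > 0);
    nheld--;
}

/* Forget all held locks. */
void toy_reset(void)
{
    nheld = 0;
}
```

In this model, taking EP_MTX_CLASS with subclass 0 twice is rejected, which mirrors the outer and nested ep->mtx being indistinguishable. Taking it with subclass 0 and then subclass 1 is accepted, which is the effect an annotation is meant to have.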

3) Please see the bugzilla.redhat.com report for further background.


Paul Bolle
