[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1480213570.18162.31.camel@edumazet-glaptop3.roam.corp.google.com>
Date: Sat, 26 Nov 2016 18:26:10 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: subashab@...eaurora.org, Thomas Graf <tgraf@...g.ch>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: Crash due to mutex genl_lock called from RCU context
On Sat, 2016-11-26 at 18:08 -0800, Cong Wang wrote:
> On Fri, Nov 25, 2016 at 8:54 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> >
> > Oh well, this wont work, since sk->sk_destruct will be called from RCU
> > callback.
> >
> > Grabbing the mutex should not be done from netlink_sock_destruct() but
> > from netlink_release()
>
> But you also change the behavior of cb.done(), currently it is called when the
> last sock ref is gone, with your patch it is called when the first
> sock is closed.
No. It will be called when last refcount on the socket is released,
sk_refcnt transitions to final 0.
My patch changes the sock_hold() to the variant that makes sure
sk_refcnt is not 0 before increase, otherwise a race can happen and
release could be called twice.
Classic refcounting stuff coupled to rcu rules.
> No?
Are you telling me inet_release() is called when we close() the first
file descriptor ?
fd1 = socket()
fd2 = dup(fd1);
close(fd2) -> release() ???
>
> I don't see why we need to get genl_lock in ->done() here, because we are
> already the last sock using it and module ref protects the ops from being
> removed via module, seems we are pretty safe without any lock.
Well, at least this exposes a real bug in Thomas patch.
Removing the lock might be done for net-next, not stable branches.
Powered by blists - more mailing lists