netdev - Re: [PATCH] netlink: don't send NETLINK

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Date:	Sat, 16 Apr 2016 23:42:27 +0200
From:	Johannes Berg <johannes@...solutions.net>
To:	Herbert Xu <herbert@...dor.apana.org.au>
Cc:	netdev@...r.kernel.org, dmitrijs.ivanovs@...t.com,
	linux-wireless@...r.kernel.org
Subject: Re: [PATCH] netlink: don't send NETLINK_URELEASE for unbound sockets

On Sat, 2016-04-16 at 14:30 +0800, Herbert Xu wrote:
> Johannes Berg <johannes@...solutions.net> wrote:
> > 
> > 
> > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> > index 215fc08c02ab..330ebd600f25 100644
> > --- a/net/netlink/af_netlink.c
> > +++ b/net/netlink/af_netlink.c
> > @@ -688,7 +688,7 @@ static int netlink_release(struct socket *sock)
> > 
> >        skb_queue_purge(&sk->sk_write_queue);
> > 
> > -       if (nlk->portid) {
> > +       if (nlk->portid && nlk->bound) {
> Any reason why we're still testing portid at all? Isn't testing
> bound enough?

The reason is that while I felt confident in understanding the problem
and its solution, I didn't realize that bound follows portid, and that,
for example, a kernel socket can't get bound. :-)

But yeah - looking at the code after your comment does show that
testing portid is pointless now.

In fact, now that I look at it, I wonder how this situation came about?

In the current code, you can only trigger the problematic situation
through an error condition, afaict - you have to try to bind a socket
with the same portid as an already existing one, and __netlink_insert()
will then fail, but netlink_insert() won't reset the nlk_sk(sk)->portid 
since your commit da314c9923fe ("netlink: Replace rhash_portid with
bound"), even though you had previously fixed precisely this issue
already - commit c0bb07df7d98 ("netlink: Reset portid after
netlink_insert failure"). You then close the socket, and since portid
is assigned, that will trigger NETLINK_URELEASE.

So before the rhash_portid -> bound change, this doesn't seem to have
been a problem, and I guess we didn't get the "stuck in limbo" problem
back because we now check bound everywhere relevant, for purposes of
calling netlink_autobind.

IOW - I'm now even more convinced that the patch is correct, and I'm
also convinced that until fairly recently (da314c9923fe) this wasn't
even a problem.

Maybe we can take the opportunity of removing the "portid" check to put
most of the above into a commit message, but I think you should review
it first.

johannes