Message-ID: <59657502-5154-a2ff-ab5f-a432b217f9d6@akamai.com>
Date: Wed, 27 Feb 2019 11:45:40 -0500
From: Jason Baron <jbaron@...mai.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: Rainer Weikusat <rweikusat@...ktalk.net>, netdev@...r.kernel.org
Subject: Re: [RFC] nasty corner case in unix_dgram_sendmsg()
On 2/26/19 6:59 PM, Al Viro wrote:
> On Tue, Feb 26, 2019 at 03:35:39PM -0500, Jason Baron wrote:
>
>>> I understand what the unix_dgram_peer_wake_me() is doing; I understand
>>> what unix_dgram_poll() is using it for. What I do not understand is
>>> what's the point of doing that in unix_dgram_sendmsg()...
>>>
>>
>> Hi,
>>
>> So the unix_dgram_peer_wake_me() in unix_dgram_sendmsg() is there for
>> epoll in edge-triggered mode. In that case, we want to ensure that if
>> -EAGAIN is returned a subsequent epoll_wait() is not stuck indefinitely.
>> Probably could use a comment...
>
> *owwww*
>
> Let me see if I've got it straight - you want the forwarding rearmed,
> so that it would match the behaviour of ep_poll_callback() (i.e.
> removing only when POLLFREE is passed)? Looks like an odd way to
> do it, if that's what's happening...
If unix_dgram_sendmsg() returns -EAGAIN in this case, then a subsequent call
to poll()/select()/epoll_wait() is normally going to rearm the forwarding
via unix_dgram_poll() (unless the socket is already writable). However, in the
special case of epoll in edge-triggered mode, the call to epoll_wait() does not
call unix_dgram_poll(), and thus the rearm has to happen in
unix_dgram_sendmsg().
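
To make the edge-triggered case concrete, here is a rough userspace sketch
(not kernel code, and not taken from any patch). It registers the sender with
EPOLLOUT|EPOLLET, consumes the initial edge, sends until -EAGAIN, lets the
receiver take one datagram, and then checks that epoll_wait() reports
EPOLLOUT rather than timing out. The function name, the abstract socket name,
the 1024-byte payload and the one-second timeout are all made up for
illustration. It only shows the user-visible contract; it cannot tell which
kernel path delivered the wakeup.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (name is an assumption, Linux only).
 * Returns 1 if EPOLLOUT is reported after -EAGAIN plus one recv(),
 * 0 if epoll_wait() timed out, -1 on a setup error.
 */
int et_wakeup_after_eagain(void)
{
    struct sockaddr_un addr;
    struct epoll_event ev, got;
    char buf[1024];
    socklen_t len;
    int rx = -1, tx = -1, ep = -1, i, n, result = -1;

    /* Abstract address: sun_path[0] stays '\0', so nothing to unlink. */
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    n = snprintf(addr.sun_path + 1, sizeof(addr.sun_path) - 1,
                 "et-demo-%ld", (long)getpid());
    len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);

    /* The receiver is bound but NOT connected back to the sender. */
    rx = socket(AF_UNIX, SOCK_DGRAM, 0);
    tx = socket(AF_UNIX, SOCK_DGRAM | SOCK_NONBLOCK, 0);
    ep = epoll_create1(0);
    if (rx < 0 || tx < 0 || ep < 0)
        goto done;
    if (bind(rx, (struct sockaddr *)&addr, len) < 0 ||
        connect(tx, (struct sockaddr *)&addr, len) < 0)
        goto done;

    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLOUT | EPOLLET;
    ev.data.fd = tx;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, tx, &ev) < 0)
        goto done;

    /* Consume the initial "writable" edge so only a new wakeup counts. */
    if (epoll_wait(ep, &got, 1, 0) < 0)
        goto done;

    /* Fill the receiver's queue until the non-blocking send fails. */
    memset(buf, 'x', sizeof(buf));
    for (i = 0; i < 100000; i++) {
        if (send(tx, buf, sizeof(buf), 0) < 0)
            break;
    }
    if (i == 100000 || (errno != EAGAIN && errno != EWOULDBLOCK))
        goto done;

    /* Make room for exactly one more datagram. */
    if (recv(rx, buf, sizeof(buf), 0) < 0)
        goto done;

    /* Edge-triggered: this must not block for the full timeout. */
    n = epoll_wait(ep, &got, 1, 1000);
    if (n < 0)
        goto done;
    result = (n == 1 && (got.events & EPOLLOUT)) ? 1 : 0;

done:
    if (ep >= 0)
        close(ep);
    if (tx >= 0)
        close(tx);
    if (rx >= 0)
        close(rx);
    return result;
}
```

The receiver is deliberately left unconnected. With socketpair() the two ends
are connected to each other and, as far as I can tell, the unix_recvq_full()
check is skipped for that case, so the sketch would exercise a different
limit.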
>
> While we are at it, why disarm a forwarder upon noticing that peer
> is dead? Wouldn't it be simpler to move that
> wake_up_interruptible_all(&u->peer_wait);
> in unix_release_sock() to just before
> unix_state_unlock(sk);
> a line prior? Then anyone seeing SOCK_DEAD on (locked) peer
> would be guaranteed that all forwarders are gone...
>
The condition we are checking here is unix_recvq_full(), so even if
the wakeup happens under the lock, we could end up waking a waiter
that still sees unix_recvq_full(), because the skbs aren't
freed until *after* the wakeup call. The race is described here:
51f7e95 ("af_unix: ensure POLLOUT on remote close() for connected dgram socket")
Note that I did have an earlier version of that patch that moved
the wakeup call (instead of checking for SOCK_DEAD), see:
https://patchwork.ozlabs.org/patch/944593/
However, I thought that the explicit check for SOCK_DEAD made the
intent clearer, i.e. we don't wait on a SOCK_DEAD socket.
> Another fun question about the same dgram sendmsg:
> if (unix_peer(sk) == other) {
> unix_peer(sk) = NULL;
> unix_dgram_peer_wake_disconnect_wakeup(sk, other);
>
> unix_state_unlock(sk);
>
> unix_dgram_disconnected(sk, other);
>
> ... and we are not holding any locks at the last line. What happens
> if we have thread A doing
> decide which address to talk to
> connect(fd, that address)
> send request over fd (with send(2) or write(2))
> read reply from fd (recv(2) or read(2))
> in a loop, with thread B doing explicit sendto(2) over the same
> socket?
>
> Suppose B happens to send to the last server thread A was talking
> to and finds it just closed (e.g. because the last request from
> A had been "shut down", which server has honoured). B gets ECONNREFUSED,
> as it ought to, but it can also end up disrupting the next exchange
> of A.
>
> Shouldn't we rather extract the skbs from that queue *before*
> dropping sk->lock? E.g. move them to a temporary queue, and flush
> that queue after we'd unlocked sk...
>
If I understand your concern, B drops the lock as above, then
A does a connect() to somewhere else, and then B drops skbs that came
from the new peer. Looks plausible. I think in general A and B would probably
be coordinating if they are both reading/writing the same socket,
but it probably would make sense to fix this case. Note that
unix_dgram_disconnected() is also called in unix_dgram_connect() after
the lock is dropped, so that would need a similar fix.
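
For what it's worth, the "move them to a temporary queue, then flush after
unlock" idea can be sketched in plain userspace C as below. This is only a
model of the locking pattern, not a proposed patch: struct demo_sock, the
queue helpers and demo_disconnect() are all invented names, and a pthread
mutex stands in for the socket state lock. In the kernel the analogous
helpers would presumably be skb_queue_splice_init() and skb_queue_purge().

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an skb and an skb queue. */
struct msg {
    struct msg *next;
    int from_peer;              /* which peer queued this message */
};

struct msg_queue {
    struct msg *head;
    struct msg **tail;          /* points at the last 'next' slot */
};

struct demo_sock {
    pthread_mutex_t lock;       /* models the socket state lock */
    int peer;                   /* 0 means "not connected" */
    struct msg_queue rcvq;
};

void queue_init(struct msg_queue *q)
{
    q->head = NULL;
    q->tail = &q->head;
}

int queue_push(struct msg_queue *q, int from_peer)
{
    struct msg *m = malloc(sizeof(*m));

    if (!m)
        return -1;
    m->next = NULL;
    m->from_peer = from_peer;
    *q->tail = m;
    q->tail = &m->next;
    return 0;
}

/* Move everything from src to dst in O(1) and leave src empty. */
void queue_splice_init(struct msg_queue *src, struct msg_queue *dst)
{
    queue_init(dst);
    if (src->head) {
        dst->head = src->head;
        dst->tail = src->tail;
    }
    queue_init(src);
}

/* Free every message in q; returns how many were dropped. */
int queue_purge(struct msg_queue *q)
{
    int n = 0;

    while (q->head) {
        struct msg *m = q->head;

        q->head = m->next;
        free(m);
        n++;
    }
    queue_init(q);
    return n;
}

/*
 * Drop the connection to old_peer. The messages to discard are chosen
 * while the lock is held; only the freeing happens after the unlock,
 * so anything queued by a later connect() is left alone.
 */
int demo_disconnect(struct demo_sock *sk, int old_peer)
{
    struct msg_queue tmp;

    queue_init(&tmp);
    pthread_mutex_lock(&sk->lock);
    if (sk->peer == old_peer) {
        sk->peer = 0;
        queue_splice_init(&sk->rcvq, &tmp);
    }
    pthread_mutex_unlock(&sk->lock);

    return queue_purge(&tmp);
}
```

The point of the split is that the decision about which messages belong to
the old peer is made atomically with clearing the peer, so a stale
disconnect that runs after a reconnect finds nothing to drop.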
Thanks,
-Jason