[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y9Bcbce4AuHqS/uf@grain>
Date: Wed, 25 Jan 2023 01:32:13 +0300
From: Cyrill Gorcunov <gorcunov@...il.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: tkhai@...ru, davem@...emloft.net, edumazet@...gle.com,
kuba@...nel.org, netdev@...r.kernel.org, pabeni@...hat.com
Subject: Re: [PATCH net-next] unix: Guarantee sk_state relevance in case of
it was assigned by a task on other cpu
On Tue, Jan 24, 2023 at 09:57:12AM -0800, Kuniyuki Iwashima wrote:
> From: Kirill Tkhai <tkhai@...ru>
> Date: Mon, 23 Jan 2023 01:21:20 +0300
> > Some functions use unlocked check for sock::sk_state. This does not guarantee
> > a new value assigned to sk_state on some CPU is already visible on this CPU.
> >
> > Example:
> >
> > [CPU0:Task0] [CPU1:Task1]
> > unix_listen()
> > unix_state_lock(sk);
> > sk->sk_state = TCP_LISTEN;
> > unix_state_unlock(sk);
> > unix_accept()
> > if (sk->sk_state != TCP_LISTEN) /* not visible */
> > goto out; /* return error */
> >
> > Task1 may miss new sk->sk_state value, and then unix_accept() returns error.
> > Since in this situation unix_accept() is called chronologically later, such
> > behavior is not obvious and it is wrong.
>
> Have you seen this on a real workload ?
>
> It sounds like a userspace bug that accept() is called on a different
> CPU before listen() returns. At least, accept() is fetching sk at the
I must confess I don't get why accept() can't be called on different cpu
while listen() is in progress. As far as I see there is a small race window
which of course not critical at all since in worst case we simply report an
error back to userspace, still a nit worth to fix.
> same time, then I think there should be no guarantee that sk_state is
> TCP_LISTEN.
>
> Same for other TCP_ESTABLISHED tests, it seems a program is calling
> sendmsg()/recvmsg() when sk is still TCP_CLOSE and betting concurrent
> connect() will finish earlier.
Powered by blists - more mailing lists