Message-ID: <CANn89i+gGYpCojJ8=zzZfpwTWhPF6T+rWqstu1D1rGLjoaa-xQ@mail.gmail.com>
Date: Mon, 20 May 2024 16:49:12 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
Christoph Paasch <cpaasch@...le.com>
Subject: Re: [PATCH net] tcp: ensure sk_shutdown is 0 for listening sockets
On Mon, May 20, 2024 at 4:46 PM Paolo Abeni <pabeni@...hat.com> wrote:
>
> Hi,
>
> On Mon, 2024-05-20 at 16:07 +0200, Eric Dumazet wrote:
> > On Mon, May 20, 2024 at 3:46 PM Eric Dumazet <edumazet@...gle.com> wrote:
> > >
> > > On Mon, May 20, 2024 at 12:05 PM Paolo Abeni <pabeni@...hat.com> wrote:
> > > >
> > > > Christoph reported the following splat:
> > > >
> > > > WARNING: CPU: 1 PID: 772 at net/ipv4/af_inet.c:761 __inet_accept+0x1f4/0x4a0
> > > > Modules linked in:
> > > > CPU: 1 PID: 772 Comm: syz-executor510 Not tainted 6.9.0-rc7-g7da7119fe22b #56
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> > > > RIP: 0010:__inet_accept+0x1f4/0x4a0 net/ipv4/af_inet.c:759
> > > > Code: 04 38 84 c0 0f 85 87 00 00 00 41 c7 04 24 03 00 00 00 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ec b7 da fd <0f> 0b e9 7f fe ff ff e8 e0 b7 da fd 0f 0b e9 fe fe ff ff 89 d9 80
> > > > RSP: 0018:ffffc90000c2fc58 EFLAGS: 00010293
> > > > RAX: ffffffff836bdd14 RBX: 0000000000000000 RCX: ffff888104668000
> > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > > > RBP: dffffc0000000000 R08: ffffffff836bdb89 R09: fffff52000185f64
> > > > R10: dffffc0000000000 R11: fffff52000185f64 R12: dffffc0000000000
> > > > R13: 1ffff92000185f98 R14: ffff88810754d880 R15: ffff8881007b7800
> > > > FS: 000000001c772880(0000) GS:ffff88811b280000(0000) knlGS:0000000000000000
> > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 00007fb9fcf2e178 CR3: 00000001045d2002 CR4: 0000000000770ef0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > PKRU: 55555554
> > > > Call Trace:
> > > > <TASK>
> > > > inet_accept+0x138/0x1d0 net/ipv4/af_inet.c:786
> > > > do_accept+0x435/0x620 net/socket.c:1929
> > > > __sys_accept4_file net/socket.c:1969 [inline]
> > > > __sys_accept4+0x9b/0x110 net/socket.c:1999
> > > > __do_sys_accept net/socket.c:2016 [inline]
> > > > __se_sys_accept net/socket.c:2013 [inline]
> > > > __x64_sys_accept+0x7d/0x90 net/socket.c:2013
> > > > do_syscall_x64 arch/x86/entry/common.c:52 [inline]
> > > > do_syscall_64+0x58/0x100 arch/x86/entry/common.c:83
> > > > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > > RIP: 0033:0x4315f9
> > > > Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 ab b4 fd ff c3 66 2e 0f 1f 84 00 00 00 00
> > > > RSP: 002b:00007ffdb26d9c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002b
> > > > RAX: ffffffffffffffda RBX: 0000000000400300 RCX: 00000000004315f9
> > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
> > > > RBP: 00000000006e1018 R08: 0000000000400300 R09: 0000000000400300
> > > > R10: 0000000000400300 R11: 0000000000000246 R12: 0000000000000000
> > > > R13: 000000000040cdf0 R14: 000000000040ce80 R15: 0000000000000055
> > > > </TASK>
> > > >
> > > > Listener sockets are supposed to have a zero sk_shutdown, as the
> > > > accepted children inherit that field.
> > > >
> > > > Invoking shutdown() before entering the listener status allows
> > > > violating the above constraint.
> > > >
> > > > After commit 94062790aedb ("tcp: defer shutdown(SEND_SHUTDOWN) for
> > > > TCP_SYN_RECV sockets"), the above causes the child to reach the accept
> > > > syscall in FIN_WAIT1 status.
> > > >
> > > > Address the issue explicitly by clearing sk_shutdown at listen time.
> > > >
> > > > Reported-by: Christoph Paasch <cpaasch@...le.com>
> > > > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/490
> > > > Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> > > > Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> > > > ---
> > > > Note: the issue above reports an MPTCP reproducer, but I can reproduce
> > > > the issue even using plain TCP sockets only.
> > > > ---
> > > > net/ipv4/inet_connection_sock.c | 2 ++
> > > > 1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
> > > > index 3b38610958ee..dab723fea0cc 100644
> > > > --- a/net/ipv4/inet_connection_sock.c
> > > > +++ b/net/ipv4/inet_connection_sock.c
> > > > @@ -1269,6 +1269,8 @@ int inet_csk_listen_start(struct sock *sk)
> > > >
> > > > reqsk_queue_alloc(&icsk->icsk_accept_queue);
> > > >
> > > > + /* closed sockets can have non zero sk_shutdown */
> > > > + WRITE_ONCE(sk->sk_shutdown, 0);
> > >
> > > Hi Paolo.
> > >
> > > I am unsure about your patch. I had an internal syzbot report about
> > > this before going OOO for a few days,
> > > and my first reaction was to change the WARN in inet_accept().
> > >
> > > Perhaps some applications are relying on calling shutdown() before listen()...
>
> Uhmm, right I did not consider that a non zero sk_shutdown would have
> affected recvmsg() and sendmsg() even prior to 94062790aedb ("tcp:
> defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets").
>
> > BTW the syzbot repro was
> >
> > r0 = socket$inet6_tcp(0xa, 0x1, 0x0)
> > sendto$inet6(0xffffffffffffffff, 0x0, 0x0, 0x20000004, 0x0, 0x0)
> > shutdown(r0, 0x1)
> > bind$inet6(r0, &(0x7f0000000040)={0xa, 0x4e22, 0x0, @empty}, 0x1c)
> > listen(r0, 0x0)
> > r1 = socket$inet_mptcp(0x2, 0x1, 0x106)
> > connect$inet(r1, &(0x7f0000000000)={0x2, 0x4e22, @local}, 0x10)
> > accept(r0, 0x0, 0x0)
>
> The above is very similar to what Christoph reported. It should splat
> even replacing 0x106 with 0 (mptcp -> tcp).
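For readers who want to try the plain TCP variant from userspace, here is a minimal sketch of the sequence above, assuming IPv4 loopback is available. The function name, the use of an ephemeral port instead of 0x4e22, and the 2 second timeouts are my own additions for illustration, not part of the syzbot program. It only checks that accept() returns a child; whether the kernel also logs the WARN depends on the kernel version and has to be checked in dmesg.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if accept() hands back a child, -1 otherwise. */
int shutdown_then_listen_accept(void)
{
	struct timeval tmo = { .tv_sec = 2, .tv_usec = 0 };
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int lfd, cfd, afd = -1, ret = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0 || cfd < 0)
		goto out;

	/* Bound accept() and connect() so the sketch cannot hang. */
	setsockopt(lfd, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo));
	setsockopt(cfd, SOL_SOCKET, SO_SNDTIMEO, &tmo, sizeof(tmo));

	/*
	 * shutdown() before listen(): the call is expected to fail with
	 * ENOTCONN on a closed socket, so the return value is ignored.
	 * As I read inet_shutdown(), sk_shutdown is still updated.
	 */
	shutdown(lfd, SHUT_WR);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */

	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (listen(lfd, 0) < 0)
		goto out;
	/* Learn which port was assigned so the client can reach it. */
	if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0)
		goto out;
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	afd = accept(lfd, NULL, NULL);
	if (afd >= 0)
		ret = 0;
out:
	if (afd >= 0)
		close(afd);
	if (cfd >= 0)
		close(cfd);
	if (lfd >= 0)
		close(lfd);
	return ret;
}
```

Calling shutdown_then_listen_accept() from a small main() and then inspecting dmesg should be enough to see whether a given kernel still warns.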
>
> I'm fine with relaxing the check in __inet_accept(). Do you prefer to
> send the patch yourself, or should I send a v2? The condition should be
>
> WARN_ON(!((1 << newsk->sk_state) &
> (TCPF_ESTABLISHED | TCPF_SYN_RECV |
> TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 |
> TCPF_CLOSING | TCPF_CLOSE_WAIT |
> TCPF_CLOSE)));
>
> I guess.
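To make the proposed condition easier to reason about outside the kernel, here is a standalone sketch of the same bitmask test. The enum and helper names are hypothetical; the numeric values are meant to mirror include/net/tcp_states.h, which is an assumption worth double checking against your tree. This is not the final patch, only an illustration of which states the relaxed check would let through.

```c
/* Userspace stand-ins for the kernel TCP_* states (values assumed to match). */
enum sketch_tcp_state {
	ST_ESTABLISHED = 1,
	ST_SYN_SENT,
	ST_SYN_RECV,
	ST_FIN_WAIT1,
	ST_FIN_WAIT2,
	ST_TIME_WAIT,
	ST_CLOSE,
	ST_CLOSE_WAIT,
	ST_LAST_ACK,
	ST_LISTEN,
	ST_CLOSING,
};

/* Same idea as the kernel TCPF_* flags: one bit per state. */
#define STF(state) (1U << (state))

/* States the relaxed check proposed above would accept without warning. */
#define SKETCH_ACCEPT_OK_MASK				\
	(STF(ST_ESTABLISHED) | STF(ST_SYN_RECV) |	\
	 STF(ST_FIN_WAIT1) | STF(ST_FIN_WAIT2) |	\
	 STF(ST_CLOSING) | STF(ST_CLOSE_WAIT) |		\
	 STF(ST_CLOSE))

/* Returns nonzero when the state would not trigger the WARN. */
int sketch_accept_state_ok(int state)
{
	return (STF(state) & SKETCH_ACCEPT_OK_MASK) != 0;
}
```

With this mask a child in FIN_WAIT1, as in the splat above, passes the check, while states such as LISTEN, SYN_SENT, TIME_WAIT and LAST_ACK would still warn.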
>
> Thanks!
>
> Paolo
>
>
>