lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZsaLFVB0HyQfXBXy@pop-os.localdomain>
Date: Wed, 21 Aug 2024 17:49:25 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Simon Horman <horms@...nel.org>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
	Cong Wang <cong.wang@...edance.com>,
	syzbot+58c03971700330ce14d8@...kaller.appspotmail.com,
	John Fastabend <john.fastabend@...il.com>,
	Jakub Sitnicki <jakub@...udflare.com>
Subject: Re: [Patch bpf] tcp_bpf: fix return value of tcp_bpf_sendmsg()

On Wed, Aug 21, 2024 at 03:55:33PM +0100, Simon Horman wrote:
> On Tue, Aug 20, 2024 at 08:07:44PM -0700, Cong Wang wrote:
> > From: Cong Wang <cong.wang@...edance.com>
> > 
> > When we cork messages in psock->cork, the last message triggers the
> > flushing will result in sending a sk_msg larger than the current
> > message size. In this case, in tcp_bpf_send_verdict(), 'copied' becomes
> > negative at least in the following case:
> > 
> > 468         case __SK_DROP:
> > 469         default:
> > 470                 sk_msg_free_partial(sk, msg, tosend);
> > 471                 sk_msg_apply_bytes(psock, tosend);
> > 472                 *copied -= (tosend + delta); // <==== HERE
> > 473                 return -EACCES;
> > 
> > Therefore, it could lead to the following BUG with a proper value of
> > 'copied' (thanks to syzbot). We should not use negative 'copied' as a
> > return value here.
> > 
> >   ------------[ cut here ]------------
> >   kernel BUG at net/socket.c:733!
> >   Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
> >   Modules linked in:
> >   CPU: 0 UID: 0 PID: 3265 Comm: syz-executor510 Not tainted 6.11.0-rc3-syzkaller-00060-gd07b43284ab3 #0
> >   Hardware name: linux,dummy-virt (DT)
> >   pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> >   pc : sock_sendmsg_nosec net/socket.c:733 [inline]
> >   pc : sock_sendmsg_nosec net/socket.c:728 [inline]
> >   pc : __sock_sendmsg+0x5c/0x60 net/socket.c:745
> >   lr : sock_sendmsg_nosec net/socket.c:730 [inline]
> >   lr : __sock_sendmsg+0x54/0x60 net/socket.c:745
> >   sp : ffff800088ea3b30
> >   x29: ffff800088ea3b30 x28: fbf00000062bc900 x27: 0000000000000000
> >   x26: ffff800088ea3bc0 x25: ffff800088ea3bc0 x24: 0000000000000000
> >   x23: f9f00000048dc000 x22: 0000000000000000 x21: ffff800088ea3d90
> >   x20: f9f00000048dc000 x19: ffff800088ea3d90 x18: 0000000000000001
> >   x17: 0000000000000000 x16: 0000000000000000 x15: 000000002002ffaf
> >   x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> >   x11: 0000000000000000 x10: ffff8000815849c0 x9 : ffff8000815b49c0
> >   x8 : 0000000000000000 x7 : 000000000000003f x6 : 0000000000000000
> >   x5 : 00000000000007e0 x4 : fff07ffffd239000 x3 : fbf00000062bc900
> >   x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000fffffdef
> >   Call trace:
> >    sock_sendmsg_nosec net/socket.c:733 [inline]
> >    __sock_sendmsg+0x5c/0x60 net/socket.c:745
> >    ____sys_sendmsg+0x274/0x2ac net/socket.c:2597
> >    ___sys_sendmsg+0xac/0x100 net/socket.c:2651
> >    __sys_sendmsg+0x84/0xe0 net/socket.c:2680
> >    __do_sys_sendmsg net/socket.c:2689 [inline]
> >    __se_sys_sendmsg net/socket.c:2687 [inline]
> >    __arm64_sys_sendmsg+0x24/0x30 net/socket.c:2687
> >    __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
> >    invoke_syscall+0x48/0x110 arch/arm64/kernel/syscall.c:49
> >    el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:132
> >    do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:151
> >    el0_svc+0x34/0xec arch/arm64/kernel/entry-common.c:712
> >    el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730
> >    el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598
> >   Code: f9404463 d63f0060 3108441f 54fffe81 (d4210000)
> >   ---[ end trace 0000000000000000 ]---
> > 
> > Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
> > Reported-by: syzbot+58c03971700330ce14d8@...kaller.appspotmail.com
> > Cc: John Fastabend <john.fastabend@...il.com>
> > Cc: Jakub Sitnicki <jakub@...udflare.com>
> > Signed-off-by: Cong Wang <cong.wang@...edance.com>
> > ---
> >  net/ipv4/tcp_bpf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> > index 53b0d62fd2c2..fe6178715ba0 100644
> > --- a/net/ipv4/tcp_bpf.c
> > +++ b/net/ipv4/tcp_bpf.c
> > @@ -577,7 +577,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> >  		err = sk_stream_error(sk, msg->msg_flags, err);
> >  	release_sock(sk);
> >  	sk_psock_put(sk, psock);
> > -	return copied ? copied : err;
> > +	return copied > 0 ? copied : err;
> 
> Does it make more sense to make the condition err:
> is err 0 iif everything is ok? (completely untested!)

Mind to elaborate?

>From my point of view, 'copied' is to handle partial transmission, for
example:

0. User wants to send 2 * 1K bytes with sendmsg()
1. Kernel already sent the first 1K successfully
2. Kernel got some error when sending the 2nd 1K

In this scenario, we should return 1K instead of the error to the caller to
indicate this partial transmission situation, otherwise we could not
distinguish it with a compete failure (that is, 0 byte sent).

Do I miss anything?

Thanks.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ