lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <g7y5kd2lkzkcklixeuyhq5rf6ijruicizcilakjl5uueb7ye2b@xu2kkhwbgv5w>
Date: Sat, 25 Jan 2025 14:51:49 +0800
From: Jiayuan Chen <mrpre@....com>
To: John Fastabend <john.fastabend@...il.com>
Cc: bpf@...r.kernel.org, borisp@...dia.com, kuba@...nel.org, 
	davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com, horms@...nel.org, 
	andrii@...nel.org, eddyz87@...il.com, mykolal@...com, ast@...nel.org, 
	daniel@...earbox.net, martin.lau@...ux.dev, song@...nel.org, yonghong.song@...ux.dev, 
	kpsingh@...nel.org, sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, 
	shuah@...nel.org, netdev@...r.kernel.org
Subject: Re: [PATCH bpf v1 1/2] bpf: fix ktls panic

On Fri, Jan 24, 2025 at 09:24:48PM -0800, John Fastabend wrote:
> On 2025-01-24 01:15:51, Jiayuan Chen wrote:
> > [ 2172.936997] ------------[ cut here ]------------
> > [ 2172.936999] kernel BUG at lib/iov_iter.c:629!
> > ......
> > pointless and can directly go to zero-copy logic.
> > 
> > 2. Suppose sg.size is initially 5, and we push it to 100, setting
> > apply_bytes to 7. Then, 98 bytes of data are sent out, leaving 2 bytes to
> > be processed. The rollback logic cannot determine which data has been
> > processed and which hasn't.
> 
> This is the error path we are talking about correct?
> 
>         if (msg->cork_bytes && msg->cork_bytes > msg->sg.size &&
>             !enospc && !full_record) {
>                 err = -ENOSPC;
>                 goto out_err;
>         }
> 
yes, it its.
> > 
> > diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> > index 7bcc9b4408a2..b3cae4dd4f49 100644
> > --- a/net/tls/tls_sw.c
> > +++ b/net/tls/tls_sw.c
> > @@ -1120,9 +1120,13 @@ static int tls_sw_sendmsg_locked(struct sock *sk, struct msghdr *msg,
> >  					num_async++;
> >  				else if (ret == -ENOMEM)
> >  					goto wait_for_memory;
> > -				else if (ctx->open_rec && ret == -ENOSPC)
> > +				else if (ctx->open_rec && ret == -ENOSPC) {
> > +					if (msg_pl->cork_bytes) {
> > +						ret = 0;
> > +						goto send_end;
> > +					}
> 
> The app will lose bytes here I suspect if we return copied == try_to_copy then
> no error makes it to the user?

I looked into the corking logic for non-TLS sockets in tcp_bpf_sendmsg,
and I found that when a "cork" situation occurs, the user-space send
doesn't return an error, and the returned length is the same as the input
length parameter, even if some data is cached. I think TLS should also
behave similarly.

Additionally, I saw that the current non-zero-copy logic for handling
corking is written as:
'''
line 1177

else if (ret != -EAGAIN) {
	if (ret == -ENOSPC)
		ret = 0;
	goto send_end;
}
'''

Meanwhile, I set cork_bytes to 1 and tested the following behavior logic:
'''
send(msg, 1);
send(msg+1, 1);
send(msg+2, remain_length);
'''
Both the sender and receiver seem to be working normally both for TLS and
non-TLS sockets.
 
> Could we return delta from bpf_exec_tx_verdict and then we can calculate
> the correct number of bytes to revert? I'll need to check but its not
> clear to me if BPF program pushes data that the right thing is done with
> delta there now.
> 
> Thanks for looking into this.

Let's assume the original data is "abcdefgh" (8 bytes), and after 3 pushes
by the BPF program, it becomes 11-byte data: "abc?de?fgh?".

Then, we set cork_bytes to 6, which means the first 6 bytes have been
processed, and the remaining 5 bytes "?fgh?" will be cached until the
length meets the cork_bytes requirement.

However, some data in "?fgh?" is not within 'sg->msg_iter'
(but in msg_pl instead), especially the data "?" we pushed.

So it doesn't seem as simple as just reverting through an offset of msg_iter.
It appears that 'msg_iter' and 'msg_pl' are two separate objects,
and the BPF program modifies the scatterlist within the msg_pl.

--
Thanks.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ