lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <45cd36ee-ef99-fb67-df8d-218603facfd7@gmail.com>
Date:   Fri, 8 Jul 2022 15:03:46 +0100
From:   Pavel Begunkov <asml.silence@...il.com>
To:     David Ahern <dsahern@...nel.org>, io-uring@...r.kernel.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Cc:     "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Jonathan Lemon <jonathan.lemon@...il.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Jens Axboe <axboe@...nel.dk>, kernel-team@...com
Subject: Re: [PATCH net-next v4 11/27] tcp: support externally provided ubufs

On 7/8/22 05:06, David Ahern wrote:
> On 7/7/22 5:49 AM, Pavel Begunkov wrote:
>> Teach ipv4/udp how to use external ubuf_info provided in msghdr and
> 
> ipv4/tcp?

Ehh, just tcp. Thanks, I updated the branches


>> also prepare it for managed frags by sprinkling
>> skb_zcopy_downgrade_managed() when it could mix managed and not managed
>> frags.
>>
>> Signed-off-by: Pavel Begunkov <asml.silence@...il.com>
>> ---
>>   net/ipv4/tcp.c | 32 ++++++++++++++++++++------------
>>   1 file changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index 390eb3dc53bd..a81f694af5e9 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -1223,17 +1223,23 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>>   
>>   	flags = msg->msg_flags;
>>   
>> -	if (flags & MSG_ZEROCOPY && size && sock_flag(sk, SOCK_ZEROCOPY)) {
>> +	if ((flags & MSG_ZEROCOPY) && size) {
>>   		skb = tcp_write_queue_tail(sk);
>> -		uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
>> -		if (!uarg) {
>> -			err = -ENOBUFS;
>> -			goto out_err;
>> -		}
>>   
>> -		zc = sk->sk_route_caps & NETIF_F_SG;
>> -		if (!zc)
>> -			uarg->zerocopy = 0;
>> +		if (msg->msg_ubuf) {
>> +			uarg = msg->msg_ubuf;
>> +			net_zcopy_get(uarg);
>> +			zc = sk->sk_route_caps & NETIF_F_SG;
>> +		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
>> +			uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
>> +			if (!uarg) {
>> +				err = -ENOBUFS;
>> +				goto out_err;
>> +			}
>> +			zc = sk->sk_route_caps & NETIF_F_SG;
>> +			if (!zc)
>> +				uarg->zerocopy = 0;
>> +		}
>>   	}
>>   
>>   	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
>> @@ -1356,9 +1362,11 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>>   
>>   			copy = min_t(int, copy, pfrag->size - pfrag->offset);
>>   
>> -			if (tcp_downgrade_zcopy_pure(sk, skb))
>> -				goto wait_for_space;
>> -
>> +			if (unlikely(skb_zcopy_pure(skb) || skb_zcopy_managed(skb))) {
>> +				if (tcp_downgrade_zcopy_pure(sk, skb))
>> +					goto wait_for_space;
>> +				skb_zcopy_downgrade_managed(skb);
>> +			}
>>   			copy = tcp_wmem_schedule(sk, copy);
>>   			if (!copy)
>>   				goto wait_for_space;
> 
> You dropped the msg->msg_ubuf checks on jump labels. Removing the one
> you had at 'out_nopush' I agree with based on my tests (i.e, it is not
> needed).

It was an optimisation, which I dropped for simplicity. Will be sending it
and couple more afterwards.


> The one at 'out_err' seems like it is needed - but it has been a few
> weeks since I debugged that case. I believe the error path I was hitting
> was sk_stream_wait_memory with MSG_DONTWAIT flag set meaning timeout is
> 0 and it jumps there with EPIPE.

Currently, it's consistent with MSG_ZEROCOPY ubuf_info, we grab a ubuf_info
reference at the beginning (msg_zerocopy_realloc() for MSG_ZEROCOPY and
net_zcopy_get() for msg_ubuf), and then release it at the end
with net_zcopy_put() or net_zcopy_put_abort().

All users, e.g. skb_zerocopy_iter_stream(), have to grab a new reference,
skb_zcopy_set() -> net_zcopy_get().

Not sure I see any issue, and if there is it sounds that it should also
be affecting MSG_ZEROCOPY.


-- 
Pavel Begunkov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ