Message-ID: <hkhwrfz4dzhaco4mb25st5zyfybimchac3zcqsgzmtim53sq5o@o4u6privahp3>
Date: Thu, 20 Mar 2025 12:31:01 +0100
From: Stefano Garzarella <sgarzare@...hat.com>
To: Michal Luczaj <mhal@...x.co>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH net 2/2] vsock/test: Add test for SO_LINGER null ptr deref
On Fri, Mar 14, 2025 at 04:25:16PM +0100, Michal Luczaj wrote:
>On 3/10/25 16:24, Stefano Garzarella wrote:
>> On Fri, Mar 07, 2025 at 10:49:52AM +0100, Michal Luczaj wrote:
>>> ...
>>> I've tried modifying the loop to make close()/shutdown() linger until
>>> unsent_bytes() == 0. No idea if this is acceptable:
>>
>> Yes, that's a good idea; I had something similar in mind, but reusing
>> unsent_bytes() sounds great to me.
>>
>> The only problem I see is that in the driver in the guest, the packets
>> are put in the virtqueue and the variable is decremented only when the
>> host sends us an interrupt to say that it has copied the packets and
>> then the guest can free the buffer. Is it okay to consider this as
>> sending?
>>
>> I think so, though it's honestly not clear to me whether by "sending"
>> we should instead mean the moment the driver copies the bytes into the
>> virtqueue, even though that doesn't mean they were really sent. We
>> should compare it to what the network devices or AF_UNIX do.
>
>I had a look at AF_UNIX. SO_LINGER is not supported, which makes sense:
>when you send a packet, it lands directly in the receiver's queue. As for
>SIOCOUTQ handling: `return sk_wmem_alloc_get(sk)`. So I guess it's more of
>an "unread bytes" count?
Yes, I see; for AF_UNIX it is actually simple.
It's hard for us to tell when the user on the other peer has actually
read the data. We could use the credit mechanism, but a credit update
sometimes isn't sent unless explicitly requested, so I'd say
unsent_bytes() is fine.
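FWIW, on the virtio side unsent_bytes() should boil down to more or less
this (rough sketch from memory, not necessarily the exact code):

ssize_t virtio_transport_unsent_bytes(struct vsock_sock *vsk)
{
	struct virtio_vsock_sock *vvs = vsk->trans;
	ssize_t ret;

	/* bytes_unsent is only decremented once the host tells us it has
	 * consumed the buffers, see virtio_transport_consume_skb_sent().
	 */
	spin_lock_bh(&vvs->tx_lock);
	ret = vvs->bytes_unsent;
	spin_unlock_bh(&vvs->tx_lock);

	return ret;
}

i.e. the counter only drops once the host has consumed the buffers, which
IMHO is close enough to "sent" for SO_LINGER purposes.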
>
>>> +void vsock_linger(struct sock *sk, long timeout)
>>> +{
>>> +	if (timeout) {
>>> +		DEFINE_WAIT_FUNC(wait, woken_wake_function);
>>> +		ssize_t (*unsent)(struct vsock_sock *vsk);
>>> +		struct vsock_sock *vsk = vsock_sk(sk);
>>> +
>>> +		add_wait_queue(sk_sleep(sk), &wait);
>>> +		unsent = vsk->transport->unsent_bytes;
>>
>> This is not implemented by all transports, so we should handle it in
>> some way (check the pointer or implement in all transports).
>
>Ah, right, I didn't think that through.
>
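Something along these lines should be enough (rough, untested sketch on
top of your snippet; just skip the lingering when the callback is missing):

void vsock_linger(struct sock *sk, long timeout)
{
	if (timeout) {
		DEFINE_WAIT_FUNC(wait, woken_wake_function);
		ssize_t (*unsent)(struct vsock_sock *vsk);
		struct vsock_sock *vsk = vsock_sk(sk);

		/* Not every transport implements unsent_bytes, so don't
		 * dereference a NULL pointer here; those transports simply
		 * keep the old (non-lingering) behaviour.
		 */
		unsent = vsk->transport->unsent_bytes;
		if (!unsent)
			return;

		add_wait_queue(sk_sleep(sk), &wait);

		do {
			if (sk_wait_event(sk, &timeout, unsent(vsk) == 0, &wait))
				break;
		} while (!signal_pending(current) && timeout);

		remove_wait_queue(sk_sleep(sk), &wait);
	}
}

Or implement unsent_bytes() in the remaining transports, but the NULL check
seems simpler for now.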
>>> @@ -1056,6 +1075,8 @@ static int vsock_shutdown(struct socket *sock, int mode)
>>>  	if (sock_type_connectible(sk->sk_type)) {
>>>  		sock_reset_flag(sk, SOCK_DONE);
>>>  		vsock_send_shutdown(sk, mode);
>>> +		if (sock_flag(sk, SOCK_LINGER))
>>> +			vsock_linger(sk, sk->sk_lingertime);
>>
>> mmm, great, so on shutdown we never supported SO_LINGER, right?
>
>Yup.
>
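Ok. So, just to spell out the userspace side this now covers (purely
hypothetical snippet, not necessarily how the test will look):

	/* needs <sys/socket.h>, <unistd.h>, <stdio.h>;
	 * 'fd' is a connected SOCK_STREAM AF_VSOCK socket
	 */
	struct linger lng = {
		.l_onoff  = 1,	/* linger on close()/shutdown() */
		.l_linger = 1,	/* for up to 1 second */
	};

	if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lng, sizeof(lng)))
		perror("setsockopt(SO_LINGER)");

	/* with your patch both of these now wait for unsent bytes */
	shutdown(fd, SHUT_WR);
	close(fd);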
>>> ...
>>> This works, but I find it difficult to test without artificially slowing
>>> the kernel down. It's a race against the workers, which quite eagerly run
>>> virtio_transport_consume_skb_sent() and so decrement vvs->bytes_unsent.
>>> I've tried reducing SO_VM_SOCKETS_BUFFER_SIZE as you've suggested, but
>>> send() would just block until the peer had available space.
>>
>> Did you test with loopback or virtio-vsock with a VM?
>
>Both, but I may be missing something. Do you see a way to stop the worker
>(or keep it from being scheduled) from processing the queue (and
>decrementing bytes_unsent)?
Without touching the driver (which I don't want to do), I can't think of
anything, so I'd say it's okay.
Thanks,
Stefano