netdev - Re: [PATCH net 2/2] vsock/test: Add test for SO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <cd7chdxitqx7pvusgt45p7s4s4cddyloqog2koases4ocvpayg@ryndsxdgm5ul>
Date: Tue, 1 Apr 2025 12:32:42 +0200
From: Stefano Garzarella <sgarzare@...hat.com>
To: Michal Luczaj <mhal@...x.co>
Cc: "David S. Miller" <davem@...emloft.net>, 
	Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH net 2/2] vsock/test: Add test for SO_LINGER null ptr deref

On Tue, Mar 25, 2025 at 02:22:45PM +0100, Michal Luczaj wrote:
>On 3/20/25 12:31, Stefano Garzarella wrote:
>> On Fri, Mar 14, 2025 at 04:25:16PM +0100, Michal Luczaj wrote:
>>> On 3/10/25 16:24, Stefano Garzarella wrote:
>>>> On Fri, Mar 07, 2025 at 10:49:52AM +0100, Michal Luczaj wrote:
>>>>> ...
>>>>> I've tried modifying the loop to make close()/shutdown() linger until
>>>>> unsent_bytes() == 0. No idea if this is acceptable:
>>>>
>>>> Yes, that's a good idea, I had something similar in mind, but reusing
>>>> unsent_bytes() sounds great to me.
>>>>
>>>> The only problem I see is that in the driver in the guest, the packets
>>>> are put in the virtqueue and the variable is decremented only when the
>>>> host sends us an interrupt to say that it has copied the packets and
>>>> then the guest can free the buffer. Is this okay to consider this as
>>>> sending?
>>>>
>>>> I think so, though it's honestly not clear to me if instead by sending
>>>> we should consider when the driver copies the bytes into the virtqueue,
>>>> but that doesn't mean they were really sent. We should compare it to
>>>> what the network devices or AF_UNIX do.
>>>
>>> I had a look at AF_UNIX. SO_LINGER is not supported. Which makes sense;
>>> when you send a packet, it directly lands in receiver's queue. As for
>>> SIOCOUTQ handling: `return sk_wmem_alloc_get(sk)`. So I guess it's more of
>>> an "unread bytes"?
>>
>> Yes, I see, actually for AF_UNIX it is simple.
>> It's hard for us to tell when the user on the other pear actually read
>> the data, we could use the credit mechanism, but that sometimes isn't
>> sent unless explicitly requested, so I'd say unsent_bytes() is fine.
>
>One more option: keep the semantics (in a state of not-what-`man 7 socket`-
>says) and, for completeness, add the lingering to shutdown()?

Sorry, I'm getting lost!
That's because we had a different behavior between close() and 
shutdown() right?

If it's the case, I would say let's fix at least that for now.

>
>>>>> ...
>>>>> This works, but I find it difficult to test without artificially slowing
>>>>> the kernel down. It's a race against workers as they quite eagerly do
>>>>> virtio_transport_consume_skb_sent(), which decrements vvs->bytes_unsent.
>>>>> I've tried reducing SO_VM_SOCKETS_BUFFER_SIZE as you've suggested, but
>>>>> send() would just block until peer had available space.
>>>>
>>>> Did you test with loopback or virtio-vsock with a VM?
>>>
>>> Both, but I may be missing something. Do you see a way to stop (or don't
>>> schedule) the worker from processing queue (and decrementing bytes_unsent)?
>>
>> Without touching the driver (which I don't want to do) I can't think of
>> anything, so I'd say it's okay.
>
>Turns out there's a way to purge the loopback queue before worker processes
>it (I had no success with g2h). If you win that race, bytes_unsent stays
>elevated until kingdom come. Then you can close() the socket and watch as
>it lingers.
>
>connect(s)
>  lock_sock
>  while (sk_state != TCP_ESTABLISHED)
>    release_sock
>    schedule_timeout
>
>// virtio_transport_recv_connecting
>//   sk_state = TCP_ESTABLISHED
>
>                                       send(s, 'x')
>                                         lock_sock
>                                         virtio_transport_send_pkt_info
>                                           virtio_transport_get_credit
>                                    (!)      vvs->bytes_unsent += ret
>                                           vsock_loopback_send_pkt
>                                             virtio_vsock_skb_queue_tail
>                                         release_sock
>                                       kill()
>    lock_sock
>    if signal_pending
>      vsock_loopback_cancel_pkt
>        virtio_transport_purge_skbs (!)
>
>That said, I may be missing a bigger picture, but is it worth supporting
>this "signal disconnects TCP_ESTABLISHED" behaviour in the first place?

Can you elaborate a bit?

>Removing it would make the race above (and the whole [1] series) moot.
>Plus, it appears to be broken: when I hit this condition and I try to
>re-connect to the same listener, I get ETIMEDOUT for loopback and
>ECONNRESET for g2h virtio; see [2].

Could this be related to the fix I sent some days ago?
https://lore.kernel.org/netdev/20250328141528.420719-1-sgarzare@redhat.com/

Thanks,
Stefano