linux-kernel - Re: [PATCH net v3] vsock/virtio: cap TX credit to local buffer size

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24b9961d-7e0d-4239-97b3-39799524909f@gmail.com>
Date: Sun, 14 Dec 2025 06:38:22 +0000
From: Melbin K Mathew <mlbnkm1@...il.com>
To: Stefano Garzarella <sgarzare@...hat.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, stefanha@...hat.com,
 kvm@...r.kernel.org, netdev@...r.kernel.org, virtualization@...ts.linux.dev,
 linux-kernel@...r.kernel.org, jasowang@...hat.com,
 xuanzhuo@...ux.alibaba.com, eperezma@...hat.com, davem@...emloft.net,
 edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, horms@...nel.org
Subject: Re: [PATCH net v3] vsock/virtio: cap TX credit to local buffer size



On 12/12/2025 12:26, Stefano Garzarella wrote:
> On Fri, Dec 12, 2025 at 11:40:03AM +0000, Melbin K Mathew wrote:
>>
>>
>> On 12/12/2025 10:40, Stefano Garzarella wrote:
>>> On Fri, Dec 12, 2025 at 09:56:28AM +0000, Melbin Mathew Antony wrote:
>>>> Hi Stefano, Michael,
>>>>
>>>> Thanks for the suggestions and guidance.
>>>
>>> You're welcome, but please avoid top-posting in the future:
>>> https://www.kernel.org/doc/html/latest/process/submitting- 
>>> patches.html#use-trimmed-interleaved-replies-in-email-discussions
>>>
>> Sure. Thanks
>>>>
>>>> I’ve drafted a 4-part series based on the recap. I’ve included the
>>>> four diffs below for discussion. Can wait for comments, iterate, and
>>>> then send the patch series in a few days.
>>>>
>>>> ---
>>>>
>>>> Patch 1/4 — vsock/virtio: make get_credit() s64-safe and clamp 
>>>> negatives
>>>>
>>>> virtio_transport_get_credit() was doing unsigned arithmetic; if the
>>>> peer shrinks its window, the subtraction can underflow and look like
>>>> “lots of credit”. This makes it compute “space” in s64 and clamp < 0
>>>> to 0.
>>>>
>>>> diff --git a/net/vmw_vsock/virtio_transport_common.c
>>>> b/net/vmw_vsock/virtio_transport_common.c
>>>> --- a/net/vmw_vsock/virtio_transport_common.c
>>>> +++ b/net/vmw_vsock/virtio_transport_common.c
>>>> @@ -494,16 +494,23 @@ 
>>>> EXPORT_SYMBOL_GPL(virtio_transport_consume_skb_sent);
>>>> u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 
>>>> credit)
>>>> {
>>>> + s64 bytes;
>>>>  u32 ret;
>>>>
>>>>  if (!credit)
>>>>  return 0;
>>>>
>>>>  spin_lock_bh(&vvs->tx_lock);
>>>> - ret = vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
>>>> - if (ret > credit)
>>>> - ret = credit;
>>>> + bytes = (s64)vvs->peer_buf_alloc -
>>>
>>> Why not just calling virtio_transport_has_space()?
>> virtio_transport_has_space() takes struct vsock_sock *, while 
>> virtio_transport_get_credit() takes struct virtio_vsock_sock *, so I 
>> cannot directly call has_space() from get_credit() without changing 
>> signatures.
>>
>> Would you be OK if I factor the common “space” calculation into a 
>> small helper that operates on struct virtio_vsock_sock * and is used 
>> by both paths? Something like:
> 
> Why not just change the signature of virtio_transport_has_space()?
Thanks, that is cleaner.

For Patch 1 i'll change virtio_transport_has_space() to take
struct virtio_vsock_sock * and call it from both
virtio_transport_stream_has_space() and virtio_transport_get_credit().

/*
  * Return available peer buffer space for TX (>= 0).
  *
  * Use s64 arithmetic so that if the peer shrinks peer_buf_alloc while
  * we have bytes in flight (tx_cnt - peer_fwd_cnt), the subtraction does
  * not underflow into a large positive value as it would with u32.
  *
  * Must be called with vvs->tx_lock held.
  */
static s64 virtio_transport_has_space(struct virtio_vsock_sock *vvs)
{
	s64 bytes;

	bytes = (s64)vvs->peer_buf_alloc -
		((s64)vvs->tx_cnt - (s64)vvs->peer_fwd_cnt);
	if (bytes < 0)
		bytes = 0;

	return bytes;
}

s64 virtio_transport_stream_has_space(struct vsock_sock *vsk)
{
	struct virtio_vsock_sock *vvs = vsk->trans;
	s64 bytes;

	spin_lock_bh(&vvs->tx_lock);
	bytes = virtio_transport_has_space(vvs);
	spin_unlock_bh(&vvs->tx_lock);

	return bytes;
}

u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 credit)
{
	u32 ret;

	if (!credit)
		return 0;

	spin_lock_bh(&vvs->tx_lock);
	ret = min_t(u32, credit, (u32)virtio_transport_has_space(vvs));
	vvs->tx_cnt += ret;
	vvs->bytes_unsent += ret;
	spin_unlock_bh(&vvs->tx_lock);

	return ret;
}

Does this look right?
> Thanks,
> Stefano
> 
>>
>> /* Must be called with vvs->tx_lock held. Returns >= 0. */
>> static s64 virtio_transport_tx_space(struct virtio_vsock_sock *vvs)
>> {
>>     s64 bytes;
>>
>>     bytes = (s64)vvs->peer_buf_alloc -
>>         ((s64)vvs->tx_cnt - (s64)vvs->peer_fwd_cnt);
>>     if (bytes < 0)
>>         bytes = 0;
>>
>>     return bytes;
>> }
>>
>> Then:
>>
>> get_credit() would do bytes = virtio_transport_tx_space(vvs); ret = 
>> min_t(u32, credit, (u32)bytes);
>>
>> has_space() would use the same helper after obtaining vvs = vsk->trans;
>>
>> Does that match what you had in mind, or would you prefer a different 
>> factoring?
>>
>>>
>>>> + ((s64)vvs->tx_cnt - (s64)vvs->peer_fwd_cnt);
>>>> + if (bytes < 0)
>>>> + bytes = 0;
>>>> +
>>>> + ret = min_t(u32, credit, (u32)bytes);
>>>>  vvs->tx_cnt += ret;
>>>>  vvs->bytes_unsent += ret;
>>>>  spin_unlock_bh(&vvs->tx_lock);
>>>>
>>>>  return ret;
>>>> }
>>>>
>>>>
>>>> ---
>>>>
>>>> Patch 2/4 — vsock/virtio: cap TX window by local buffer (helper + use
>>>> everywhere in TX path)
>>>>
>>>> Cap the effective advertised window to min(peer_buf_alloc, buf_alloc)
>>>> and use it consistently in TX paths (get_credit, has_space,
>>>> seqpacket_enqueue).
>>>>
>>>> diff --git a/net/vmw_vsock/virtio_transport_common.c
>>>> b/net/vmw_vsock/virtio_transport_common.c
>>>> --- a/net/vmw_vsock/virtio_transport_common.c
>>>> +++ b/net/vmw_vsock/virtio_transport_common.c
>>>> @@ -491,6 +491,16 @@ void virtio_transport_consume_skb_sent(struct
>>>> sk_buff *skb, bool consume)
>>>> }
>>>> EXPORT_SYMBOL_GPL(virtio_transport_consume_skb_sent);
>>>> +/* Return the effective peer buffer size for TX credit computation.
>>>> + *
>>>> + * The peer advertises its receive buffer via peer_buf_alloc, but 
>>>> we cap it
>>>> + * to our local buf_alloc (derived from SO_VM_SOCKETS_BUFFER_SIZE and
>>>> + * already clamped to buffer_max_size).
>>>> + */
>>>> +static u32 virtio_transport_tx_buf_alloc(struct virtio_vsock_sock 
>>>> *vvs)
>>>> +{
>>>> + return min(vvs->peer_buf_alloc, vvs->buf_alloc);
>>>> +}
>>>>
>>>> u32 virtio_transport_get_credit(struct virtio_vsock_sock *vvs, u32 
>>>> credit)
>>>> {
>>>>  s64 bytes;
>>>> @@ -502,7 +512,8 @@ u32 virtio_transport_get_credit(struct
>>>> virtio_vsock_sock *vvs, u32 credit)
>>>>  return 0;
>>>>
>>>>  spin_lock_bh(&vvs->tx_lock);
>>>> - bytes = (s64)vvs->peer_buf_alloc -
>>>> + bytes = (s64)virtio_transport_tx_buf_alloc(vvs) -
>>>>  ((s64)vvs->tx_cnt - (s64)vvs->peer_fwd_cnt);
>>>>  if (bytes < 0)
>>>>  bytes = 0;
>>>> @@ -834,7 +845,7 @@ virtio_transport_seqpacket_enqueue(struct 
>>>> vsock_sock *vsk,
>>>>  spin_lock_bh(&vvs->tx_lock);
>>>>
>>>> - if (len > vvs->peer_buf_alloc) {
>>>> + if (len > virtio_transport_tx_buf_alloc(vvs)) {
>>>>  spin_unlock_bh(&vvs->tx_lock);
>>>>  return -EMSGSIZE;
>>>>  }
>>>> @@ -884,7 +895,8 @@ static s64 virtio_transport_has_space(struct
>>>> vsock_sock *vsk)
>>>>  struct virtio_vsock_sock *vvs = vsk->trans;
>>>>  s64 bytes;
>>>>
>>>> - bytes = (s64)vvs->peer_buf_alloc - (vvs->tx_cnt - vvs->peer_fwd_cnt);
>>>> + bytes = (s64)virtio_transport_tx_buf_alloc(vvs) -
>>>> + ((s64)vvs->tx_cnt - (s64)vvs->peer_fwd_cnt);
>>>>  if (bytes < 0)
>>>>  bytes = 0;
>>>>
>>>>  return bytes;
>>>> }
>>>>
>>>>
>>>> ---
>>>>
>>>> Patch 3/4 — vsock/test: fix seqpacket msg bounds test (set client 
>>>> buf too)
>>>
>>> Please just include in the series the patch I sent to you.
>>>
>> Thanks. I'll use your vsock_test.c patch as-is for 3/4
>>>>
>>>> After fixing TX credit bounds, the client can fill its TX window and
>>>> block before it wakes the server. Setting the buffer on the client
>>>> makes the test deterministic again.
>>>>
>>>> diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/ 
>>>> vsock_test.c
>>>> --- a/tools/testing/vsock/vsock_test.c
>>>> +++ b/tools/testing/vsock/vsock_test.c
>>>> @@ -353,6 +353,7 @@ static void test_stream_msg_peek_server(const
>>>> struct test_opts *opts)
>>>>
>>>> static void test_seqpacket_msg_bounds_client(const struct test_opts 
>>>> *opts)
>>>> {
>>>> + unsigned long long sock_buf_size;
>>>>  unsigned long curr_hash;
>>>>  size_t max_msg_size;
>>>>  int page_size;
>>>> @@ -366,6 +367,18 @@ static void
>>>> test_seqpacket_msg_bounds_client(const struct test_opts *opts)
>>>>  exit(EXIT_FAILURE);
>>>>  }
>>>>
>>>> + sock_buf_size = SOCK_BUF_SIZE;
>>>> +
>>>> + setsockopt_ull_check(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
>>>> +    sock_buf_size,
>>>> +    "setsockopt(SO_VM_SOCKETS_BUFFER_MAX_SIZE)");
>>>> +
>>>> + setsockopt_ull_check(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>>>> +    sock_buf_size,
>>>> +    "setsockopt(SO_VM_SOCKETS_BUFFER_SIZE)");
>>>> +
>>>>  /* Wait, until receiver sets buffer size. */
>>>>  control_expectln("SRVREADY");
>>>>
>>>>
>>>> ---
>>>>
>>>> Patch 4/4 — vsock/test: add stream TX credit bounds regression test
>>>>
>>>> This directly guards the original failure mode for stream sockets: if
>>>> the peer advertises a large window but the sender’s local policy is
>>>> small, the sender must stall quickly (hit EAGAIN in nonblocking mode)
>>>> rather than queueing megabytes.
>>>
>>> Yeah, using nonblocking mode LGTM!
>>>
>>>>
>>>> diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/ 
>>>> vsock_test.c
>>>> --- a/tools/testing/vsock/vsock_test.c
>>>> +++ b/tools/testing/vsock/vsock_test.c
>>>> @@ -349,6 +349,7 @@
>>>> #define SOCK_BUF_SIZE (2 * 1024 * 1024)
>>>> +#define SMALL_SOCK_BUF_SIZE (64 * 1024ULL)
>>>> #define MAX_MSG_PAGES 4
>>>>
>>>> /* Insert new test functions after test_stream_msg_peek_server, before
>>>>  * test_seqpacket_msg_bounds_client (around line 352) */
>>>>
>>>> +static void test_stream_tx_credit_bounds_client(const struct 
>>>> test_opts *opts)
>>>> +{
>>>> + ... /* full function as provided */
>>>> +}
>>>> +
>>>> +static void test_stream_tx_credit_bounds_server(const struct 
>>>> test_opts *opts)
>>>> +{
>>>> + ... /* full function as provided */
>>>> +}
>>>>
>>>> @@ -2224,6 +2305,10 @@
>>>>  .run_client = test_stream_msg_peek_client,
>>>>  .run_server = test_stream_msg_peek_server,
>>>>  },
>>>> + {
>>>> + .name = "SOCK_STREAM TX credit bounds",
>>>> + .run_client = test_stream_tx_credit_bounds_client,
>>>> + .run_server = test_stream_tx_credit_bounds_server,
>>>> + },
>>>
>>> Please put it at the bottom. Tests are skipped by index, so we don't 
>>> want to change index of old tests.
>>>
>>> Please fix your editor, those diffs are hard to read without tabs/ 
>>> spaces.
>> seems like some issue with my email client. Hope it is okay now
>>>
>>> Thanks,
>>> Stefano
>>>
>>
>