Message-ID: <a21e5d42-5718-4633-b812-be47ec6acf65@redhat.com>
Date: Thu, 26 Jun 2025 10:31:09 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: Feng Yang <yangfeng59949@....com>, stfomichev@...il.com
Cc: aleksander.lobakin@...el.com, almasrymina@...gle.com,
asml.silence@...il.com, davem@...emloft.net, ebiggers@...gle.com,
edumazet@...gle.com, horms@...nel.org, kerneljasonxing@...il.com,
kuba@...nel.org, linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
willemb@...gle.com, yangfeng@...inos.cn
Subject: Re: [PATCH] skbuff: Improve the sending efficiency of __skb_send_sock

On 6/26/25 9:50 AM, Feng Yang wrote:
> On Wed, 25 Jun 2025 11:35:55 -0700, Stanislav Fomichev <stfomichev@...il.com> wrote:
>> On 06/23, Feng Yang wrote:
>>> From: Feng Yang <yangfeng@...inos.cn>
>>>
>>> By aggregating skb data into a bvec array for transmission, when using sockmap to forward large packets,
>>> what previously required multiple transmissions now only needs a single transmission, which significantly enhances performance.
>>> For small packets, the performance remains comparable to the original level.
>>>
>>> When using sockmap for forwarding, the average latency for different packet sizes
>>> after sending 10,000 packets is as follows:
>>> size old(us) new(us)
>>> 512 56 55
>>> 1472 58 58
>>> 1600 106 79
>>> 3000 145 108
>>> 5000 182 123
>>>
>>> Signed-off-by: Feng Yang <yangfeng@...inos.cn>
>>> ---
>>> net/core/skbuff.c | 112 +++++++++++++++++++++-------------------------
>>> 1 file changed, 52 insertions(+), 60 deletions(-)
>>>
>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>> index 85fc82f72d26..664443fc9baf 100644
>>> --- a/net/core/skbuff.c
>>> +++ b/net/core/skbuff.c
>>> @@ -3235,84 +3235,75 @@ typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg);
>>> static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
>>> int len, sendmsg_func sendmsg, int flags)
>>> {
>>> - unsigned int orig_len = len;
>>> struct sk_buff *head = skb;
>>> unsigned short fragidx;
>>> - int slen, ret;
>>> + struct msghdr msg;
>>> + struct bio_vec *bvec;
>>> + int max_vecs, ret, slen;
>>> + int bvec_count = 0;
>>> + unsigned int copied = 0;
>>>
>>> -do_frag_list:
>>> -
>>> - /* Deal with head data */
>>> - while (offset < skb_headlen(skb) && len) {
>>> - struct kvec kv;
>>> - struct msghdr msg;
>>> -
>>> - slen = min_t(int, len, skb_headlen(skb) - offset);
>>> - kv.iov_base = skb->data + offset;
>>> - kv.iov_len = slen;
>>> - memset(&msg, 0, sizeof(msg));
>>> - msg.msg_flags = MSG_DONTWAIT | flags;
>>> -
>>> - iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &kv, 1, slen);
>>> - ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked,
>>> - sendmsg_unlocked, sk, &msg);
>>> - if (ret <= 0)
>>> - goto error;
>>> + max_vecs = skb_shinfo(skb)->nr_frags + 1; // +1 for linear data
>>> + if (skb_has_frag_list(skb)) {
>>> + struct sk_buff *frag_skb = skb_shinfo(skb)->frag_list;
>>>
>>> - offset += ret;
>>> - len -= ret;
>>> + while (frag_skb) {
>>> + max_vecs += skb_shinfo(frag_skb)->nr_frags + 1; // +1 for linear data
>>> + frag_skb = frag_skb->next;
>>> + }
>>> }
>>>
>>> - /* All the data was skb head? */
>>> - if (!len)
>>> - goto out;
>>> + bvec = kcalloc(max_vecs, sizeof(struct bio_vec), GFP_KERNEL);
>>> + if (!bvec)
>>> + return -ENOMEM;
>>
>> Not sure allocating memory here is a good idea. From what I can tell
>> this function is used by non-sockmap callers as well..
Adding a per-packet allocation and free is IMHO a no-go for a patch
intended to improve performance.
> Alternatively, we can use struct bio_vec bvec[size] to avoid memory allocation.
If you mean using a fixed size bio vec allocated on the stack, that
could work...
> Even if the "size" is insufficient, the unsent portion will be transmitted in the next call to `__skb_send_sock`.
... but I think this part is not acceptable: the callers may/should
already assume that partial transmissions are due to errors.
Instead, I think you should loop, sending a batch of up to bio_vec_size
entries on each iteration.
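A rough userspace analogy of such a loop, as a sketch only (this is not
kernel code and not the eventual patch): segments are gathered into a
fixed-size on-stack array and sent one batch at a time until nothing is
left, so the caller never sees a short send merely because the array was
too small. BATCH_VECS, struct seg and send_batched() are hypothetical
names; writev() stands in for the sendmsg callback and struct iovec for
struct bio_vec.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH_VECS 4	/* hypothetical size of the on-stack vector array */

/* Hypothetical stand-in for one chunk of skb data: pointer plus length. */
struct seg {
	const void *base;
	size_t len;
};

/*
 * Send every segment through fd, gathering at most BATCH_VECS of them
 * per writev() call. Returns the number of bytes sent, or -1 with errno
 * set if writev() fails.
 */
static ssize_t send_batched(int fd, const struct seg *segs, size_t nsegs)
{
	struct iovec iov[BATCH_VECS];	/* fixed size, no allocation */
	ssize_t total = 0;
	size_t idx = 0;			/* first segment not fully sent */
	size_t off = 0;			/* bytes of segs[idx] already sent */

	assert(segs != NULL || nsegs == 0);

	while (idx < nsegs) {
		size_t i, left, n = 0;
		ssize_t ret;

		/* Fill one batch, resuming inside a partially sent segment. */
		for (i = idx; i < nsegs && n < BATCH_VECS; i++) {
			size_t skip = (i == idx) ? off : 0;

			if (segs[i].len == skip)
				continue;	/* nothing left in this one */
			iov[n].iov_base = (char *)segs[i].base + skip;
			iov[n].iov_len = segs[i].len - skip;
			n++;
		}
		if (n == 0)
			break;		/* only empty segments remained */

		ret = writev(fd, iov, (int)n);
		if (ret < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (ret == 0)
			break;
		total += ret;

		/* Advance (idx, off) past the bytes that were just sent. */
		left = (size_t)ret;
		while (idx < nsegs && left >= segs[idx].len - off) {
			left -= segs[idx].len - off;
			idx++;
			off = 0;
		}
		off += left;
	}
	return total;
}
```

Because the position is tracked as (idx, off), a short writev() simply
makes the next batch start in the middle of a segment; only a real error
ends the loop early.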
Side note: the patch has a few style issues:
- it should not use // for comments
- variable declarations should respect the reverse christmas tree order
and possibly you could use this refactoring to avoid the use of the
backward goto statement.
Thanks,
Paolo