lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 28 Jun 2023 11:41:45 +0800
From: Heng Qi <hengqi@...ux.alibaba.com>
To: Jason Wang <jasowang@...hat.com>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org,
 "Michael S . Tsirkin" <mst@...hat.com>,
 "David S . Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>,
 Jesper Dangaard Brouer <hawk@...nel.org>,
 John Fastabend <john.fastabend@...il.com>,
 Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH net-next v4 1/2] virtio-net: support coexistence of XDP
 and GUEST_CSUM



在 2023/6/28 上午11:22, Jason Wang 写道:
> On Wed, Jun 28, 2023 at 11:05 AM Heng Qi <hengqi@...ux.alibaba.com> wrote:
>> We are now re-probing the csum related fields and trying
>> to have XDP and RX hw checksum capabilities coexist on the
>> XDP path. For the benefit of:
>> 1. RX hw checksum capability can be used if XDP is loaded.
>> 2. Avoid packet loss when loading XDP in the vm-vm scenario.
>>
>> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
>> ---
>> v3->v4:
>>    - Rewrite some comments.
>>
>> v2->v3:
>>    - Use skb_checksum_setup() instead of virtnet_flow_dissect_udp_tcp().
>>      Essentially equivalent.
>>
>>   drivers/net/virtio_net.c | 82 +++++++++++++++++++++++++++++++++-------
>>   1 file changed, 69 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 5a7f7a76b920..a47342f972b5 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -1568,6 +1568,41 @@ static void virtio_skb_set_hash(const struct virtio_net_hdr_v1_hash *hdr_hash,
>>          skb_set_hash(skb, __le32_to_cpu(hdr_hash->hash_value), rss_hash_type);
>>   }
>>
>> +static int virtnet_set_csum_after_xdp(struct virtnet_info *vi,
>> +                                     struct sk_buff *skb,
>> +                                     __u8 flags)
>> +{
>> +       int err = 0;
>> +
>> +       /* When XDP program is loaded, the vm-vm scenario on the same host,
>> +        * packets marked VIRTIO_NET_HDR_F_NEEDS_CSUM without a complete checksum
>> +        * will travel. Although these packets are safe from the point of
>> +        * view of the vm, in order to be successfully forwarded on the upper
>> +        * layer and to avoid packet loss caused by XDP modification,
>> +        * we re-probe the necessary checksum related information:
>> +        * skb->csum_{start, offset}, pseudo-header checksum.
>> +        *
>> +        * If the received packet is marked VIRTIO_NET_HDR_F_DATA_VALID:
>> +        * when _F_GUEST_CSUM is negotiated, the device validates the checksum
>> +        * and virtio-net sets skb->ip_summed to CHECKSUM_UNNECESSARY;
>> +        * otherwise, virtio-net hands over to the stack to validate the checksum.
>> +        */
>> +       if (flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
>> +               /* No need to care about SCTP because virtio-net currently doesn't
>> +                * support SCTP CRC checksum offloading, that is, SCTP packets have
>> +                * complete checksums.
>> +                */
>> +               err = skb_checksum_setup(skb, true);
> A second thought, any reason why a checksum is a must here. Could we simply:

When net.ipv4.ip_forward sysctl is enabled, such packets may be 
forwarded (return to the tx path) at the IP layer.
If the device has the tx hw checksum offloading cap, packets will have 
complete checksums based on our calculated 'check' value.

>
> 1) probe the csum_start/offset
> 2) leave it as CHECKSUM_PARTIAL
>
> ?

The reason is as I explained above.

>
>> +       } else if (flags & VIRTIO_NET_HDR_F_DATA_VALID) {
>> +               /* XDP guarantees that packets marked as VIRTIO_NET_HDR_F_DATA_VALID
>> +                * still have correct checksum after they are processed.
>> +                */
> Do you mean it's the charge of the XDP program to calculate the csum
> in this case? Seems strange.

Packet with complete checksum (and has been verified by rx device 
because it has VIRTIO_NET_HDR_F_DATA_VALID)
when modified by XDP, XDP program should use the helper provided by XDP 
core to make the checksum correct,
otherwise, VIRTIO_NET_HDR_F_DATA_VALID has been cleared and skb 
->ip_summed=CHECKSUM_NONE, so the stack
will re-verify the checksum, causing packet loss due to wrong checksum.

Thanks.

>
> Thanks
>
>> +               skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +       }
>> +
>> +       return err;
>> +}
>> +
>>   static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>>                          void *buf, unsigned int len, void **ctx,
>>                          unsigned int *xdp_xmit,
>> @@ -1576,6 +1611,7 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>>          struct net_device *dev = vi->dev;
>>          struct sk_buff *skb;
>>          struct virtio_net_hdr_mrg_rxbuf *hdr;
>> +       __u8 flags;
>>
>>          if (unlikely(len < vi->hdr_len + ETH_HLEN)) {
>>                  pr_debug("%s: short packet %i\n", dev->name, len);
>> @@ -1584,6 +1620,12 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>>                  return;
>>          }
>>
>> +       /* XDP may modify/overwrite the packet, including the virtnet hdr,
>> +        * so save the flags of the virtnet hdr before XDP processing.
>> +        */
>> +       if (unlikely(vi->xdp_enabled))
>> +               flags = ((struct virtio_net_hdr_mrg_rxbuf *)buf)->hdr.flags;
>> +
>>          if (vi->mergeable_rx_bufs)
>>                  skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit,
>>                                          stats);
>> @@ -1595,23 +1637,37 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
>>          if (unlikely(!skb))
>>                  return;
>>
>> -       hdr = skb_vnet_hdr(skb);
>> -       if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
>> -               virtio_skb_set_hash((const struct virtio_net_hdr_v1_hash *)hdr, skb);
>> -
>> -       if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
>> -               skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +       if (unlikely(vi->xdp_enabled)) {
>> +               /* Required to do this before re-probing and calculating
>> +                * the pseudo-header checksum.
>> +                */
>> +               skb->protocol = eth_type_trans(skb, dev);
>> +               skb_reset_network_header(skb);
>> +               if (virtnet_set_csum_after_xdp(vi, skb, flags) < 0) {
>> +                       pr_debug("%s: errors occurred in setting partial csum",
>> +                                dev->name);
>> +                       goto frame_err;
>> +               }
>> +       } else {
>> +               hdr = skb_vnet_hdr(skb);
>> +               if (dev->features & NETIF_F_RXHASH && vi->has_rss_hash_report)
>> +                       virtio_skb_set_hash((const struct virtio_net_hdr_v1_hash *)hdr, skb);
>> +
>> +               if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
>> +                       skb->ip_summed = CHECKSUM_UNNECESSARY;
>> +
>> +               if (virtio_net_hdr_to_skb(skb, &hdr->hdr,
>> +                                         virtio_is_little_endian(vi->vdev))) {
>> +                       net_warn_ratelimited("%s: bad gso: type: %u, size: %u\n",
>> +                                            dev->name, hdr->hdr.gso_type,
>> +                                            hdr->hdr.gso_size);
>> +                       goto frame_err;
>> +               }
>>
>> -       if (virtio_net_hdr_to_skb(skb, &hdr->hdr,
>> -                                 virtio_is_little_endian(vi->vdev))) {
>> -               net_warn_ratelimited("%s: bad gso: type: %u, size: %u\n",
>> -                                    dev->name, hdr->hdr.gso_type,
>> -                                    hdr->hdr.gso_size);
>> -               goto frame_err;
>> +               skb->protocol = eth_type_trans(skb, dev);
>>          }
>>
>>          skb_record_rx_queue(skb, vq2rxq(rq->vq));
>> -       skb->protocol = eth_type_trans(skb, dev);
>>          pr_debug("Receiving skb proto 0x%04x len %i type %i\n",
>>                   ntohs(skb->protocol), skb->len, skb->pkt_type);
>>
>> --
>> 2.19.1.6.gb485710b
>>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ