Date: Wed, 28 Dec 2022 14:27:06 +0800
From: Jason Wang <jasowang@...hat.com>
To: Heng Qi <hengqi@...ux.alibaba.com>, netdev@...r.kernel.org,
	bpf@...r.kernel.org
Cc: "Michael S . Tsirkin" <mst@...hat.com>, Paolo Abeni <pabeni@...hat.com>,
	Jakub Kicinski <kuba@...nel.org>, John Fastabend <john.fastabend@...il.com>,
	"David S . Miller" <davem@...emloft.net>, Daniel Borkmann <daniel@...earbox.net>,
	Alexei Starovoitov <ast@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH v2 4/9] virtio_net: build xdp_buff with multi buffers

On 2022/12/27 17:10, Heng Qi wrote:
>
>
> On 2022/12/27 2:46 PM, Jason Wang wrote:
>>
>> On 2022/12/20 22:14, Heng Qi wrote:
>>> Support xdp for multi buffer packets in mergeable mode.
>>>
>>> Putting the first buffer as the linear part for xdp_buff,
>>> and the rest of the buffers as non-linear fragments to struct
>>> skb_shared_info in the tailroom belonging to xdp_buff.
>>>
>>> Signed-off-by: Heng Qi <hengqi@...ux.alibaba.com>
>>> Reviewed-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
>>> ---
>>>   drivers/net/virtio_net.c | 78 ++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 78 insertions(+)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 08f209d7b0bf..8fc3b1841d92 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -931,6 +931,84 @@ static struct sk_buff *receive_big(struct net_device *dev,
>>>       return NULL;
>>>   }
>>> +/* TODO: build xdp in big mode */
>>> +static int virtnet_build_xdp_buff_mrg(struct net_device *dev,
>>> +                                      struct virtnet_info *vi,
>>> +                                      struct receive_queue *rq,
>>> +                                      struct xdp_buff *xdp,
>>> +                                      void *buf,
>>> +                                      unsigned int len,
>>> +                                      unsigned int frame_sz,
>>> +                                      u16 *num_buf,
>>> +                                      unsigned int *xdp_frags_truesize,
>>> +                                      struct virtnet_rq_stats *stats)
>>> +{
>>> +    unsigned int tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
>>> +    struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
>>> +    unsigned int truesize, headroom;
>>> +    struct skb_shared_info *shinfo;
>>> +    unsigned int xdp_frags_truesz = 0;
>>> +    unsigned int cur_frag_size;
>>> +    struct page *page;
>>> +    skb_frag_t *frag;
>>> +    int offset;
>>> +    void *ctx;
>>> +
>>> +    xdp_init_buff(xdp, frame_sz, &rq->xdp_rxq);
>>> +    xdp_prepare_buff(xdp, buf - VIRTIO_XDP_HEADROOM,
>>> +                     VIRTIO_XDP_HEADROOM + vi->hdr_len, len - vi->hdr_len, true);
>>> +
>>> +    if (*num_buf > 1) {
>>> +        shinfo = xdp_get_shared_info_from_buff(xdp);
>>> +        shinfo->nr_frags = 0;
>>> +        shinfo->xdp_frags_size = 0;
>>> +    }
>>> +
>>> +    if ((*num_buf - 1) > MAX_SKB_FRAGS)
>>> +        return -EINVAL;
>>> +
>>> +    while ((--*num_buf) >= 1) {
>>> +        buf = virtqueue_get_buf_ctx(rq->vq, &len, &ctx);
>>> +        if (unlikely(!buf)) {
>>> +            pr_debug("%s: rx error: %d buffers out of %d missing\n",
>>> +                     dev->name, *num_buf,
>>> +                     virtio16_to_cpu(vi->vdev, hdr->num_buffers));
>>> +            dev->stats.rx_length_errors++;
>>> +            return -EINVAL;
>>> +        }
>>> +
>>> +        if (!xdp_buff_has_frags(xdp))
>>> +            xdp_buff_set_frags_flag(xdp);
>>
>>
>> Any reason to put this inside the loop?
>
> I'll move it outside the loop in the next version.
>
>>
>>
>>> +
>>> +        stats->bytes += len;
>>> +        page = virt_to_head_page(buf);
>>> +        offset = buf - page_address(page);
>>> +        truesize = mergeable_ctx_to_truesize(ctx);
>>> +        headroom = mergeable_ctx_to_headroom(ctx);
>>> +
>>> +        cur_frag_size = truesize + (headroom ? (headroom + tailroom) : 0);
>>> +        xdp_frags_truesz += cur_frag_size;
>>
>>
>> Not related to this patch, but it would easily confuse future
>> readers that we need another math for truesize. I think at least
>> we need some comments for this, or
>
> Yes it might, I'll add more comments on this.
>
>>
>> I guess the root cause is in get_mergeable_buf_len:
>>
>> static unsigned int get_mergeable_buf_len(struct receive_queue *rq,
>>                                           struct ewma_pkt_len *avg_pkt_len,
>>                                           unsigned int room)
>> {
>>     struct virtnet_info *vi = rq->vq->vdev->priv;
>>     const size_t hdr_len = vi->hdr_len;
>>     unsigned int len;
>>
>>     if (room)
>>         return PAGE_SIZE - room;
>>
>> And we do
>>
>> len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
>>
>> ...
>>
>> ctx = mergeable_len_to_ctx(len, headroom);
>>
>>
>> I wonder if it's better to pack the real truesize (PAGE_SIZE) here.
>> This may ease a lot of things.
>
> I don't know the historical reason for not packing, but I guess this
> is for the convenience of comparing the actual length len given by
> the device with truesize. Therefore, I think it would be better to
> keep the above practice for now?

The problem is:

We had

static unsigned int mergeable_ctx_to_truesize(void *mrg_ctx)

It means the truesize should be calculated before packing into the
context. So having another math to get the truesize is very confusing;
I think it's better to fix that. Otherwise we may end up with code
that is hard to review even with the help of comments.

Thanks

> Perhaps I can add more explanation to the above code to help future
> readers try not to get confused.
>
> Thanks.
>
>>
>> Thanks
>>
>>
>>> +        if (unlikely(len > truesize || cur_frag_size > PAGE_SIZE)) {
>>> +            pr_debug("%s: rx error: len %u exceeds truesize %lu\n",
>>> +                     dev->name, len, (unsigned long)ctx);
>>> +            dev->stats.rx_length_errors++;
>>> +            return -EINVAL;
>>> +        }
>>> +
>>> +        frag = &shinfo->frags[shinfo->nr_frags++];
>>> +        __skb_frag_set_page(frag, page);
>>> +        skb_frag_off_set(frag, offset);
>>> +        skb_frag_size_set(frag, len);
>>> +        if (page_is_pfmemalloc(page))
>>> +            xdp_buff_set_frag_pfmemalloc(xdp);
>>> +
>>> +        shinfo->xdp_frags_size += len;
>>> +    }
>>> +
>>> +    *xdp_frags_truesize = xdp_frags_truesz;
>>> +    return 0;
>>> +}
>>> +
>>>   static struct sk_buff *receive_mergeable(struct net_device *dev,
>>>                                            struct virtnet_info *vi,
>>>                                            struct receive_queue *rq,
>
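
[For readers following the thread: a minimal sketch of the change Heng Qi
agreed to above, i.e. setting the frags flag once instead of testing it on
every loop iteration. The exact placement inside the *num_buf > 1 branch is
an assumption for illustration, not the posted follow-up version.]

    /* Sketch only: every pass through the loop adds a fragment, so the
     * frags flag can be set once, as soon as we know more buffers follow.
     * Placement here is hypothetical, not taken from the next version
     * of the patch.
     */
    if (*num_buf > 1) {
        shinfo = xdp_get_shared_info_from_buff(xdp);
        shinfo->nr_frags = 0;
        shinfo->xdp_frags_size = 0;
        xdp_buff_set_frags_flag(xdp);
    }

    while ((--*num_buf) >= 1) {
        /* ... fill shinfo->frags[] exactly as in the patch above,
         * with the per-iteration xdp_buff_has_frags() check dropped ...
         */
    }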
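[And a minimal sketch of the truesize-packing idea Jason raises above,
assuming the existing get_mergeable_buf_len() and mergeable_len_to_ctx()
helpers from drivers/net/virtio_net.c. The "len + room" form is an
illustration of packing the whole page as the truesize, not code from
the thread.]

    /* Sketch only: when XDP room is reserved, get_mergeable_buf_len()
     * returns PAGE_SIZE - room, so len + room is the whole page, i.e.
     * the real truesize of this receive buffer.  Packing that into the
     * context would let the RX path use mergeable_ctx_to_truesize(ctx)
     * directly, without re-adding headroom + tailroom.
     */
    len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
    /* ... page_frag allocation as in add_recvbuf_mergeable() ... */
    ctx = mergeable_len_to_ctx(len + room, headroom);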