Message-ID: <6409f8bf71c9e_1abbab2088e@willemb.c.googlers.com.notmuch>
Date: Thu, 09 Mar 2023 10:18:23 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: 沈安琪(凛玥) <amy.saq@...group.com>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
netdev@...r.kernel.org
Cc: mst@...hat.com, davem@...emloft.net, jasowang@...hat.com,
谈鉴锋 <henry.tjf@...group.com>
Subject: Re: [PATCH v3] net/packet: support mergeable feature of virtio
沈安琪(凛玥) wrote:
>
> On 2023/3/7 at 11:49 PM, Willem de Bruijn wrote:
> > 沈安琪(凛玥) wrote:
> >> From: Jianfeng Tan <henry.tjf@...group.com>
> >>
> >> Packet sockets, like tap, can be used as the backend for kernel vhost.
> >> In packet sockets, the virtio net header size is currently hardcoded to
> >> the size of struct virtio_net_hdr, which is 10 bytes. That is not
> >> always correct: some virtio features, such as mrg_rxbuf, need the virtio
> >> net header to be 12 bytes long.
> >>
> >> Mergeable buffers are a virtio feature worth supporting: without
> >> mrg_rxbuf, packets larger than one mbuf are dropped in the vhost
> >> worker's handle_rx, yet large packets cannot be avoided and increasing
> >> the mbuf size is not economical.
> >>
> >> With the mergeable feature enabled by virtio-user, packet sockets with
> >> the hardcoded 10-byte virtio net header parse the MAC header incorrectly
> >> in packet_snd, taking the last two bytes of the virtio net header as
> >> part of the MAC header.
> >> This incorrect parsing causes the packet to be dropped later, when the
> >> underlying device validates the Ethernet header on receive.
> >>
> >> This patch adds a vnet_hdr_sz field, placed in an existing hole in
> >> struct packet_sock, to record the virtio net header size in use, and a
> >> new sockopt PACKET_VNET_HDR_SZ to set it, so that packet sockets know
> >> the exact length of the virtio net header that virtio-user supplies.
> >> packet_snd, tpacket_snd and packet_recvmsg then read vnet_hdr_sz from
> >> the corresponding packet_sock instead of using the hardcoded size, and
> >> parse the MAC header correctly so that packets are not mistakenly
> >> dropped.
> >>
> >> In addition, the has_vnet_hdr field in struct packet_sock is removed,
> >> since all the information it provides is covered by vnet_hdr_sz: a
> >> packet socket has a vnet header if and only if its vnet_hdr_sz is not
> >> zero.
> >>
> >> Signed-off-by: Jianfeng Tan <henry.tjf@...group.com>
> >> Co-developed-by: Anqi Shen <amy.saq@...group.com>
> >> Signed-off-by: Anqi Shen <amy.saq@...group.com>
> >> ---
> >> diff --git a/net/packet/internal.h b/net/packet/internal.h
> >> index 48af35b..9b52d93 100644
> >> --- a/net/packet/internal.h
> >> +++ b/net/packet/internal.h
> >> @@ -119,9 +119,9 @@ struct packet_sock {
> >> unsigned int running; /* bind_lock must be held */
> >> unsigned int auxdata:1, /* writer must hold sock lock */
> >> origdev:1,
> >> - has_vnet_hdr:1,
> >> tp_loss:1,
> >> - tp_tx_has_off:1;
> >> + tp_tx_has_off:1,
> >> + vnet_hdr_sz:8;
> > Just use a separate u8 variable, rather than 8 bits in a u32.
> >
> >> int pressure;
> >> int ifindex; /* bound device */
>
>
> We plan to add
>
> + u8 vnet_hdr_sz:8;
>
> here.
> Is this a suitable place for the field, so that the cacheline layout is not disturbed?
When in doubt, use pahole (`pahole -C packet_sock net/packet/af_packet.o`).
There currently is a 27-bit hole before pressure. That would be a good spot.
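
To illustrate why that hole is a good spot, here is a minimal user-space
sketch. The structs demo_before and demo_after are hypothetical, heavily
trimmed stand-ins for struct packet_sock (only the members visible in the
hunk above), and the helper names are made up for this example. It assumes
the usual GCC/Clang bitfield layout on Linux; other ABIs may place the
fields differently, and pahole on the real object file remains the
authoritative check.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the current layout (not the real struct). */
struct demo_before {
	unsigned int running;
	unsigned int auxdata:1,
		     origdev:1,
		     has_vnet_hdr:1,
		     tp_loss:1,
		     tp_tx_has_off:1;	/* 5 of 32 bits used: 27-bit hole */
	int pressure;
	int ifindex;
};

/* Hypothetical stand-in with has_vnet_hdr replaced by a separate u8. */
struct demo_after {
	unsigned int running;
	unsigned int auxdata:1,
		     origdev:1,
		     tp_loss:1,
		     tp_tx_has_off:1;
	uint8_t vnet_hdr_sz;		/* assumed to land in the hole */
	int pressure;
	int ifindex;
};

/* Returns 1 if the new field fits in padding: nothing after it moved
 * and the struct did not grow. */
int vnet_hdr_sz_fits_in_hole(void)
{
	return sizeof(struct demo_after) == sizeof(struct demo_before) &&
	       offsetof(struct demo_after, pressure) ==
	       offsetof(struct demo_before, pressure);
}

/* Prints the offsets that pahole would report for these toy structs. */
void print_layout(void)
{
	printf("before: pressure at %zu, size %zu\n",
	       offsetof(struct demo_before, pressure),
	       sizeof(struct demo_before));
	printf("after:  vnet_hdr_sz at %zu, pressure at %zu, size %zu\n",
	       offsetof(struct demo_after, vnet_hdr_sz),
	       offsetof(struct demo_after, pressure),
	       sizeof(struct demo_after));
}
```

If vnet_hdr_sz_fits_in_hole() returns 1, the u8 consumed only padding, so
the members that follow keep their offsets and cacheline placement.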
>
> >> __be16 num;
> >> --
> >> 1.8.3.1
> >>