[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpUzVH9MkXTB5XNsQobodrN6ZKYj1RKNMcDNEMivZXfGfA@mail.gmail.com>
Date: Mon, 3 Dec 2018 22:48:23 -0800
From: Cong Wang <xiyou.wangcong@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Saeed Mahameed <saeedm@...lanox.com>,
Tariq Toukan <tariqt@...lanox.com>
Subject: Re: [Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames
On Mon, Dec 3, 2018 at 10:34 PM Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Mon, Dec 3, 2018 at 10:14 PM Cong Wang <xiyou.wangcong@...il.com> wrote:
> >
> > When an ethernet frame is padded to meet the minimum ethernet frame
> > size, the padding octets are not covered by the hardware checksum.
> > Fortunately the padding octets are ususally zero's, which don't affect
> > checksum. However, we have a switch which pads non-zero octets, this
> > causes kernel hardware checksum fault repeatedly.
> >
> > Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"),
> > skb checksum was forced to be CHECKSUM_NONE when padding is detected.
> > After it, we need to keep skb->csum updated, like what we do for RXFCS.
> > However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP
> > headers, it is not worthy the effort as the packets are so small that
> > CHECKSUM_COMPLETE can't save anything.
> >
> > I tested this patch with RXFCS on and off, it works fine without any
> > warning in both cases.
> >
> > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"),
> > Cc: Saeed Mahameed <saeedm@...lanox.com>
> > Cc: Eric Dumazet <edumazet@...gle.com>
> > Cc: Tariq Toukan <tariqt@...lanox.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
> > ---
> > .../net/ethernet/mellanox/mlx5/core/en_rx.c | 22 ++++++++++++++++++-
> > 1 file changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > index 624eed345b5d..1c153b8091da 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > @@ -732,6 +732,13 @@ static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto)
> > ((struct ipv6hdr *)ip_p)->nexthdr;
> > }
> >
> > +static bool is_short_frame(struct sk_buff *skb, bool has_fcs)
> > +{
> > + u32 frame_len = has_fcs ? skb->len - ETH_FCS_LEN : skb->len;
> > +
> > + return frame_len <= ETH_ZLEN;
> > +}
> > +
> > static inline void mlx5e_handle_csum(struct net_device *netdev,
> > struct mlx5_cqe64 *cqe,
> > struct mlx5e_rq *rq,
> > @@ -755,9 +762,22 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
> > goto csum_unnecessary;
> >
> > if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
> > + bool has_fcs = !!(netdev->features & NETIF_F_RXFCS);
> > +
> > if (unlikely(get_ip_proto(skb, network_depth, proto) == IPPROTO_SCTP))
> > goto csum_unnecessary;
> >
> > + /* CQE csum doesn't cover padding octets in short ethernet
> > + * frames. And the pad field is appended prior to calculating
> > + * and appending the FCS field.
> > + *
> > + * Detecting these padded frames requires to verify and parse
> > + * IP headers, so we simply force all those small frames to be
> > + * CHECKSUM_NONE even if they are not padded.
> > + */
> > + if (unlikely(is_short_frame(skb, has_fcs)))
> > + goto csum_none;
>
> Should not this go to csum_unnecessary instead ?
I don't see why we don't even want to validate the protocol checksum
here.
Any reason you are suggesting so?
>
> Probably not a big deal, but small UDP frames might hit this code path,
> so ethtool -S would show a lot of csum_none which could confuse mlx5 owners.
Why it is confusing? We intentionally bypass hardware checksum
and let protocol layer validate it.
>
> BTW,
> It looks like mlx5 prefers delivering skbs with CHECKSUM_COMPLETE instead of
> CHECKSUM_UNNECESSARY but at least for ipv6 CHECKSUM_UNNECESSARY
> would be slightly faster, by avoiding various csum_partial() costs
> when headers are parsed.
Sure, it is certainly faster if you don't want to validate L4 checksum.
The only question is why we don't either validate hardware checksum
or L4 checksum?
Thanks.
Powered by blists - more mailing lists