[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20231108152736.euqbedurxvq2wben@skbuf>
Date: Wed, 8 Nov 2023 17:27:36 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Linus Walleij <linus.walleij@...aro.org>
Cc: Hans Ulli Kroll <ulli.kroll@...glemail.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Michał Mirosław <mirq-linux@...e.qmqm.pl>,
Andrew Lunn <andrew@...n.ch>, linux-arm-kernel@...ts.infradead.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net v3 3/4] net: ethernet: cortina: Handle large frames
On Tue, Nov 07, 2023 at 10:54:28AM +0100, Linus Walleij wrote:
> The Gemini ethernet controller provides hardware checksumming
> for frames up to 1514 bytes including ethernet headers but not
> FCS.
>
> If we start sending bigger frames (after first bumping up the MTU
> on both interfaces sending and receiveing the frames), truncated
s/receiveing/receiving/
> packets start to appear on the target such as in this tcpdump
> resulting from ping -s 1474:
>
> 23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown),
> ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing!
> (tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502)
> OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482
>
> If we bypass the hardware checksumming and provide a software
> fallback, everything starts working fine up to the max TX MTU
> of 2047 bytes, for example ping -s2000 192.168.1.2:
>
> 00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown),
> ethertype IPv4 (0x0800), length 2042:
> (tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028)
> Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008
>
> The bit enabling to bypass hardware checksum (or any of the
> "TSS" bits) are undocumented in the hardware reference manual.
> The entire hardware checksum unit appears undocumented. The
> conclusion that we need to use the "bypass" bit was found by
> trial-and-error.
>
> Since no hardware checksum will happen, we slot in a software
> checksum fallback.
>
> Check for the condition where we need to compute checksum on the
> skb with either hardware or software using == CHECKSUM_PARTIAL instead
> of != CHECKSUM_NONE which is an incomplete check according to
> <linux/skbuff.h>.
>
> We delete the code disabling the hardware checksum for large
> MTU:s: this is suboptimal because it will disable hardware
"MTUs" maybe?
> checksumming also on small packets which the checksumming
> engine can handle just fine, which is a waste of resources.
>
> On the D-Link DIR-685 router this fixes a bug on the conduit
> interface to the RTL8366RB DSA switch: as the switch needs to add
> space for its tag it increases the MTU on the conduit interface
> to 1504 and that means that when the router sends packages
> of 1500 bytes these get an extra 4 bytes of DSA tag and the
> transfer fails because of the erroneous hardware checksumming,
> affecting such basic functionality as the LuCI web interface.
>
> Suggested-by: Vladimir Oltean <olteanv@...il.com>
> Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
> Signed-off-by: Linus Walleij <linus.walleij@...aro.org>
> ---
> drivers/net/ethernet/cortina/gemini.c | 34 +++++++++++++++++++++++-----------
> 1 file changed, 23 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> index b21a94b4ab5c..78287cfcbf63 100644
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1145,6 +1145,7 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
> dma_addr_t mapping;
> unsigned short mtu;
> void *buffer;
> + int ret;
>
> mtu = ETH_HLEN;
> mtu += netdev->mtu;
> @@ -1159,9 +1160,30 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
> word3 |= mtu;
> }
>
> - if (skb->ip_summed != CHECKSUM_NONE) {
> + if (skb->len >= ETH_FRAME_LEN) {
> + /* Hardware offloaded checksumming isn't working on frames
> + * bigger than 1514 bytes. A hypothesis about this is that the
> + * checksum buffer is only 1518 bytes, so when the frames get
> + * bigger they get truncated, or the last few bytes get
> + * overwritten by the FCS.
> + *
> + * Just use software checksumming and bypass on bigger frames.
> + */
> + if (skb->ip_summed == CHECKSUM_PARTIAL) {
> + ret = skb_checksum_help(skb);
> + if (ret)
> + return ret;
> + }
> + word1 |= TSS_BYPASS_BIT;
> + } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
> int tcp = 0;
>
> + /* We do not switch off the checksumming on non TCP/UDP
> + * frames: as is shown from tests, the checksumming engine
> + * is smart enough to see that a frame is not actually TCP
> + * or UDP and then just pass it through without any changes
> + * to the frame.
> + */
> if (skb->protocol == htons(ETH_P_IP)) {
> word1 |= TSS_IP_CHKSUM_BIT;
> tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
> @@ -1978,15 +2000,6 @@ static int gmac_change_mtu(struct net_device *netdev, int new_mtu)
> return 0;
> }
>
> -static netdev_features_t gmac_fix_features(struct net_device *netdev,
> - netdev_features_t features)
> -{
> - if (netdev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK)
> - features &= ~GMAC_OFFLOAD_FEATURES;
> -
> - return features;
> -}
> -
I think this entire ndo_fix_features() can be indeed removed, but your
justification was not immediately convincing. I'd point out that after
your patch 1/4 "net: ethernet: cortina: Fix MTU max setting", you
actually made this dead code, because netdev->mtu can't be larger than
netdev->max_mtu.
If you reverse the patch order a bit, such that "net: ethernet: cortina:
Handle large frames" comes first, I think it would be much more logical
for the removal of gmac_fix_features() to be part of the commit
"net: ethernet: cortina: Fix MTU max setting", with the simple
justification: the new MTU makes the code stop having any role.
> static int gmac_set_features(struct net_device *netdev,
> netdev_features_t features)
> {
> @@ -2212,7 +2225,6 @@ static const struct net_device_ops gmac_351x_ops = {
> .ndo_set_mac_address = gmac_set_mac_address,
> .ndo_get_stats64 = gmac_get_stats64,
> .ndo_change_mtu = gmac_change_mtu,
> - .ndo_fix_features = gmac_fix_features,
> .ndo_set_features = gmac_set_features,
> };
>
>
> --
> 2.34.1
>
Powered by blists - more mailing lists