[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240104005327.a6747bpbqt24xlbo@skbuf>
Date: Thu, 4 Jan 2024 02:53:27 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Linus Walleij <linus.walleij@...aro.org>
Cc: Hans Ulli Kroll <ulli.kroll@...glemail.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Household Cang <canghousehold@....com>,
Romain Gantois <romain.gantois@...tlin.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net v5 2/2] net: ethernet: cortina: Bypass checksumming
engine of alien ethertypes
On Tue, Jan 02, 2024 at 09:34:26PM +0100, Linus Walleij wrote:
> We had workarounds were the ethernet checksumming engine would be bypassed
s/were/where/
> for larger frames, this fixed devices using DSA, but regressed devices
> where the ethernet was connected directly to a PHY.
>
> The devices with a PHY connected directly can't handle large frames
> either way, with or without bypass. Looking at the size of the frame
> is probably just wrong.
"Looking at the size of the frame is probably just wrong." yet you keep it.
Not only is this confusing for you to say this, but I believe that the
skb->len check is the _only_ thing that is needed. Explanation below.
> Rework the workaround such that we don't activate the checksumming engine if
> the ethertype inside the actual frame is something else than 0x0800
> (IPv4) or 0x86dd (IPv6). These are the only frames the checksumming engine
> can actually handle. VLAN framing (0x8100) also works fine.
Premise:
This driver does not set NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM (or anything)
in dev->vlan_features. Upper interface drivers which look at dev->vlan_features
in order to determine their own features are 8021q and DSA.
Packets transmitted through stacked interfaces have 3 checksumming points.
Two in software, during validate_xmit_skb() on the respective netdev,
depending on its features and skb->ip_summed, and one in the xmit
procedure of the hardware driver - gmac_start_xmit().
In short, I believe that the code which you have added to inspect the
ethertype - and based on that to avoid the "if (skb->ip_summed == CHECKSUM_PARTIAL)"
test - is bogus (a cost you are paying for nothing).
I'm saying this because I think that those
"(ethertype != ETH_P_IP && ethertype != ETH_P_IPV6)" frames wouldn't
have entered the "skb->ip_summed == CHECKSUM_PARTIAL" test anyway.
DSA-tagged frames should come with CHECKSUM_NONE, having been checksummed
in software already, by the first validate_xmit_skb() - DSA not having
inherited the checksum offload feature, because it's not in dev->vlan_features.
Coincidentally, this is also the reason why in your tests, DSA-tagged
TCP/UDP traffic still has a proper checksum, despite you bypassing the
hardware offload, and no longer calling skb_checksum_help() from the
driver. It was never needed, because the checksum was always already
calculated.
And VLAN traffic should also come with CHECKSUM_NONE, for the same reason.
The one difference between DSA and VLAN is that for DSA, you sometimes
set TSS_BYPASS_BIT (for large frames) and for VLAN you never do.
>
> We can't inspect skb->protocol because DSA frames will sometimes have a
> custom ethertype despite skb->protocol is e.g. 0x0800.
>
> If the frame is ALSO over the size of an ordinary ethernet frame,
> we will actively bypass the checksumming engine. (Always doing this
> makes the hardware unstable.)
>
> After this both devices with direct ethernet attached such as D-Link
> DNS-313 and devices with a DSA switch with a custom ethertype such as
> D-Link DIR-685 work fine.
>
> Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
> Signed-off-by: Linus Walleij <linus.walleij@...aro.org>
> ---
> drivers/net/ethernet/cortina/gemini.c | 33 +++++++++++++++++++++++++--------
> 1 file changed, 25 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> index 5e399c6e095b..68da4ae26248 100644
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1142,22 +1143,38 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
> struct gmac_txdesc *txd;
> skb_frag_t *skb_frag;
> dma_addr_t mapping;
> + u16 ethertype;
> void *buffer;
>
> /* TODO: implement proper TSO using MTU in word3 */
> word1 = skb->len;
> word3 = SOF_BIT | skb->len;
>
> - if (skb->ip_summed == CHECKSUM_PARTIAL) {
> + /* Dig out the the ethertype actually in the buffer and not what the
> + * protocol claims to be. This is the raw data that the checksumming
> + * offload engine will have to deal with.
> + */
> + ethertype = ntohs(eth_header_parse_protocol(skb));
> + /* This is the only VLAN type supported by this hardware so check for
> + * that: the checksumming engine can handle IP and IPv6 inside 802.1Q.
> + */
> + if (ethertype == ETH_P_8021Q)
> + ethertype = ntohs(__vlan_get_protocol(skb, htons(ethertype), NULL));
Random fact: if you store "ethertype" as __be16 and perform htons() on the
constant value instead, the htons() operation will be performed at compile
time and should result in fewer instructions per packet in the fast path.
> +
> + if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) {
> + /* Hardware offloaded checksumming isn't working on non-IP frames.
> + * This happens for example on some DSA switches using a custom
> + * ethertype. When a frame gets bigger than a standard ethernet
> + * frame, it also needs to actively bypass the checksumming engine.
> + * There is no clear explanation to why it is like this, the
> + * reference manual has left the TSS completely undocumented.
> + */
> + if (skb->len > ETH_FRAME_LEN)
> + word1 |= TSS_BYPASS_BIT;
Do you know what "TSS_BYPASS_BIT" does, exactly?
> + } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
> int tcp = 0;
>
> - /* We do not switch off the checksumming on non TCP/UDP
> - * frames: as is shown from tests, the checksumming engine
> - * is smart enough to see that a frame is not actually TCP
> - * or UDP and then just pass it through without any changes
> - * to the frame.
> - */
> - if (skb->protocol == htons(ETH_P_IP)) {
> + if (ethertype == ETH_P_IP) {
> word1 |= TSS_IP_CHKSUM_BIT;
> tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
> } else { /* IPv6 */
>
> --
> 2.34.1
>
Powered by blists - more mailing lists