[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aFuEHpbjGILWich1@pengutronix.de>
Date: Wed, 25 Jun 2025 07:07:42 +0200
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Jakub Kicinski <kuba@...nel.org>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, kernel@...gutronix.de,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
Maxime Chevallier <maxime.chevallier@...tlin.com>
Subject: Re: [PATCH net-next v4 0/4] net: selftest: improve test string
formatting and checksum handling
On Tue, Jun 24, 2025 at 09:09:53AM -0700, Jakub Kicinski wrote:
> On Tue, 24 Jun 2025 10:26:02 +0200 Oleksij Rempel wrote:
> > On Mon, Jun 23, 2025 at 10:19:20AM -0700, Jakub Kicinski wrote:
> > > On Mon, 23 Jun 2025 13:45:41 +0200 Oleksij Rempel wrote:
> > > > On Sat, Jun 21, 2025 at 06:46:00AM -0700, Jakub Kicinski wrote:
> > > > > On Fri, 20 Jun 2025 12:53:23 +0200 Oleksij Rempel wrote:
> > > > > > Let me first describe the setup where this issue was observed and my findings.
> > > > > > The problem occurs on a system utilizing a Microchip DSA driver with an STMMAC
> > > > > > Ethernet controller attached to the CPU port.
> > > > > >
> > > > > > In the current selftest implementation, the TCP checksum validation fails,
> > > > > > while the UDP test passes. The existing code prepares the skb for hardware
> > > > > > checksum offload by setting skb->ip_summed = CHECKSUM_PARTIAL. For TCP, it sets
> > > > > > the thdr->check field to the complement of the pseudo-header checksum, and for
> > > > > > UDP, it uses udp4_hwcsum. If I understand it correct, this configuration tells
> > > > > > the kernel that the hardware should perform the checksum calculation.
> > > > > >
> > > > > > However, during testing, I noticed that "rx-checksumming" is enabled by default
> > > > > > on the CPU port, and this leads to the TCP test failure. Only after disabling
> > > > > > "rx-checksumming" on the CPU port did the selftest pass. This suggests that the
> > > > > > issue is specifically related to the hardware checksum offload mechanism in
> > > > > > this particular setup. The behavior indicates that something on the path
> > > > > > recalculated the checksum incorrectly.
> > > > >
> > > > > Interesting, that sounds like the smoking gun. When rx-checksumming
> > > > > is enabled the packet still reaches the stack right?
> > > >
> > > > No. It looks like this packets are just silently dropped, before they was
> > > > seen by the stack. The only counter which confirms presence of this
> > > > frames is HW specific mmc_rx_tcp_err. But it will be increasing even if
> > > > rx-checksumming is disabled and packets are forwarded to the stack.
> > >
> > > If you happen to have the docs for the STMMAC instantiation in the SoC
> > > it'd be good to check if discarding frames with bad csum can be
> > > disabled. Various monitoring systems will expect the L4 checksum errors
> > > to appear in nstat, not some obscure ethtool -S counter.
> >
> > Ack. I will it add to my todo.
> >
> > For proper understanding of STMMAC and other drivers, here is how I currently
> > understand the expected behavior on the receive path, with some open questions:
> >
> > Receive Path Checksum Scenarios
> >
> > * No Hardware Verification
> > * The hardware is not configured for RX checksum offload
> > or does not support the packet type, passing the packet to the driver
> > as-is.
> > * Expected driver behavior: The driver should set the packet's state to
> > `CHECKSUM_NONE`, signaling to the kernel that a software checksum
> > validation is required.
> >
> > * Hardware Verifies and Reports All Frames (Ideal Linux Behavior)
> > * The hardware is configured not to drop packets with bad checksums.
> > It verifies the checksum of each packet and reports the result (good
> > or bad) in a status field on the DMA descriptor.
> > * Expected driver behavior: The driver must read the status for every
> > packet.
> > * If the hardware reports the checksum is good, the driver should set
> > the packet's state to `CHECKSUM_UNNECESSARY`.
> > * If the hardware reports the checksum is bad, the driver should set
> > the packet's state to `CHECKSUM_NONE` and still pass it to the
> > kernel.
> > * Open Questions:
> > * When the hardware reports a bad checksum in this mode, should the
> > driver increment `rx_crc_errors` immediately? Or should it only set
> > the packet's state to `CHECKSUM_NONE` and let the kernel stack find
> > the error and increment the counter, in order to avoid
> > double-counting the same error?
>
> Driver can increment its local counter. It doesn't matter much.
>
> But one important distinction, we're talking about layer 3 and up
> checksums. IPv4 checksum, and TCP/UDP checksums. Those are not CRC.
> The HW _should_ discard packets with bad CRC / Layer 2 checksum
> unless the NETIF_F_RXALL feature is enabled.
>
> > * Hardware Verifies and Drops on Error
> > * The hardware's RX checksum engine is active and configured to
> > automatically discard any packet with an incorrect checksum before it is
> > delivered to the driver.
> > * Open Questions:
> >
> > * When reporting these hardware-level drops, what is the most
> > appropriate existing standard `net_device_stats` counter to use
> > (e.g., `rx_crc_errors`, `rx_errors`)?
>
> I'd say rx_errors, most likely to be noticed.
>
> > * If no existing standard counter is a good semantic fit, add new
> > standard counters?
>
> Given this is behavior we don't want to encourage I think adding a
> standard stat would send the wrong signal.
>
> > * If the "drop on error" feature cannot be disabled independently,
> > and reporting the error via a standard counter is not feasible,
> > does this imply that the entire RX checksum offload feature must be
> > disabled to ensure error visibility?
>
> Probably not, users should also monitor rx_errors.
>
> > * Hardware Provides Full Packet Checksum (`CHECKSUM_COMPLETE`)
> > * The hardware calculates a single checksum over the entire packet and
> > provides this value to the driver, without needing to parse the
> > L3/L4 headers.
>
> Not entire, it skips the base Ethernet header (first 14 bytes)
>
> > * Expected driver behavior: The driver should place the checksum provided
> > by the hardware into the `skb->csum` field and set the packet's state
> > to `CHECKSUM_COMPLETE`.
>
> Correct.
Hm... at least part of this behavior can be verified with self-tests:
- Send a TCP packet with an intentionally incorrect checksum,
ensuring its state is CHECKSUM_NONE so the transmit path doesn't change it.
- Test if we receive this packet back via the PHY loopback.
- If received: The test checks the ip_summed status of the
received packet.
- A status of CHECKSUM_NONE indicates the hardware correctly passed
the packet up without validating it.
- A status of CHECKSUM_UNNECESSARY indicates a failure, as the hardware
or driver incorrectly marked a bad checksum as good.
- If not received (after a timeout): The test then checks the device's
error statistics.
- If the rx_errors counter has incremented
- If the counter has not incremented, the packet was lost for an unknown
reason, and the test fails.
What do you think?
Best Regards,
Oleksij
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
Powered by blists - more mailing lists