netdev - Re: [PATCH net-next v4 0/4] net: selftest: improve test string formatting and checksum handling

Message-ID: <aFuEHpbjGILWich1@pengutronix.de>
Date: Wed, 25 Jun 2025 07:07:42 +0200
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Jakub Kicinski <kuba@...nel.org>
Cc: "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>, kernel@...gutronix.de,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	Maxime Chevallier <maxime.chevallier@...tlin.com>
Subject: Re: [PATCH net-next v4 0/4] net: selftest: improve test string
 formatting and checksum handling

On Tue, Jun 24, 2025 at 09:09:53AM -0700, Jakub Kicinski wrote:
> On Tue, 24 Jun 2025 10:26:02 +0200 Oleksij Rempel wrote:
> > On Mon, Jun 23, 2025 at 10:19:20AM -0700, Jakub Kicinski wrote:
> > > On Mon, 23 Jun 2025 13:45:41 +0200 Oleksij Rempel wrote:  
> > > > On Sat, Jun 21, 2025 at 06:46:00AM -0700, Jakub Kicinski wrote:  
> > > > > On Fri, 20 Jun 2025 12:53:23 +0200 Oleksij Rempel wrote:  
> > > > > > Let me first describe the setup where this issue was observed and my findings.
> > > > > > The problem occurs on a system utilizing a Microchip DSA driver with an STMMAC
> > > > > > Ethernet controller attached to the CPU port.
> > > > > > 
> > > > > > In the current selftest implementation, the TCP checksum validation fails,
> > > > > > while the UDP test passes. The existing code prepares the skb for hardware
> > > > > > checksum offload by setting skb->ip_summed = CHECKSUM_PARTIAL. For TCP, it sets
> > > > > > the thdr->check field to the complement of the pseudo-header checksum, and for
> > > > > > > UDP, it uses udp4_hwcsum. If I understand it correctly, this configuration tells
> > > > > > the kernel that the hardware should perform the checksum calculation.
> > > > > > 
> > > > > > However, during testing, I noticed that "rx-checksumming" is enabled by default
> > > > > > on the CPU port, and this leads to the TCP test failure.  Only after disabling
> > > > > > "rx-checksumming" on the CPU port did the selftest pass. This suggests that the
> > > > > > issue is specifically related to the hardware checksum offload mechanism in
> > > > > > this particular setup. The behavior indicates that something on the path
> > > > > > recalculated the checksum incorrectly.    
> > > > > 
> > > > > Interesting, that sounds like the smoking gun. When rx-checksumming 
> > > > > is enabled the packet still reaches the stack right?    
> > > > 
> > > > > No. It looks like these packets are just silently dropped before they are
> > > > > seen by the stack. The only counter that confirms the presence of these
> > > > > frames is the HW-specific mmc_rx_tcp_err. But it keeps increasing even if
> > > > > rx-checksumming is disabled and packets are forwarded to the stack.
> > > 
> > > If you happen to have the docs for the STMMAC instantiation in the SoC
> > > it'd be good to check if discarding frames with bad csum can be
> > > disabled. Various monitoring systems will expect the L4 checksum errors
> > > to appear in nstat, not some obscure ethtool -S counter.  
> > 
> > > Ack. I will add it to my todo list.
> > 
> > For proper understanding of STMMAC and other drivers, here is how I currently
> > understand the expected behavior on the receive path, with some open questions:
> > 
> > Receive Path Checksum Scenarios
> > 
> > * No Hardware Verification
> >     * The hardware is not configured for RX checksum offload
> >       or does not support the packet type, passing the packet to the driver
> >       as-is.
> >     * Expected driver behavior: The driver should set the packet's state to
> >       `CHECKSUM_NONE`, signaling to the kernel that a software checksum
> >       validation is required.
> > 
> > * Hardware Verifies and Reports All Frames (Ideal Linux Behavior)
> >     * The hardware is configured not to drop packets with bad checksums.
> >       It verifies the checksum of each packet and reports the result (good
> >       or bad) in a status field on the DMA descriptor.
> >     * Expected driver behavior: The driver must read the status for every
> >       packet.
> >         * If the hardware reports the checksum is good, the driver should set
> >           the packet's state to `CHECKSUM_UNNECESSARY`.
> >         * If the hardware reports the checksum is bad, the driver should set
> >           the packet's state to `CHECKSUM_NONE` and still pass it to the
> >           kernel.
> >     * Open Questions:
> >         * When the hardware reports a bad checksum in this mode, should the
> >           driver increment `rx_crc_errors` immediately? Or should it only set
> >           the packet's state to `CHECKSUM_NONE` and let the kernel stack find
> >           the error and increment the counter, in order to avoid
> >           double-counting the same error?
> 
> Driver can increment its local counter. It doesn't matter much.
> 
> But one important distinction, we're talking about layer 3 and up
> checksums. IPv4 checksum, and TCP/UDP checksums. Those are not CRC.
> The HW _should_ discard packets with bad CRC / Layer 2 checksum
> unless the NETIF_F_RXALL feature is enabled.
> 
> > * Hardware Verifies and Drops on Error
> >     * The hardware's RX checksum engine is active and configured to
> >       automatically discard any packet with an incorrect checksum before it is
> >       delivered to the driver.
> >     * Open Questions:
> > 
> >         * When reporting these hardware-level drops, what is the most
> >           appropriate existing standard `net_device_stats` counter to use
> >           (e.g., `rx_crc_errors`, `rx_errors`)?
> 
> I'd say rx_errors, most likely to be noticed.
> 
> >         * If no existing standard counter is a good semantic fit, add new
> >           standard counters?
> 
> Given this is behavior we don't want to encourage I think adding a
> standard stat would send the wrong signal.
> 
> >         * If the "drop on error" feature cannot be disabled independently,
> >           and reporting the error via a standard counter is not feasible,
> >           does this imply that the entire RX checksum offload feature must be
> >           disabled to ensure error visibility?
> 
> Probably not, users should also monitor rx_errors.
> 
> > * Hardware Provides Full Packet Checksum (`CHECKSUM_COMPLETE`)
> >     * The hardware calculates a single checksum over the entire packet and
> >       provides this value to the driver, without needing to parse the
> >       L3/L4 headers.
> 
> Not entire, it skips the base Ethernet header (first 14 bytes)
> 
> >     * Expected driver behavior: The driver should place the checksum provided
> >       by the hardware into the `skb->csum` field and set the packet's state
> >       to `CHECKSUM_COMPLETE`.
> 
> Correct.
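
To make the CHECKSUM_COMPLETE coverage point concrete, here is a minimal
user-space sketch of a 16-bit ones'-complement sum that starts after the
14-byte base Ethernet header. The helper names (csum_fold16,
rx_complete_csum) and the ETH_HDR_LEN macro are made up for illustration.
The kernel's real representation (a 32-bit partial sum in skb->csum) differs
in detail, so this only shows which bytes are covered, not what a driver
would actually store.

```c
#include <stddef.h>
#include <stdint.h>

/* Base Ethernet header: dst MAC (6) + src MAC (6) + EtherType (2). */
#define ETH_HDR_LEN 14

/* Fold a 32-bit ones'-complement accumulator down to 16 bits. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Hypothetical helper: sum all bytes after the base Ethernet header as
 * big-endian 16-bit words, padding an odd trailing byte with zero.
 * The 32-bit accumulator is ample for normal frame sizes.
 */
static uint16_t rx_complete_csum(const uint8_t *frame, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	if (len <= ETH_HDR_LEN)
		return 0;

	for (i = ETH_HDR_LEN; i + 1 < len; i += 2)
		sum += ((uint32_t)frame[i] << 8) | frame[i + 1];
	if (i < len)
		sum += (uint32_t)frame[i] << 8;

	return csum_fold16(sum);
}
```

Note that the header bytes never contribute to the result, which is the
distinction from "entire packet" made above.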

Hm... at least part of this behavior can be verified with self-tests:

- Send a TCP packet with an intentionally incorrect checksum,
  ensuring its state is CHECKSUM_NONE so the transmit path doesn't change it.
- Test if we receive this packet back via the PHY loopback.
   - If received: The test checks the ip_summed status of the
     received packet.
      - A status of CHECKSUM_NONE indicates the hardware correctly passed
        the packet up without validating it.
      - A status of CHECKSUM_UNNECESSARY indicates a failure, as the hardware
        or driver incorrectly marked a bad checksum as good.
   - If not received (after a timeout): The test then checks the device's
     error statistics.
      - If the rx_errors counter has incremented, the hardware dropped the
        bad packet and accounted for it, and the test passes.
      - If the counter has not incremented, the packet was lost for an unknown
        reason, and the test fails.
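
As a rough sketch, the decision logic of such a test could look like the
user-space C below. This is not selftest code: the enum and function names
are hypothetical, the two checksum states only mirror CHECKSUM_NONE and
CHECKSUM_UNNECESSARY by name, and treating "dropped but counted in
rx_errors" as a pass is my assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ip_summed states discussed above. */
enum rx_csum_state {
	RX_CSUM_NONE,		/* mirrors CHECKSUM_NONE */
	RX_CSUM_UNNECESSARY,	/* mirrors CHECKSUM_UNNECESSARY */
};

/* Hypothetical verdicts for the bad-checksum loopback test. */
enum bad_csum_verdict {
	PASS_NOT_VALIDATED,	/* received, left for the stack to verify */
	FAIL_BAD_MARKED_GOOD,	/* received, but bad checksum marked as good */
	PASS_DROPPED_COUNTED,	/* dropped by HW, rx_errors incremented (assumed pass) */
	FAIL_LOST,		/* dropped and no counter moved */
};

static enum bad_csum_verdict
classify_bad_csum_result(bool received, enum rx_csum_state state,
			 uint64_t rx_errors_before, uint64_t rx_errors_after)
{
	if (received) {
		/* The packet came back: only its checksum state matters. */
		return state == RX_CSUM_UNNECESSARY ? FAIL_BAD_MARKED_GOOD
						    : PASS_NOT_VALIDATED;
	}

	/* Timeout: the drop is acceptable only if it is visible in stats. */
	if (rx_errors_after > rx_errors_before)
		return PASS_DROPPED_COUNTED;

	return FAIL_LOST;
}
```

The counter is sampled before and after the transmit so that unrelated
earlier errors do not turn a silent loss into a pass.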

What do you think?

Best Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |