lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181116015242.wmm6mooubjnum7qv@gondor.apana.org.au>
Date:   Fri, 16 Nov 2018 09:52:42 +0800
From:   Herbert Xu <herbert@...dor.apana.org.au>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     netdev@...r.kernel.org, Tom Herbert <tom@...bertland.com>,
        Eric Dumazet <edumazet@...gle.com>
Subject: Re: [Patch net] net: invert the check of detecting hardware RX
 checksum fault

On Thu, Nov 15, 2018 at 03:16:02PM -0800, Cong Wang wrote:
> The following evidences indicate this check is likely wrong:
> 
> 1. In the assignment "skb->csum_valid = !sum", sum==0 indicates a valid checksum.
> 
> 2. __skb_checksum_complete() always returns sum, and TCP packets are dropped
>    only when it returns non-zero. So non-zero indicates a failure.
> 
> 3. In __skb_checksum_validate_complete(), we have a nearly same check, where
>    zero is considered as success.
> 
> 4. csum_fold() already does the one’s complement, this indicates 0 should
>    be considered as a successful validation.
> 
> 5. We have triggered this fault for many times, but InCsumErrors field in
>    /proc/net/snmp remains 0.
> 
> Base on the above, non-zero should be used as a checksum mismatch.
> 
> I tested this with mlx5 driver, no warning or InCsumErrors after 1 hour.
> 
> Fixes: fb286bb2990a ("[NET]: Detect hardware rx checksum faults correctly")
> Cc: Herbert Xu <herbert@...dor.apana.org.au>
> Cc: Tom Herbert <tom@...bertland.com>
> Cc: Eric Dumazet <edumazet@...gle.com>
> Signed-off-by: Cong Wang <xiyou.wangcong@...il.com>
> ---
>  net/core/datagram.c | 4 ++--
>  net/core/dev.c      | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/datagram.c b/net/core/datagram.c
> index 57f3a6fcfc1e..e542a9a212a7 100644
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -733,7 +733,7 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
>  	__sum16 sum;
>  
>  	sum = csum_fold(skb_checksum(skb, 0, len, skb->csum));
> -	if (likely(!sum)) {
> +	if (unlikely(sum)) {
>  		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>  		    !skb->csum_complete_sw)
>  			netdev_rx_csum_fault(skb->dev);

Normally if the hardware's partial checksum is valid then we just
trust it and send the packet along.  However, if the partial
checksum is invalid we don't trust it and we will compute the
whole checksum manually which is what ends up in sum.

netdev_rx_csum_fault is meant to warn about the situation where
a packet with a valid checksum (i.e., sum == 0) was given to us
by the hardware with a partial checksum that was invalid.

So changing it to sum here is wrong.

Can you give more information as to how you got the warnings with
mlx5? It sounds like there may be a real bug there because if you
are getting the warning then it means that a packet with an invalid
hardware-computed partial checksum passed the manual check and
was actually valid.  This implies that either the hardware or the
driver is broken.

Cheers,
-- 
Email: Herbert Xu <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ