Date:	Mon, 23 Apr 2007 16:35:34 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org, acme@...hat.com,
	herbert@...dor.apana.org.au
Subject: Re: net-2.6.22 UDP stalls/hangs

On Mon, 23 Apr 2007 15:45:09 -0700 (PDT)
David Miller <davem@...emloft.net> wrote:

> From: Andrew Morton <akpm@...ux-foundation.org>
> Date: Mon, 23 Apr 2007 15:37:14 -0700
> 
> > So I think we did a bit of TCP chatter then no UDP at all?
> > 
> > It's interesting that the test machine can see other people's DNS queries
> > go past.
> 
> It's mysterious alright.
> 
> I can't say that the UDP's are going out corrupted because tcpdump
> seems to decode the DNS queries just fine.  Hmmm, if you're sending
> this out on the broken machine we can't rule out corrupted checksums.
> 
> And if tcpdump doesn't see the UDP replies it means that they aren't
> even reaching the device, let alone the stack.  At least that rules
> out the stack dropping UDP packets for some reason.
> 
> It's possible we've stuffed up some expectation the e1000 driver
> has for TX checksum offload.  Turn off TX checksums with
> "ethtool -K eth0 tx off" and see if that makes the problem
> go away.  Next, try "ethtool -K eth0 rx off".
> 
> I suspect skb_transport_offset() might be wrong for UDP packets
> for some reason, as that's what e1000_tx_csum() in
> drivers/net/e1000/e1000_main.c depends upon.
> 
> Either that or some error in Herbert's recent checksum offload
> handling changes.  In fact I am highly suspicious of the second
> change listed below; you may want to try reverting just that one:

Bingo.

> commit 8952d6c988ec31070732117f353666a4b9a09fea
> Author: Herbert Xu <herbert@...dor.apana.org.au>
> Date:   Mon Apr 9 11:59:39 2007 -0700
> 
>     [NET]: Treat CHECKSUM_PARTIAL as CHECKSUM_UNNECESSARY
>     
>     When a transmitted packet is looped back directly, CHECKSUM_PARTIAL
>     maps to the semantics of CHECKSUM_UNNECESSARY.  Therefore we should
>     treat it as such in the stack.
>     
>     Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
>     Signed-off-by: David S. Miller <davem@...emloft.net>
>
> commit 7f8be19f5a5737ce6ad670756183235c71b560bb
> Author: Herbert Xu <herbert@...dor.apana.org.au>
> Date:   Mon Apr 9 11:59:07 2007 -0700
> 
>     [NET]: Use csum_start offset instead of skb_transport_header
>     
>     The skb transport pointer is currently used to specify the start
>     of the checksum region for transmit checksum offload.  Unfortunately,
>     the same pointer is also used during receive side processing.
>     
>     This creates a problem when we want to retransmit a received
>     packet with partial checksums since the skb transport pointer
>     would be overwritten.
>     
>     This patch solves this problem by creating a new 16-bit csum_start
>     offset value to replace the skb transport header for the purpose
>     of checksums.  This offset is calculated from skb->head so that
>     it does not have to change when skb->data changes.
>     
>     No extra space is required since csum_offset itself fits within
>     a 16-bit word so we can use the other 16 bits for csum_start.
>     
>     For backwards compatibility, just before we push a packet with
>     partial checksums off into the device driver, we set the skb
>     transport header to what it would have been under the old scheme.
>     
>     Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
>     Signed-off-by: David S. Miller <davem@...emloft.net>

Reverting both 8952d6c988ec31070732117f353666a4b9a09fea and
7f8be19f5a5737ce6ad670756183235c71b560bb fixes it.  Reverting only
7f8be19f5a5737ce6ad670756183235c71b560bb also fixes it.
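
For anyone following along, here is a rough userspace sketch of the
csum_start idea as the second changelog describes it: the start of the
checksum region is stored as a 16-bit offset from the buffer head, so it
stays valid when the data pointer moves.  This is not kernel code.
struct toy_skb and the toy_* helpers are hypothetical names made up for
illustration, and only the fields the changelog mentions are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct sk_buff (not the real layout). */
struct toy_skb {
	uint8_t *head;        /* start of the allocated buffer */
	uint8_t *data;        /* current start of packet data; moves as headers are pushed/pulled */
	uint16_t csum_start;  /* offset from head where checksumming begins */
	uint16_t csum_offset; /* offset from csum_start where the checksum is stored */
};

/* Record the checksum region relative to head rather than via a transport pointer. */
static void toy_set_csum(struct toy_skb *skb, uint8_t *transport,
			 uint16_t csum_offset)
{
	assert(transport >= skb->head);
	assert(transport - skb->head <= UINT16_MAX);
	skb->csum_start = (uint16_t)(transport - skb->head);
	skb->csum_offset = csum_offset;
}

/* Where checksumming starts; independent of skb->data. */
static uint8_t *toy_csum_start_ptr(const struct toy_skb *skb)
{
	return skb->head + skb->csum_start;
}

/* Where the computed checksum would be written. */
static uint8_t *toy_csum_field_ptr(const struct toy_skb *skb)
{
	return toy_csum_start_ptr(skb) + skb->csum_offset;
}

/* Checksum start expressed relative to the current data pointer. */
static ptrdiff_t toy_csum_start_from_data(const struct toy_skb *skb)
{
	return toy_csum_start_ptr(skb) - skb->data;
}
```

With a 14-byte Ethernet header and a 20-byte IPv4 header in front of
UDP, the transport header sits 34 bytes from head and the UDP checksum
field is 6 bytes into it.  Moving data around afterwards leaves
toy_csum_start_ptr() unchanged; only the data-relative offset differs.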

Thanks.