netdev - Re: Bug#565404: linux-image-2.6.26-2-amd64: atl1e: TSO is broken


Message-ID: <1264260580.373.77.camel@localhost>
Date:	Sat, 23 Jan 2010 15:29:40 +0000
From:	Ben Hutchings <ben@...adent.org.uk>
To:	Anders Boström <anders@...insight.net>
Cc:	Jie.Yang@...eros.com, netdev@...r.kernel.org,
	565404@...s.debian.org, Xiong.Huang@...eros.com
Subject: Re: Bug#565404: linux-image-2.6.26-2-amd64: atl1e: TSO is broken

On Thu, 2010-01-21 at 17:42 +0100, Anders Boström wrote:
> >>>>> "JY" == Jie Yang <Jie.Yang@...eros.com> writes:
> 
>  >> Have you tested NFS over TCP? The block-size the application
>  >> uses can have an effect on this. What application did you
>  >> use? Block-size?
>  >> 
>  JY> yes, I tested NFS over TCP.
> 
> One strange observation is that I can only reproduce this problem when
> transmitting data from an NFS server using TCP with Atheros
> AR8121/AR8113/AR8114.
> 
> I've tried to reproduce the problem using test programs, like nttcp
> and netpipe, without any success. One observation is that the
> test programs *only* generate 1500-byte IP packets. When
> the NFS server sends data, a sequence of 1500-byte IP packets is
> generated, ending with a shorter packet. And this last packet in the
> sequence has 1500 in the IP header length field, but is shorter.

I ran tcpdump over your packet capture and saw:

13:48:39.122723 00:26:18:ae:69:6d > 00:18:f3:52:22:3f, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32664, offset 0, flags [DF], proto TCP (6), length 1500)
    10.100.0.88.2049 > 10.100.1.25.888: Flags [.], cksum 0x3ebd (correct), seq 21720:23168, ack 157, win 501, options [nop,nop,TS val 152460082 ecr 1212787170], length 1448
13:48:39.122733 00:18:f3:52:22:3f > 00:26:18:ae:69:6d, ethertype IPv4 (0x0800), length 66: (tos 0x0, ttl 64, id 39773, offset 0, flags [DF], proto TCP (6), length 52)
    10.100.1.25.888 > 10.100.0.88.2049: Flags [.], cksum 0x5cfc (correct), ack 23168, win 58293, options [nop,nop,TS val 1212787170 ecr 152460082], length 0
13:48:39.122742 00:26:18:ae:69:6d > 00:18:f3:52:22:3f, ethertype IPv4 (0x0800), length 1462: truncated-ip - 52 bytes missing! (tos 0x0, ttl 64, id 32664, offset 0, flags [DF], proto TCP (6), length 1500)
    10.100.0.88.2049 > 10.100.1.25.888: Flags [.], seq 23168:24616, ack 157, win 501, options [nop,nop,TS val 152460082 ecr 1212787170], length 1448
13:48:39.122747 00:26:18:ae:69:6d > 00:18:f3:52:22:3f, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32666, offset 0, flags [DF], proto TCP (6), length 1500)
    10.100.0.88.2049 > 10.100.1.25.888: Flags [.], cksum 0x33a1 (correct), seq 24564:26012, ack 157, win 501, options [nop,nop,TS val 152460082 ecr 1212787170], length 1448

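Based on the TCP sequence numbers, it seems that the on-wire length of
the broken packet is correct but the total length field in its IP
header is wrong: the next packet starts at seq 24564, which is 1396
bytes after 23168, not the 1448 bytes that the header claims.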
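
The arithmetic behind that conclusion can be checked with a short
sketch. The numbers are taken from the tcpdump output above; the
header sizes assume Ethernet II framing, an IPv4 header without
options, and a TCP header carrying 12 bytes of options (nop, nop,
timestamp), which is what the capture shows. The helper names are
illustrative only.

```python
# Header sizes assumed for this capture.
ETH_HEADER = 14          # Ethernet II, no VLAN tag
IP_HEADER = 20           # IPv4, no options
TCP_HEADER = 20 + 12     # base header + nop,nop,timestamp options


def payload_from_frame(frame_len):
    """TCP payload bytes actually present in a captured frame."""
    return frame_len - ETH_HEADER - IP_HEADER - TCP_HEADER


def payload_from_ip_header(ip_total_len):
    """TCP payload bytes claimed by the IP total length field."""
    return ip_total_len - IP_HEADER - TCP_HEADER


# Values for the packet flagged as truncated-ip in the capture.
broken_seq_start = 23168
next_seq_start = 24564   # first sequence number of the following packet
frame_len = 1462
ip_total_len = 1500

on_wire = payload_from_frame(frame_len)
by_sequence = next_seq_start - broken_seq_start
claimed = payload_from_ip_header(ip_total_len)

print("payload on the wire:     ", on_wire)
print("payload by seq numbers:  ", by_sequence)
print("payload claimed by IP:   ", claimed)
print("bytes missing:           ", claimed - on_wire)
```

The frame length and the sequence numbers agree with each other, and
the difference from the claimed length matches the "52 bytes missing"
reported by tcpdump.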

My understanding is that the length of the TCP payload in a GSO skb must
always be a multiple of the gso_size, so that hardware is not required
to adjust length fields.  So I see several possible explanations:

1. Something generated invalid GSO skbs (unlikely; other hardware should
show the same problem)
2. The driver constructed TSO DMA descriptors for a non-GSO skb
3. The hardware is continuing to apply TSO to packets with non-TSO DMA
descriptors
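
To illustrate why the payload length matters here, the following is a
simplified model of segmentation, not the kernel's or the hardware's
actual code. The function names, the 52-byte combined IP and TCP
header size, and the two-segment payload of 1448 + 1396 bytes are
assumptions chosen to match the sizes in the capture; the real skb
boundaries are not known from the trace.

```python
def segment_ip_lengths(payload_len, mss, header_len=52):
    """Return the IP total length each segment should carry when a
    payload of payload_len bytes is split into chunks of at most mss
    bytes. header_len is the assumed IP + TCP header size."""
    lengths = []
    offset = 0
    while offset < payload_len:
        chunk = min(mss, payload_len - offset)
        lengths.append(header_len + chunk)
        offset += chunk
    return lengths


def is_whole_multiple(payload_len, mss):
    """True if every segment would be exactly mss bytes long."""
    return payload_len % mss == 0


MSS = 1448

# Payload that is an exact multiple of the segment size: every
# segment gets the same IP total length, so no per-segment fix-up
# of the length field is needed.
even_lengths = segment_ip_lengths(2 * MSS, MSS)

# Hypothetical payload matching the capture: one full segment
# followed by a 1396-byte remainder. The last segment needs a
# smaller IP total length than the others.
odd_lengths = segment_ip_lengths(1448 + 1396, MSS)

print(is_whole_multiple(2 * MSS, MSS), even_lengths)
print(is_whole_multiple(1448 + 1396, MSS), odd_lengths)
```

In this model the short final segment should carry an IP total length
of 1448, whereas the broken packet in the capture carries 1500, the
value that is only right for a full-sized segment.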

Ben.

-- 
Ben Hutchings
Any smoothly functioning technology is indistinguishable from a rigged demo.
