Message-ID: <1271091581.16881.41.camel@edumazet-laptop>
Date: Mon, 12 Apr 2010 18:59:41 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: stephen mulcahy <smulcahy@...il.com>
Cc: netdev <netdev@...r.kernel.org>,
Ben Hutchings <ben@...adent.org.uk>,
Ayaz Abdulla <aabdulla@...dia.com>, 572201@...s.debian.org
Subject: Re: forcedeth driver hangs under heavy load
On Monday 12 April 2010 at 17:11 +0100, stephen mulcahy wrote:
> Eric Dumazet wrote:
> > On Monday 12 April 2010 at 14:19 +0100, stephen mulcahy wrote:
> >
> > Do you have any netfilter rules?
> >
>
> Hi Eric,
>
> I don't have any netfilter rules:
>
> root@...e34:~# for table in filter nat mangle raw; do iptables -t $table -L; done
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
> Chain PREROUTING (policy ACCEPT)
> target prot opt source destination
>
> Chain POSTROUTING (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
> Chain PREROUTING (policy ACCEPT)
> target prot opt source destination
>
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
>
> Chain POSTROUTING (policy ACCEPT)
> target prot opt source destination
> Chain PREROUTING (policy ACCEPT)
> target prot opt source destination
>
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
>
>
> I re-ran this on the 2.6.32 kernel (with the 2.6.32 forcedeth module)
> just in case that was screwing something up.
>
> node33 is in the unresponsive state this time. I'm running tcpdump on
> node34. On node33 I try to ssh to node34 (using the IP address of
> node34). I note that I can ping between node33 and node34.
>
> root@...e34:~# tcpdump -v host node34 and node33
> tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96
> bytes
> 17:05:19.622384 IP (tos 0x0, ttl 64, id 21435, offset 0, flags [DF],
> proto TCP (6), length 60)
> node33.webstar.cnet.43653 > node34.ssh: Flags [S], cksum 0xb994
> (correct), seq 1675314077, win 5840, options [mss 1460,sackOK,TS val
> 331814 ecr 0,nop,wscale 7], length 0
> 17:05:19.622754 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
> TCP (6), length 60)
> node34.ssh > node33.webstar.cnet.43653: Flags [S.], cksum 0x9d81
> (correct), seq 1669769379, ack 1675314078, win 5792, options [mss
> 1460,sackOK,TS val 331779 ecr 331814,nop,wscale 7], length 0
> 17:05:19.622813 IP (tos 0x0, ttl 64, id 21436, offset 0, flags [DF],
> proto TCP (6), length 52)
> node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xe2bf
> (correct), ack 1, win 46, options [nop,nop,TS val 331814 ecr 331779],
> length 0
> 17:05:19.627666 IP (tos 0x0, ttl 64, id 47271, offset 0, flags [DF],
> proto TCP (6), length 84)
> node34.ssh > node33.webstar.cnet.43653: Flags [P.], seq 1:33, ack
> 1, win 46, options [nop,nop,TS val 331780 ecr 331814], length 32
> 17:05:19.627748 IP (tos 0x0, ttl 64, id 21437, offset 0, flags [DF],
> proto TCP (6), length 52)
> node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xe29c
> (correct), ack 33, win 46, options [nop,nop,TS val 331816 ecr 331780],
> length 0
> 17:05:19.627833 IP (tos 0x0, ttl 64, id 21438, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum 1f8a (->d189)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
> 23413:23445, ack 2749038625, win 46, options [nop,nop,TS val 331816 ecr
> 331780], length 32
> 17:05:19.831634 IP (tos 0x0, ttl 64, id 21439, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum d189 (->d188)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack
> 33, win 46, options [nop,nop,TS val 331867 ecr 331780], length 32
> 17:05:20.239603 IP (tos 0x0, ttl 64, id 21440, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum 15c6 (->d187)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
> 30492:30524, ack 809893921, win 46, options [nop,nop,TS val 331969 ecr
> 331780], length 32
> 17:05:21.055534 IP (tos 0x0, ttl 64, id 21441, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum d187 (->d186)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack
> 33, win 46, options [nop,nop,TS val 332173 ecr 331780], length 32
> 17:05:22.687386 IP (tos 0x0, ttl 64, id 21442, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum d186 (->d185)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 1:33, ack
> 33, win 46, options [nop,nop,TS val 332581 ecr 331780], length 32
> 17:05:25.950935 IP (tos 0x0, ttl 64, id 21443, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum 15c4 (->d184)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
> 30492:30524, ack 809893921, win 46, options [nop,nop,TS val 333397 ecr
> 331780], length 32
> 17:05:32.478527 IP (tos 0x0, ttl 64, id 21444, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum c01 (->d183)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
> 43997:44029, ack 1311047713, win 46, options [nop,nop,TS val 335029 ecr
> 331780], length 32
> 17:05:45.533370 IP (tos 0x0, ttl 64, id 21445, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum 23d (->d182)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq 3348:3380,
> ack 4054450209, win 46, options [nop,nop,TS val 338293 ecr 331780],
> length 32
> 17:06:08.719187 IP (tos 0x0, ttl 64, id 27660, offset 0, flags [DF],
> proto TCP (6), length 1500, bad cksum 5360 (->b3b3)!)
> node33.webstar.cnet.50060 > node34.35725: Flags [.], seq
> 1203473738:1203475186, ack 1191452767, win 54, options [nop,nop,TS val
> 344089 ecr 256770], length 1448
> 17:06:11.643080 IP (tos 0x0, ttl 64, id 21446, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum e4f2 (->d181)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
> 47331:47363, ack 4110811169, win 46, options [nop,nop,TS val 344821 ecr
> 331780], length 32
> 17:06:13.715233 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
> node34 tell node33.webstar.cnet, length 46
> 17:06:13.715257 ARP, Ethernet (len 6), IPv4 (len 4), Reply node34 is-at
> 00:30:48:f0:06:72 (oui Unknown), length 28
> 17:07:03.866492 IP (tos 0x0, ttl 64, id 21447, offset 0, flags [DF],
> proto TCP (6), length 84, bad cksum b413 (->d180)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [P.], seq
> 28939:28971, ack 1913782305, win 46, options [nop,nop,TS val 357877 ecr
> 331780], length 32
> 17:07:08.862055 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has
> node34 tell node33.webstar.cnet, length 46
> 17:07:08.862370 ARP, Ethernet (len 6), IPv4 (len 4), Reply node34 is-at
> 00:30:48:f0:06:72 (oui Unknown), length 28
> 17:07:19.627910 IP (tos 0x0, ttl 64, id 47272, offset 0, flags [DF],
> proto TCP (6), length 52)
> node34.ssh > node33.webstar.cnet.43653: Flags [F.], cksum 0x6d6b
> (correct), seq 33, ack 1, win 46, options [nop,nop,TS val 361780 ecr
> 331816], length 0
> 17:07:19.628403 IP (tos 0x0, ttl 64, id 21448, offset 0, flags [DF],
> proto TCP (6), length 844, bad cksum aa4d (->ce87)!)
> node33.webstar.cnet.43653 > node34.ssh: Flags [FP.], seq
> 20399:21191, ack 2356871202, win 46, options [nop,nop,TS val 361818 ecr
> 361780], length 792
> 17:07:19.833456 IP (tos 0x0, ttl 64, id 47273, offset 0, flags [DF],
> proto TCP (6), length 52)
> node34.ssh > node33.webstar.cnet.43653: Flags [F.], cksum 0x6d37
> (correct), seq 33, ack 1, win 46, options [nop,nop,TS val 361832 ecr
> 331816], length 0
> 17:07:19.833517 IP (tos 0x0, ttl 64, id 21449, offset 0, flags [DF],
> proto TCP (6), length 64)
> node33.webstar.cnet.43653 > node34.ssh: Flags [.], cksum 0xa5e9
> (correct), ack 34, win 46, options [nop,nop,TS val 361870 ecr
> 361832,nop,nop,sack 1 {33:34}], length 0
>
> At this point, I see a "Connection closed by 10.141.0.34" message on
> node33 (from where I am attempting to ssh).
>
> Again, if I run ifdown and then ifup on node33, I can then reach node34
> from node33 without problems.
>
OK, it seems forcedeth has a problem with checksums.
Could you check the current offload settings with "ethtool -k eth0" and
then try disabling TSO and TX checksum offload?
ethtool -K eth0 tso off tx off
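
For reference, the "bad cksum X (->Y)" annotations in the trace above are
tcpdump's check of the IPv4 header checksum: X is the value carried in the
packet and Y is the value tcpdump computed itself. Below is a minimal Python
sketch of that computation, rebuilding the 20-byte header of two of the bad
packets from the fields printed in the trace. The source address 10.141.0.33
for node33 is an assumption (only 10.141.0.34 for node34 appears in this
thread), and the helper names are purely illustrative.

```python
import socket
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Ones' complement of the ones' complement sum of the 16-bit header words.

    The checksum field inside `header` must be zero when calling this.
    """
    if len(header) % 2:
        raise ValueError("IPv4 header length must be a multiple of 2 bytes")
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(ip_id, total_length=84,
                 src="10.141.0.33",   # ASSUMED address of node33
                 dst="10.141.0.34"):  # node34, as reported in the thread
    """Build an option-less IPv4 header with the fields shown by tcpdump:
    tos 0x0, flags [DF], offset 0, ttl 64, proto TCP (6), checksum zeroed."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,            # version 4, header length 5 words
        0x00,            # tos
        total_length,    # "length 84" in the trace
        ip_id,           # "id 21438", "id 21439", ...
        0x4000,          # DF set, fragment offset 0
        64,              # ttl
        6,               # protocol: TCP
        0,               # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


if __name__ == "__main__":
    for ip_id in (21438, 21439):
        print(ip_id, format(ipv4_header_checksum(build_header(ip_id)), "x"))
```

With the assumed addresses this gives d189 for id 21438 and d188 for id
21439, which agrees with the values tcpdump prints after "->" for those two
packets.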