Date:	Tue, 28 Oct 2014 08:36:11 -0700
From:	Tom Herbert <therbert@...gle.com>
To:	Or Gerlitz <ogerlitz@...lanox.com>
Cc:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	John Fastabend <john.r.fastabend@...el.com>,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>
Subject: Re: some failures with vxlan offloads..

On Tue, Oct 28, 2014 at 8:27 AM, Or Gerlitz <ogerlitz@...lanox.com> wrote:
> On 10/27/2014 3:23 AM, Tom Herbert wrote:
>>
>> On Sun, Oct 26, 2014 at 3:23 PM, Or Gerlitz <gerlitz.or@...il.com> wrote:
>>>
>>> On Sun, Oct 26, 2014 at 5:29 PM, Tom Herbert <therbert@...gle.com> wrote:
>>>>
>>>> Can you determine what the TSO HW engine is setting in UDP checksum
>>>> field? tcpdump -vv might be able to show this. The symptoms seem to
>>>> indicate that it may not be zero.
>>>
>>> Thanks for the quick response. I'll check what is placed in the UDP
>>> checksum field for packets that went through the offloading HW and let
>>> you know.
>>>
>>> BTW, following the direction you proposed, I wonder why this works
>>> (i.e. the kernel doesn't drop the encapsulated TCP packets) when both
>>> sides are offloaded?
>>>
>> I'm just speculating, but the device may be returning checksum unnecessary
>> for the UDP checksum without actually checking it. Technically, VXLAN
>> RFC7348 allows an implementation to ignore the UDP checksum, although this
>> clearly violates RFC1122 UDP checksum
>> requirements. In the stack we now checksum all non-zero checksums
>> including UDP checksum in VXLAN if it's not marked checksum-unnecessary.
>
>
> OK, I found something (it's always a bad habit to try and potentially blame
> someone else for your bugs...) -- as I wrote here earlier, the current HW
> doesn't support checksum generation for both the inner (say TCP) and outer
> (UDP) packet (and indeed we don't advertise SKB_GSO_UDP_TUNNEL_CSUM).
>
> So if we tell the HW to offload the inner TCP checksum we must **not** tell
> it to attempt to offload the outer checksum too, and I wrongly did
> that... once I stopped doing so, I got mixed configurations (one side
> offloaded, the peer not offloaded) to work. I will submit an mlx4 fix for that.
>
> I wonder if we have another bug somewhere... when both sides were offloaded,
> it works even with the mlx4 bug, can you explain that? Is it possible that the
> GRO stack somehow covers up the bug when both sides are offloaded and
> GRO/VXLAN comes into play?
>
Look at the receive side. As I mentioned, if the device is returning
checksum-unnecessary and setting csum_level to 1 (inner checksum was
validated) then the stack won't try to verify the outer checksum. So in
this case, if the outer checksum is incorrect, nobody complains about it.


> Or.
>
> after the fix, packets sent by the offloaded side (192.168.31.17) carry a
> zero UDP checksum
>
> 17:20:44.445866 IP (tos 0x0, ttl 64, id 61275, offset 0, flags [DF], proto
> UDP (17), length 1500)
>     192.168.31.17.45387 > 192.168.31.18.4789: [no cksum] UDP, length 1472
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  05dc ef5b 4000 4011 8641 c0a8 1f11 c0a8
>         0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 05aa bb84 4000 4006 9055 c0a8 3411
>         0x0050:  c0a8 3412 e479 9116 553a e008 f28e 6268
>         0x0060:  5010 0038 88e3 0000 6600 6e65 7470 6572
>         0x0070:  6600 6e65 7470 6572 6600
> 17:20:44.445871 IP (tos 0x0, ttl 64, id 61276, offset 0, flags [DF], proto
> UDP (17), length 1500)
>     192.168.31.17.45387 > 192.168.31.18.4789: [no cksum] UDP, length 1472
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  05dc ef5c 4000 4011 8640 c0a8 1f11 c0a8
>         0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 05aa bb85 4000 4006 9054 c0a8 3411
>         0x0050:  c0a8 3412 e479 9116 553a e58a f28e 6268
>         0x0060:  5010 0038 7afc 0000 6e65 7470 6572 6600
>         0x0070:  6e65 7470 6572 6600 6e65
>
>
> before the fix, packets sent by the offloaded side (192.168.31.17) carry a
> junk UDP checksum
>
> Also note that on one of the packets sent by the offloaded part, we don't
> see the "bad udp cksum" scream from tcpdump, which is weird...
>
> 17:03:08.765845 IP (tos 0x0, ttl 64, id 52396, offset 0, flags [DF], proto
> UDP (17), length 746)
>     192.168.31.17.56686 > 192.168.31.18.4789: UDP, length 718
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  02ea ccac 4000 4011 abe2 c0a8 1f11 c0a8
>         0x0020:  1f12 dd6e 12b5 02d6 0c1b 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 02b8 6357 4000 4006 eb74 c0a8 3411
>         0x0050:  c0a8 3412 86f2 3241 d67e f47d e5a9 d041
>         0x0060:  5018 0038 871c 0000 0000 0258 ffff ffff
>         0x0070:  0000 0000 0000 0000 0000
> 17:03:09.336285 IP (tos 0x0, ttl 64, id 52536, offset 0, flags [DF], proto
> UDP (17), length 90)
>     192.168.31.17.56686 > 192.168.31.18.4789: [bad udp cksum 1360!] UDP,
> length 62
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  005a cd38 4000 4011 ade6 c0a8 1f11 c0a8
>         0x0020:  1f12 dd6e 12b5 0046 139a 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 0028 6358 4000 4006 ee03 c0a8 3411
>         0x0050:  c0a8 3412 86f2 3241 d67e f70d e5a9 d041
>         0x0060:  5011 0038 897b 0000
> 17:03:10.335074 IP (tos 0x0, ttl 64, id 40045, offset 0, flags [DF], proto
> UDP (17), length 98)
>     192.168.31.18.48861 > 192.168.31.17.4789: [no cksum] UDP, length 70
>         0x0000:  0002 c9e9 bf32 f452 1401 da82 0800 4500
>         0x0010:  0062 9c6d 4000 4011 dea9 c0a8 1f12 c0a8
>         0x0020:  1f11 bedd 12b5 004e 0000 0800 0000 0000
>         0x0030:  6300 7a83 2ecb 8c68 b2c7 81db e850 0800
>         0x0040:  4500 0030 0000 4000 4006 5154 c0a8 3412
>         0x0050:  c0a8 3411 3241 86f2 e5a9 d040 d67e f47d
>         0x0060:  7012 6e28 f282 0000 0204 0582 0103 0307
> 17:03:10.335110 IP (tos 0x0, ttl 64, id 52764, offset 0, flags [DF], proto
> UDP (17), length 90)
>     192.168.31.17.56686 > 192.168.31.18.4789: [bad udp cksum 1360!] UDP,
> length 62
>         0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500
>         0x0010:  005a ce1c 4000 4011 ad02 c0a8 1f11 c0a8
>         0x0020:  1f12 dd6e 12b5 0046 139a 0800 0000 0000
>         0x0030:  6300 b2c7 81db e850 7a83 2ecb 8c68 0800
>         0x0040:  4500 0028 6359 4000 4006 ee02 c0a8 3411
>         0x0050:  c0a8 3412 86f2 3241 d67e f70e e5a9 d041
>         0x0060:  5010 0038 897b 0000
>
>
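
For anyone who wants to read the outer UDP checksum straight out of hex dumps
like the ones above, here is a small Python sketch. The helper names
(parse_hexdump, outer_udp_checksum) are hypothetical, and it assumes an
Ethernet II frame carrying IPv4, which matches these captures. The rows are
the first three hex lines of one packet after the fix and one before it.

```python
def parse_hexdump(rows):
    """Turn tcpdump hex rows ('0x0010:  05dc ef5b ...') into raw bytes."""
    data = bytearray()
    for row in rows:
        _, _, hexpart = row.partition(":")          # drop the '0x0010' offset
        data += bytes.fromhex("".join(hexpart.split()))
    return bytes(data)


def outer_udp_checksum(frame):
    """Read the outer UDP checksum, assuming Ethernet II (14 bytes) + IPv4."""
    ihl = (frame[14] & 0x0F) * 4                    # IPv4 header length in bytes
    udp = 14 + ihl                                  # start of the UDP header
    return int.from_bytes(frame[udp + 6:udp + 8], "big")


after_fix = [
    "0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500",
    "0x0010:  05dc ef5b 4000 4011 8641 c0a8 1f11 c0a8",
    "0x0020:  1f12 b14b 12b5 05c8 0000 0800 0000 0000",
]
before_fix = [
    "0x0000:  f452 1401 da82 0002 c9e9 bf32 0800 4500",
    "0x0010:  02ea ccac 4000 4011 abe2 c0a8 1f11 c0a8",
    "0x0020:  1f12 dd6e 12b5 02d6 0c1b 0800 0000 0000",
]

print(hex(outer_udp_checksum(parse_hexdump(after_fix))))   # 0x0
print(hex(outer_udp_checksum(parse_hexdump(before_fix))))  # 0xc1b
```

The checksum sits at bytes 40-41 of the frame here (14 bytes of Ethernet, 20
of IPv4, then 6 bytes into the UDP header), which is the fifth 16-bit group
on the 0x0020 row of each dump.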