lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d3fe52fa-ae42-f818-4494-8352323cc489@microchip.com>
Date:   Mon, 4 Jun 2018 17:05:24 +0200
From:   Nicolas Ferre <nicolas.ferre@...rochip.com>
To:     Jennifer Dahm <jennifer.dahm@...com>, <netdev@...r.kernel.org>,
        "David S . Miller" <davem@...emloft.net>
CC:     Nathan Sullivan <nathan.sullivan@...com>,
        Rafal Ozieblo <rafalo@...ence.com>,
        Claudiu Beznea <Claudiu.Beznea@...rochip.com>,
        Harini Katakam <harini.katakam@...inx.com>
Subject: Re: [RFC PATCH 0/2] net: macb: Disable TX checksum offloading on all
 Zynq

Jennifer,

On 25/05/2018 at 23:44, Jennifer Dahm wrote:
> During testing, I discovered that the Zynq GEM hardware overwrites all
> outgoing UDP packet checksums, which is illegal in packet forwarding
> cases. This happens both with and without the checksum-zeroing
> behavior  introduced  in  007e4ba3ee137f4700f39aa6dbaf01a71047c5f6
> ("net: macb: initialize checksum when using checksum offloading"). The
> only solution to both the small packet bug and the packet forwarding
> bug that I can find is to disable TX checksum offloading entirely.

Are the bugs listed above present in all revisions of the GEM IP, only 
for some revisions?
Is there an errata that describe this issue for the Zynq GEM?


> There's still the possibility that these bugs are actually with the
> driver software and not with the hardware. I've found several places
> where the checksum is set to 0xFFFF (the incorrect checksum found in
> small packets) when something goes wrong, and I can imagine a buggy
> driver writing over the checksum blindly when TX checksum offloading
> is enabled.
> 
> I would like feedback on two things:
> 1. Is it possible that the two bugs described above are caused by the
>     driver and not by the hardware? If so, where should I look to
>     implicate the driver?

Rafal,
Do you want to comment on this part?

> 2. Is this a problem we care enough about to completely disable TX
>     checksum offloading?

I would prefer to keep it enabled if possible. I mean I'm not against 
such a workaround if the HW is not able to handle such cases but 
offloading a CPU is always good when we have "low end" processors like 
ARM9 at 200MHz like the older AT91 that use this driver...

> Here is the testing procedure I used to reproduce these bugs on my
> machine. Specifically, without this patchset, step 9 fails. Without
> 007e4ba3ee, step 8 also fails.
> 
> 1. Set up the test environment:
>    a. Acquire a Zynq device with two ethernet ports. This is the DUT.
>    b. Acquire a USB-Ethernet adapter.
>    c. Acquire two ethernet cables.
>    d. Connect one Ethernet port on the DUT to your computer's network
>       switch.
>    e. Connect the other Ethernet port to the USB-Ethernet adapter and
>       plug that adapter into your computer.
>    f. Set up a Linux VM to send packets through the DUT. I recommend
>       using a VM here so that you can easily detach it from the primary
>       network to force outgoing traffic through the DUT.
>    g. Set up a computer with a packet inspecting program to receive and
>       inspect packets. This doesn't need to be a VM. For the purposes
>       of this test, I'll be using a Windows instance with WireShark.
> 2. Load the kernel you want to test onto the DUT, making sure to
>     include the `bridge` module.
> 3. Set up a bridge on the DUT. The following commands on the DUT
>     should work, replacing `eth0` and `eth1` with the two ethernet
>     interfaces on the DUT:
>     ```
>     brctl addbr test
>     brctl addif test eth0 eth1
>     ifconfig eth0 0.0.0.0
>     ifconfig eth1 0.0.0.0
>     dhclient test -v
>     ```
> 4. Disconnect the Linux VM from your host computer's network and
>     connect it to the USB-Ethernet adapter in order to force outgoing
>     network traffic through the DUT. If necessary, run dhclient on the
>     Linux VM to acquire an IP address.
> 5. Ensure that you can reach your Windows instance from your Linux VM
>     through the DUT (e.g. ping).
> 6. Start WireShark on your Windows instance and start monitoring
>     traffic on a specific, unused port (e.g. 61557).
> 7. Using netcat, send a few not-tiny UDP packets from your Linux VM to
>     your Windows instance to ensure that valid UDP packets are properly
>     forwarded. Ex:
>     ```
>     echo "hello world" | netcat -u <WindowsIP> 61557
>     ```
>     Inspect these packets to ensure that the data arrived intact and
>     that the checksum looks reasonable (i.e. not 0x0000 or 0xFFFF).
> 8. Using netcat, send a few tiny UDP packets (2 bytes or fewer) from
>     Linux VM to your Windows instance to ensure that the checksum is
>     reasonable. Ex:
>     ```
>     echo "h" | netcat -u <WindowsIP> 61557
>     ```
> 9. Using a custom program, send UDP packets with broken checksums
>     (e.g. 0xABCD) from your Linux VM to your Windows instance. Inspect
>     these packets with WireShark and make sure that the packet arrived
>     with the same checksum you sent it with.
> 
> For step 9, I wrote a C program using the Linux socket API that will
> send a properly formatted UDP packet with the payload "Hello!" and a
> (broken) checksum of 0xABCD to port 61557 on the host provided at the
> command line. I can send the full program if you would like, but here
> is the important part of it:
> ```
> struct custom_udp {
> 	int16_t s_port;
> 	int16_t d_port;
> 	int16_t length;
> 	int16_t check;
> 	char data[];
> };
> 
> int send_message(int sockfd, in_port_t port, const char *message) {
> 	struct custom_udp *frame;
> 	int16_t message_len;
> 	int16_t frame_len;
> 	int ret;
> 
> 	message_len = strlen(message) * sizeof(char);
> 	frame_len = sizeof(struct custom_udp) + message_len;
> 	frame = malloc(frame_len);
> 	frame->s_port = htons(0);
> 	frame->d_port = htons(port);
> 	frame->length = htons(frame_len);
> 	frame->check = htons(0xABCD);
> 	memmove(frame->data, message, message_len);
> 
> 	ret = write(sockfd, frame, frame_len);
> 	free(frame);
> 
> 	return ret;
> }

Thanks a lot for the detailed testing procedure. It will definitively 
help us reproduce the bug.

As a conclusion, I would say that we can use this patch as a last resort 
if we are sure that:
1/ the driver doesn't do weird things with TX CSUM in the code
2/ the hw is unable to handle these particular cases.


Best regards,
   Nicolas


> ```
> 
> Jennifer Dahm (1):
>    net/macb: Disable TX checksum offloading on all Zynq-7000
> 
>   drivers/net/ethernet/cadence/macb.h      |  1 +
>   drivers/net/ethernet/cadence/macb_main.c | 11 ++++++++---
>   2 files changed, 9 insertions(+), 3 deletions(-)
> 


-- 
Nicolas Ferre

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ