linux-kernel - Re: [PATCH net] r8152: Fix broken RX checksums.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1f847ae0-4928-01e7-f1e7-3cbc37529961@pobox.com>
Date:   Sun, 30 Oct 2016 22:07:25 -0400
From:   Mark Lord <mlord@...ox.com>
To:     David Miller <davem@...emloft.net>
Cc:     nic_swsd@...ltek.com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] r8152: Fix broken RX checksums.

On 16-10-30 08:57 PM, David Miller wrote:
> From: Mark Lord <mlord@...ox.com>
> Date: Sun, 30 Oct 2016 19:28:27 -0400
> 
>> The r8152 driver has been broken since (approx) 3.16.xx
>> when support was added for hardware RX checksums
>> on newer chip versions.  Symptoms include random
>> segfaults and silent data corruption over NFS.
>>
>> The hardware checksum logig does not work on the VER_02
>> dongles I have here when used with a slow embedded system CPU.
>> Google reveals others reporting similar issues on Raspberry Pi.
>>
>> So, disable hardware RX checksum support for VER_02, and fix
>> an obvious coding error for IPV6 checksums in the same function.
>>
>> Because this bug results in silent data corruption,
>> it is a good candidate for back-porting to -stable >= 3.16.xx.
>>
>> Signed-off-by: Mark Lord <mlord@...ox.com>
> 
> Applied and queued up for -stable, thanks.

Thanks.  Now that this is taken care of, I do wonder if perhaps
RX checksums ought to be enabled at all for ANY versions of this chip?

My theory is that the checksums probably work okay most of the time,
except when the hardware RX buffer overflows.

In my case, and in the case of the Raspberry Pi, the receiving CPU
is quite a bit slower than mainstream x86, so it can quite easily
fall behind in emptying the RX buffer on the chip.
The only indication this has happened may be an incorrect RX checksum.

This is only a theory, but I otherwise have trouble explaining
why we are seeing invalid RX checksums -- direct cable connections
to a switch, shared only with the NFS server.  No reason for it
to have bad RX checksums in the first place.

Should we just blanket disable RX checksums for all versions here
unless proven otherwise/safe?

Anyone out there know better?

Cheers
Mark