lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 24 Nov 2016 07:31:17 -0500
From:   Mark Lord <mlord@...ox.com>
To:     Hayes Wang <hayeswang@...ltek.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc:     nic_swsd <nic_swsd@...ltek.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>
Subject: Re: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable

On 16-11-23 02:29 PM, Mark Lord wrote:
> On 16-11-23 10:12 AM, Hayes Wang wrote:
>> Mark Lord [mlord@...ox.com]
>> [...]
>>> What does this code do:
>>
>>>> static void r8153_set_rx_early_size(struct r8152 *tp)
>>>> {
>>>>        u32 mtu = tp->netdev->mtu;
>>>>        u32 ocp_data = (agg_buf_sz - mtu - VLAN_ETH_HLEN - VLAN_HLEN) / 4;
>>>>
>>>>        ocp_write_word(tp, MCU_TYPE_USB, USB_RX_EARLY_SIZE, ocp_data);
>>>> }
>>
>> This only works for RTL8153. However, what you use is RTL8152.
>> It is like delay completion. It is used to reduce the loading of CPU
>> by letting a transfer contain more data to reduce the number of
>> transfers.
>>
>>> How is ocp_data used by the hardware?
>>> Shouldn't the calculation also include sizeof(rx_desc) in there somewhere?
>>
>> The algorithm is from our hw engineers, and it should be
>>
>>    (agg_buf_sz - packet size) / 8
>>
>> You could refer to commit a59e6d815226 ("r8152: correct the rx early size").
>
> Thanks.
>
> Right now I am working quite hard trying to narrow things down exactly.
> You are correct that the driver does appear to be careful about accesses
> beyond the filled portion of a URB buffer -- for some reason I thought
> the original driver had issues there, but looking again it does not seem to.
>
> One idea that is now looking more likely:
> Things could be suffering from speculative CPU accesses to RAM
> (the system here has non-coherent d-cache/RAM).
> This could incorrectly pre-load data from adjacent URB buffers
> into the d-cache, creating coherency issues.  I am testing now
> with cacheline-sized guard zones between the buffers to see if
> that is the issue or not.

Nope.  Guard zones did not fix it, so it's probably not a prefetch issue.
Oddly, adding a couple of memory barriers to specific places in the driver
does help, A LOT.  Still not 100%, but it did pass 1800 reboot tests over night
with only three bad rx_desc's reported.

That's a new record here for the driver using kmalloc'd buffers,
and put reliability on par with using non-cacheable buffers.

Any way we look at it though, the chip/driver are simply unreliable,
and relying upon hardware checksums (which fail due to the driver
looking at garbage rather than the checksum bits) leads to data corruption.

Cheers



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ