netdev - Re: [RFC PATCH v1] net: ethernet: nb8800: Reset HW block in ndo

Message-ID: <93bf2921-ac46-a11b-bd5d-d256123cc86a@free.fr>
Date:   Sun, 30 Jul 2017 00:48:59 +0200
From:   Mason <slash.tmp@...e.fr>
To:     Florian Fainelli <f.fainelli@...il.com>,
        Mans Rullgard <mans@...sr.com>
Cc:     Marc Gonzalez <marc_gonzalez@...madesigns.com>,
        "David S. Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [RFC PATCH v1] net: ethernet: nb8800: Reset HW block in ndo_open

On 29/07/2017 22:15, Florian Fainelli wrote:

> On 07/29/2017 05:44 AM, Mason wrote:
>
>> We tested 4 switches, and DHCP failed on 3 of them.
>> Disabling pause frames "fixed" that.
> 
> OK, so it is this problem that you reported about before?

The "Ethernet flow control / pause frames" issue
is separate from the "link down wedges RX" issue.

We discussed the former back in November 2016:

https://www.mail-archive.com/netdev@vger.kernel.org/msg137094.html
https://patchwork.ozlabs.org/patch/694577/

Wait a second... I see that you and Mans had the
following exchange:

https://www.mail-archive.com/netdev@vger.kernel.org/msg138175.html

Mans mentions disabling DMA to be able to change
the flow control bits. The current theory is that
it is disabling DMA in ndo_stop that wedges RX.

So maybe the two issues are related after all...

I hate all these hardware quirks. Why can't HW
engineers make stuff that "just works"...

> Pause frames are tricky in that receiving pause frames means you
> should backpressure your transmitter and sending pause frames happens
> when your receiver cannot keep up. It is somewhat conceivable that
> your HW implementation is bogus and that you can get the HW in a
> state where it gets permanently backpressured for instance? And then
> only a full re-init would get you out of this stuck state presumably?
> Are there significant differences at the DMA/Ethernet controller
> level between Tango 3 (is that the one Mans worked on?) and Tango 4
> for instance that could explain a behavioral difference?
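
For reference, the pause mechanism described above can be sketched in a
few lines of C. This is only an illustration of the IEEE 802.3x frame
layout and timing, not nb8800 driver code: the helper names
build_pause_frame() and pause_duration_ns() are made up for this example,
and nothing here touches the hardware registers discussed in this thread.
A receiver that cannot keep up sends a PAUSE frame with a non-zero
quanta value; the peer should then hold off transmitting for that many
quanta (one quantum is 512 bit times). A quanta value of 0 tells the
peer it may resume immediately.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAUSE_FRAME_LEN 60 /* minimum Ethernet frame size, FCS excluded */

/*
 * Fill buf (at least PAUSE_FRAME_LEN bytes) with an 802.3x PAUSE frame.
 * Hypothetical helper for illustration only.
 */
static size_t build_pause_frame(uint8_t *buf, const uint8_t src_mac[6],
				uint16_t quanta)
{
	/* Reserved multicast address for MAC control frames */
	static const uint8_t dst_mac[6] = {
		0x01, 0x80, 0xc2, 0x00, 0x00, 0x01
	};

	memset(buf, 0, PAUSE_FRAME_LEN);	/* zero padding */
	memcpy(buf, dst_mac, 6);
	memcpy(buf + 6, src_mac, 6);
	buf[12] = 0x88;				/* EtherType 0x8808: */
	buf[13] = 0x08;				/* MAC control */
	buf[14] = 0x00;				/* opcode 0x0001: */
	buf[15] = 0x01;				/* PAUSE */
	buf[16] = (uint8_t)(quanta >> 8);	/* pause time, big-endian */
	buf[17] = (uint8_t)(quanta & 0xff);

	return PAUSE_FRAME_LEN;
}

/*
 * How long the peer is asked to stay quiet, in nanoseconds.
 * One quantum is 512 bit times, so the duration depends on link speed.
 */
static uint64_t pause_duration_ns(uint16_t quanta, unsigned int link_mbps)
{
	return (uint64_t)quanta * 512u * 1000u / link_mbps;
}
```

With the maximum quanta value (0xffff) on a gigabit link, the requested
pause is about 33.5 ms. A transmitter that keeps receiving such frames,
or whose MAC never leaves the paused state, would look "permanently
backpressured" in the sense described above.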

I'll have to take a look at the issue in light of
the new information. FWIW, Mans has tango3&4 boards.
I work on newer boards. The HW dev *swears* there
have been no functional differences in the eth block
"forever". However, bus accesses are faster in recent
chips, which could change who wins specific races.

Regards.