netdev - Re: Realtek 8139 problem on 486.

Message-ID: <60BEA6CF.9080500@gmail.com>
Date:   Tue, 08 Jun 2021 02:07:59 +0300
From:   Nikolai Zhubr <zhubr.2@...il.com>
To:     Arnd Bergmann <arnd@...nel.org>
CC:     Heiner Kallweit <hkallweit1@...il.com>,
        netdev <netdev@...r.kernel.org>
Subject: Re: Realtek 8139 problem on 486.

Hi Arnd,

02.06.2021 12:12, Arnd Bergmann:
[...]
> I think the easiest workaround to address this reliably would be to move all
> the irq processing into the poll function. This way the interrupt is completely
> masked in the device until the poll handler finishes, and unmasking it while
> there are pending events would reliably trigger a new irq regardless of level
> or edge mode. Something like the untested change at
> https://pastebin.com/MhBJDt6Z .
> I don't know of other drivers that do it like this though, so I'm not
> sure if this causes a different set of problems.
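
The reasoning in the quoted suggestion can be illustrated with a toy model.
This is only a sketch under stated assumptions, not the 8139too driver: the
ToyNic and EdgePic classes, the bit values and the event ordering are all
hypothetical. It assumes an edge-triggered controller that only notices a
low-to-high transition of the irq line.

```python
RX, TX = 0x01, 0x04  # illustrative status bits for the toy model


class ToyNic:
    """Toy device: the irq line is high while any unmasked status bit is pending."""

    def __init__(self):
        self.status = 0     # pending event bits
        self.mask = 0xFF    # enabled interrupt sources

    def line(self):
        return (self.status & self.mask) != 0


class EdgePic:
    """Toy edge-triggered controller: counts low-to-high transitions only."""

    def __init__(self, nic):
        self.nic = nic
        self.last = nic.line()
        self.irqs = 0

    def sample(self):
        now = self.nic.line()
        if now and not self.last:
            self.irqs += 1
        self.last = now


def ack_only():
    """Handler acks RX while TX arrives: the line never drops, so no new edge."""
    nic = ToyNic()
    pic = EdgePic(nic)
    nic.status |= RX; pic.sample()    # first irq is delivered
    nic.status |= TX; pic.sample()    # TX completes before RX is acked
    nic.status &= ~RX; pic.sample()   # line stays high because of TX
    return pic.irqs, nic.status


def mask_until_poll_done():
    """Mask everything first, process in poll, unmask at the end."""
    nic = ToyNic()
    pic = EdgePic(nic)
    nic.status |= RX; pic.sample()    # first irq is delivered
    nic.mask = 0; pic.sample()        # masking pulls the line low
    nic.status |= TX; pic.sample()    # TX arrives while masked
    nic.status &= ~RX; pic.sample()   # poll handler acks RX
    nic.mask = 0xFF; pic.sample()     # unmask with TX pending: rising edge
    return pic.irqs, nic.status


print(ack_only())              # TX left pending with no second irq
print(mask_until_poll_done())  # TX left pending, but a second irq fires
```

In the first variant the TX event stays pending with nobody notified; in the
second, unmasking produces the edge that gets it handled.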

I started applying your patch (adapting it a little so that it fits
into 4.14 in a minimally invasive manner) and then noticed that it
probably won't work as intended. If I'm not mistaken, the rx poll
function is only active during "rx bursts", so it is not guaranteed to
keep running when there is little or no rx input. I'd suppose some
additional work item or thread would have to be introduced for such an
approach to be implemented reliably.

Meanwhile, besides the lost tx irq issue, I've apparently identified an
rx overrun issue. According to tinymembench, the raw RAM performance of
this system is roughly 15-30 Mbytes/s at best, so it is close to the
100 Mbit wire speed (12.5 Mbytes/s). Tracing NFS over UDP operation
(client side), I've found that of 2 full-sized incoming NFS/UDP packets
the second one will always be lost, as confirmed by a rapid increase of
the iface err counter. More specifically, I've found that a pair of
packets sized 1500+700 can still be successfully accepted, but
1500+1500 never can. Apparently the 8139 has very little built-in RAM,
so it relies on packets being moved into main RAM fast enough. It turned
out though that just adding rsize=1024 allows NFS to work quite well,
with only occasional small pauses. Also, apparently TCP/IP somehow
recovers/autotunes itself, so it just works fine. I suppose this overrun
problem cannot be fixed in a general way (other than forcing a downgrade
of the link speed to 10 Mbit), as AFAIK there are no provisions in
ethernet to request e.g. extra delays between packets. What might be
useful though is printing a line to dmesg suggesting to somehow limit
the incoming flow. Such a hint in dmesg would have saved me quite some
time.
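
To put rough numbers on the above, here is a back-of-envelope sketch. It
uses only the figures already mentioned (100 Mbit link, 15-30 Mbytes/s from
tinymembench, 1500-byte packets); the function names are made up for
illustration, and preamble, headers and inter-frame gap are ignored. It
deliberately assumes nothing about the actual size of the 8139's internal
buffer.

```python
WIRE_BITS_PER_S = 100_000_000                  # 100 Mbit/s link
SLOW_WIRE_BITS_PER_S = 10_000_000              # 10 Mbit/s fallback
RAM_BYTES_PER_S = (15_000_000, 30_000_000)     # tinymembench range quoted above


def wire_bytes_per_s(bits_per_s):
    """Wire speed expressed in bytes per second."""
    return bits_per_s // 8


def frame_time_us(nbytes, bits_per_s):
    """Microseconds that nbytes occupy on the wire (framing overhead ignored)."""
    return nbytes * 8 * 1_000_000 / bits_per_s


def headroom(ram_bytes_per_s, bits_per_s):
    """How many times faster raw RAM is than the incoming byte stream."""
    return ram_bytes_per_s / wire_bytes_per_s(bits_per_s)


print(wire_bytes_per_s(WIRE_BITS_PER_S))             # bytes/s at 100 Mbit
print(frame_time_us(1500, WIRE_BITS_PER_S))          # us per 1500 bytes at 100 Mbit
print(frame_time_us(1500, SLOW_WIRE_BITS_PER_S))     # us per 1500 bytes at 10 Mbit
print([headroom(r, WIRE_BITS_PER_S) for r in RAM_BYTES_PER_S])
```

So at 100 Mbit a second 1500-byte packet follows the first after about 120
microseconds, while raw RAM bandwidth is only about 1.2x to 2.4x the wire
rate, and that is before the CPU touches the data at all. At 10 Mbit the
same packet takes about 1200 microseconds, which leaves ten times more slack.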

Anyway, for now I've got it working quite well (with a re-added busy
loop and rsize=1024). I'm going to look at the elcr_set_level_irq
approach later, but it looks quite complicated. If there is something
else I can test while at it, please let me know.

Thank you,

Regards,
Nikolai

>
>         Arnd
>