Message-ID: <60B560A8.8000800@gmail.com>
Date: Tue, 01 Jun 2021 01:18:16 +0300
From: Nikolai Zhubr <zhubr.2@...il.com>
To: Arnd Bergmann <arnd@...nel.org>
CC: netdev <netdev@...r.kernel.org>, Jeff Garzik <jgarzik@...ox.com>
Subject: Re: Realtek 8139 problem on 486.
Hi all,
Some more results follow. I'll report on all suggestions here in one go
for brevity.
> One possible issue is that the "RTL_W16 (IntrStatus, TxErr)" can
> leak out of the spinlock unless it is changed to RTL_W16_F(), but
> I don't see how that would cause your problem. This is probably
> not the issue here, but it can't hurt to change that. Similarly,
> the "RTL_W16 (IntrStatus, ackstat)" would need the same _F
> to ensure that a normal TX-only interrupt gets acked before the
> spinlock.
Just tested with "_F" added to all of them, did not help.
> Another observation I have is that the loop used to be around
> "RTL_R16(IntrStatus); rtl8139_rx(); rtl8139_tx_interrupt()", so
> removing the loop also means that the tx handler is only called
> once when it used to be called for every loop iteration.
> If this is what triggers the problem, you should be able to break
> it the same way by moving the rtl8139_tx_interrupt() ahead of the
> loop, and adjusting the RTL_W16 (IntrStatus, ackstat) accordingly
> so you only Ack the TX before calling rtl8139_tx_interrupt().
I get the general idea, but I'm not sure exactly how you propose to move
rtl8139_tx_interrupt() and adjust the RTL_W16 (IntrStatus, ackstat).
But meanwhile, I tried a dumb thing instead, and it worked!
I've put back The Loop:
---------------------------
+ int boguscnt = 20;
spin_lock (&tp->lock);
+ do {
status = RTL_R16 (IntrStatus);
/* shared irq? */
@@ -2181,6 +2183,8 @@
if (status & TxErr)
RTL_W16 (IntrStatus, TxErr);
}
+ boguscnt--;
+ } while (boguscnt > 0);
out:
---------------------------
With this added, the connection works fine again. Of course it is silly, but
hopefully it points the way to a real fix.
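
To make the control flow of that workaround easier to discuss, here is a
minimal userspace sketch of a bounded re-poll loop. It is not the driver
code: struct fake_nic, fake_read_status(), fake_ack(), fake_interrupt() and
FAKE_TX_OK are all hypothetical names invented for illustration, and the
"late events" counter is only an assumed model of status bits being raised
again while the handler is still running. It shows what the loop does, not
why a single pass fails on this hardware.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_TX_OK 0x0004u   /* hypothetical status bit, illustration only */

/* Hypothetical stand-in for the chip (not a real 8139 structure). */
struct fake_nic {
    uint16_t intr_status;    /* pending, un-acked event bits */
    int late_events;         /* events that will still arrive during handling */
};

/* Models reading the interrupt status register. */
static uint16_t fake_read_status(const struct fake_nic *nic)
{
    return nic->intr_status;
}

/* Models a write-1-to-clear ack. Assumption: while late events remain,
 * a new one is raised right after each ack. */
static void fake_ack(struct fake_nic *nic, uint16_t bits)
{
    nic->intr_status &= (uint16_t)~bits;
    if (nic->late_events > 0) {
        nic->late_events--;
        nic->intr_status |= FAKE_TX_OK;
    }
}

/* One handler invocation. max_passes == 1 models the loop-free handler,
 * max_passes == 20 models the boguscnt loop. Returns the number of passes
 * that found work to do. */
static int fake_interrupt(struct fake_nic *nic, int max_passes)
{
    int boguscnt = max_passes;
    int handled = 0;

    assert(max_passes > 0);
    do {
        uint16_t status = fake_read_status(nic);

        if (status == 0)     /* nothing pending: leave the loop early */
            break;
        fake_ack(nic, status);
        handled++;
        boguscnt--;
    } while (boguscnt > 0);

    return handled;
}
```

In this model a single pass returns with a status bit still pending, while
the bounded loop keeps going until the status reads back as zero or the
budget of passes runs out.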
> What's your qdisc? Recently there was a bug related to the lockless
> pfifo_fast qdisc
If I understand correctly, this means the packet scheduler type. In more
recent kernels I typically have CONFIG_DEFAULT_NET_SCH="fq_codel"; now
in 2.6.3 no explicit scheduler is enabled, so it must be some fast
FIFO. But as the symptoms were basically identical in e.g. 2.6.3 and
4.14, I suppose it is unlikely to be the cause.
> Issue could be related to rx and tx processing now potentially running in parallel.
> I only have access to the current 8139too source code, hopefully the following
> works on the old version:
>
> In the end of rtl8139_start_xmit() there's
> if ((tp->cur_tx - NUM_TX_DESC) == tp->dirty_tx)
> netif_stop_queue (dev);
>
> Try changing this to
OK, the changes compiled fine, but unfortunately made no noticeable
difference.
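
For reference, the quoted ring-full check relies on cur_tx and dirty_tx
being free-running unsigned counters, so the subtraction stays correct
across wraparound. Below is a small standalone sketch of that arithmetic
only. struct fake_tx_ring, tx_ring_full() and tx_slot() are hypothetical
names, the uint32_t width is my assumption (picked to make the wraparound
explicit), and NUM_TX_DESC is set to 4, which as far as I know is the
driver's value.

```c
#include <stdint.h>

#define NUM_TX_DESC 4u   /* ring size; assumed to match the driver */

/* Hypothetical mirror of the two counters used in the quoted check. */
struct fake_tx_ring {
    uint32_t cur_tx;     /* next descriptor to be filled (free-running) */
    uint32_t dirty_tx;   /* oldest descriptor not yet reclaimed (free-running) */
};

/* Same comparison as in the quoted snippet: the ring is full when
 * exactly NUM_TX_DESC descriptors are outstanding. Unsigned arithmetic
 * keeps this valid even after cur_tx has wrapped past zero. */
static int tx_ring_full(const struct fake_tx_ring *r)
{
    return (uint32_t)(r->cur_tx - NUM_TX_DESC) == r->dirty_tx;
}

/* Maps a free-running counter onto a ring slot index. */
static unsigned tx_slot(uint32_t counter)
{
    return (unsigned)(counter % NUM_TX_DESC);
}
```

For example, with dirty_tx at 0xFFFFFFFE and four packets queued, cur_tx
has wrapped to 2 and the check still reports a full ring.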
Thank you,
Regards,
Nikolai