Message-ID: <49f40dd8-da68-f579-b359-7a7e229565e1@gmail.com>
Date: Tue, 1 Jun 2021 00:30:19 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: Nikolai Zhubr <zhubr.2@...il.com>, Arnd Bergmann <arnd@...nel.org>
Cc: netdev <netdev@...r.kernel.org>, Jeff Garzik <jgarzik@...ox.com>
Subject: Re: Realtek 8139 problem on 486.
On 01.06.2021 00:18, Nikolai Zhubr wrote:
> Hi all,
>
> Some more results follow. I'll report on all suggestions here in one go for brevity.
>
>> One possible issue is that the "RTL_W16 (IntrStatus, TxErr)" can
>> leak out of the spinlock unless it is changed to RTL_W16_F(), but
>> I don't see how that would cause your problem. This is probably
>> not the issue here, but it can't hurt to change that. Similarly,
>> the "RTL_W16 (IntrStatus, ackstat)" would need the same _F
>> to ensure that a normal TX-only interrupt gets acked before the
>> spinlock.
>
> Just tested with "_F" added to all of them, did not help.
>
>> Another observation I have is that the loop used to be around
>> "RTL_R16(IntrStatus); rtl8139_rx(); rtl8139_tx_interrupt()", so
>> removing the loop also means that the tx handler is only called
>> once when it used to be called for every loop iteration.
>> If this is what triggers the problem, you should be able to break
>> it the same way by moving the rtl8139_tx_interrupt() ahead of the
>> loop, and adjusting the RTL_W16 (IntrStatus, ackstat) accordingly
>> so you only Ack the TX before calling rtl8139_tx_interrupt().
>
> I get the idea in general, but I'm not sure how exactly you propose to move rtl8139_tx_interrupt() and adjust the RTL_W16 (IntrStatus, ackstat).
> But meanwhile, I tried a dumb thing instead, and it worked!
> I've put back The Loop:
> ---------------------------
> + int boguscnt = 20;
>
> spin_lock (&tp->lock);
> + do {
> status = RTL_R16 (IntrStatus);
>
> /* shared irq? */
> @@ -2181,6 +2183,8 @@
> if (status & TxErr)
> RTL_W16 (IntrStatus, TxErr);
> }
> + boguscnt--;
> + } while (boguscnt > 0);
> out:
> ---------------------------
> With this added, connection works fine again. Of course it is silly, but hopefully it gives a path for a real fix.
>
What was discussed here 16 years ago should sound familiar to you:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg92234.html
"It was an option in my BIOS PCI level/edge settings as I posted."
You could check whether your BIOS has the same or a similar option
and play with it.
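
To illustrate why the level/edge setting could matter here, below is a small user-space model. It is only a sketch under assumptions, not code from 8139too: the names fake_nic, handler_pass and run are made up, the status register is assumed to be write-1-to-clear, and the IRQ line is assumed to be high whenever any status bit is set. In this model, a second event that arrives between the status read and the ack keeps the line high, so an edge-triggered controller never sees a new edge unless the handler loops. Unlike the loop quoted above, the loop here also stops as soon as the status reads zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits, loosely modelled on RxOK/TxOK. */
enum { RX_OK = 0x01, TX_OK = 0x04 };

/* Hypothetical device model: pending status bits plus a counter. */
struct fake_nic {
    uint16_t status;   /* pending events, assumed write-1-to-clear */
    int      handled;  /* number of events the handler processed */
};

/* One handler pass: read status, let a late event sneak in,
 * ack only the bits that were read, then "process" them. */
static void handler_pass(struct fake_nic *nic, uint16_t late_event)
{
    uint16_t seen = nic->status;        /* like RTL_R16(IntrStatus) */
    nic->status |= late_event;          /* event raised after the read */
    nic->status &= (uint16_t)~seen;     /* like RTL_W16(IntrStatus, seen) */

    while (seen) {                      /* count each bit that was seen */
        if (seen & 1)
            nic->handled++;
        seen >>= 1;
    }
}

/* Raise RX_OK, then TX_OK during the first handler pass.
 * Returns how many of the two events were handled. */
static int run(bool edge_triggered, int boguscnt)
{
    struct fake_nic nic = { RX_OK, 0 };
    uint16_t late = TX_OK;
    bool line_was_low = true;

    assert(boguscnt > 0);

    for (int step = 0; step < 10; step++) {   /* controller sampling steps */
        bool line = nic.status != 0;
        /* Edge mode fires only on a low-to-high transition,
         * level mode fires whenever the line is high. */
        bool fire = edge_triggered ? (line && line_was_low) : line;
        line_was_low = !line;
        if (!fire)
            continue;

        int budget = boguscnt;
        do {
            handler_pass(&nic, late);
            late = 0;                          /* the late event happens once */
        } while (--budget > 0 && nic.status != 0);
    }
    return nic.handled;
}
```

With these assumptions, run(true, 1) handles only one of the two events and then stalls, while run(true, 20) and run(false, 1) handle both. That would be consistent with the loop hiding a trigger-mode problem, but it is only a model and proves nothing about the real hardware.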
>> What's your qdisc? Recently there was a bug related to the lockless
>> pfifo_fast qdisc
>
> If I understand correctly, this means the packet scheduler type. In more recent kernels I typically have CONFIG_DEFAULT_NET_SCH="fq_codel"; in 2.6.3 no explicit scheduler is enabled, so it must be some fast fifo. But as the symptoms were basically identical in e.g. 2.6.3 and 4.14, I suppose it is unlikely to be the cause.
>
>> Issue could be related to rx and tx processing now potentially running in parallel.
>> I only have access to the current 8139too source code, hopefully the following
>> works on the old version:
>>
>> In the end of rtl8139_start_xmit() there's
>> if ((tp->cur_tx - NUM_TX_DESC) == tp->dirty_tx)
>> netif_stop_queue (dev);
>>
>> Try changing this to
>
> Ok, the changes compiled fine, but unfortunately made no noticeable difference.
>
>
> Thank you,
>
> Regards,
> Nikolai
>
>