Message-ID: <51830501a2c5969806e418b8843183f5@wizardsworks.org>
Date: Thu, 19 Jun 2025 16:32:08 -0700
From: Greg Chandler <chandleg@...ardsworks.org>
To: "Maciej W. Rozycki" <macro@...am.me.uk>
Cc: "Maciej W. Rozycki" <macro@...am.me.uk>,
Florian Fainelli
<f.fainelli@...il.com>, stable@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: Tulip 21142 panic on physical link disconnect
On 2025/06/19 15:56, Greg Chandler wrote:
> On 2025/06/19 14:53, Maciej W. Rozycki wrote:
>> On Thu, 19 Jun 2025, Florian Fainelli wrote:
>>
>>> > Maybe it'll ring someone's bell and they'll chime in or otherwise I'll
>>> > bisect it... sometime. Or feel free to start yourself with 5.18, as it's
>>> > not terribly old, only a bit and certainly not so as 2.6 is.
>>>
>>> I am still not sure why I could not see that warning on my Cobalt
>>> Qube2 trying to reproduce Greg's original issue, that is with an IP
>>> assigned on the interface yanking the cable did not trigger a timer
>>> warning. It could be that machine is orders of magnitude slower and
>>> has a different CONFIG_HZ value that just made it less likely to be
>>> seen?
>>
>> Can it have a different PHY attached? There's this code:
>>
>> 	if (tp->chip_id == PNIC2)
>> 		tp->link_change = pnic2_lnk_change;
>> 	else if (tp->flags & HAS_NWAY)
>> 		tp->link_change = t21142_lnk_change;
>> 	else if (tp->flags & HAS_PNICNWAY)
>> 		tp->link_change = pnic_lnk_change;
>>
>> in `tulip_init_one' and `pnic_lnk_change' won't ever trigger this, but
>> the other two can; apparently the corresponding comment in
>> `tulip_interrupt':
>>
>> 	/*
>> 	 * NB: t21142_lnk_change() does a del_timer_sync(), so be careful
>> 	 * if this call is ever done under the spinlock
>> 	 */
>>
>> hasn't been updated when `pnic2_lnk_change' was added. Also ISTM no
>> link change handler is a valid option too, in which case
>> `del_timer_sync' won't be called either. This is from a cursory glance
>> only, so please take with a pinch of salt.
>>
>> Maciej
>
>
>
>
> I'm not sure which of us that was directed at, but for my onboard
> tulips:
>
> Micro Linear ML6698CH <- PHY
> Intel 21143-TD <- NIC
>
> I know that the ML chips are most commonly used with 21143s and a very
> small smattering of others, I don't think they are all that common at
> least not since the late '90s..
> I'm relatively certain all my DEC ISA/PCI nics use them though.
>
> I found a link to the datasheet (If needed), but have had mixed luck
> with alldatasheets:
> https://www.alldatasheet.com/datasheet-pdf/pdf/75840/MICRO-LINEAR/ML6698CH.html
>
> Glancing over it I don't see anything about the link, I'll go stick my
> eyes in the driver a bit and see what stabs me in the eye....
That didn't take long. The first thing to jab its thumb in my eye was
this:
const struct tulip_chip_table tulip_tbl[] = {
  { }, /* placeholder for array, slot unused currently */
  { }, /* placeholder for array, slot unused currently */

  /* DC21140 */
  { "Digital DS21140 Tulip", 128, 0x0001ebef,
	HAS_MII | HAS_MEDIA_TABLE | CSR12_IN_SROM | HAS_PCI_MWI,
	tulip_timer, tulip_media_task },

  /* DC21142, DC21143 */
  { "Digital DS21142/43 Tulip", 128, 0x0801fbff,
	HAS_MII | HAS_MEDIA_TABLE | ALWAYS_CHECK_MII | HAS_ACPI | HAS_NWAY
	| HAS_INTR_MITIGATION | HAS_PCI_MWI,
	tulip_timer, t21142_media_task },
To my knowledge the alpha ev6 platform has never had ACPI. This machine
certainly doesn't, and in my kernel config the ACPI options aren't even
listed, whereas on my other platforms they are either enabled or
commented out.
Other alphas (ev67 or ev7) might have it, but that's not likely either.
I know for sure that the ev4, ev45, ev5, and ev56 architectures did not,
as the ACPI standard either hadn't been ratified yet or hadn't been
around long enough to make it into production chipsets and boards.
I will see if I can find a link between not having ACPI and this issue;
it's possible that the other instances you mentioned have the same
problem, or that they do have ACPI and have it disabled for one reason
or another.
The second potential issue I see is that I don't know off-hand what PCI
MWI is.
The HAS_PCI_MWI flag is only found in the tulip driver and nowhere else
in the kernel tree:
root@...stellation:/tmp/tmp/linux-6.12.12/drivers/net/ethernet/dec/tulip# grep -R HAS_PCI_MWI ../../../../../
grep: ../../../../../drivers/net/ethernet/dec/tulip/tulip.ko: binary file matches
grep: ../../../../../drivers/net/ethernet/dec/tulip/eeprom.o: binary file matches
grep: ../../../../../drivers/net/ethernet/dec/tulip/interrupt.o: binary file matches
../../../../../drivers/net/ethernet/dec/tulip/tulip.h: HAS_PCI_MWI = 0x01000,
../../../../../drivers/net/ethernet/dec/tulip/tulip_core.c: HAS_MII | HAS_MEDIA_TABLE | CSR12_IN_SROM | HAS_PCI_MWI, tulip_timer,
../../../../../drivers/net/ethernet/dec/tulip/tulip_core.c: | HAS_INTR_MITIGATION | HAS_PCI_MWI, tulip_timer, t21142_media_task },
../../../../../drivers/net/ethernet/dec/tulip/tulip_core.c: HAS_MII | HAS_NWAY | HAS_8023X | HAS_PCI_MWI, pnic2_timer, },
../../../../../drivers/net/ethernet/dec/tulip/tulip_core.c: | HAS_NWAY | HAS_PCI_MWI, tulip_timer, tulip_media_task },
../../../../../drivers/net/ethernet/dec/tulip/tulip_core.c: if (!force_csr0 && (tp->flags & HAS_PCI_MWI))
grep: ../../../../../drivers/net/ethernet/dec/tulip/tulip.o: binary file matches
grep: ../../../../../drivers/net/ethernet/dec/tulip/tulip_core.o: binary file matches
It's defined in tulip.h, in what looks like an enum of table flags:
enum tbl_flag {
	HAS_MII                = 0x00001,
	HAS_MEDIA_TABLE        = 0x00002,
	CSR12_IN_SROM          = 0x00004,
	ALWAYS_CHECK_MII       = 0x00008,
	HAS_ACPI               = 0x00010,
	MC_HASH_ONLY           = 0x00020, /* Hash-only multicast filter. */
	HAS_PNICNWAY           = 0x00080,
	HAS_NWAY               = 0x00040, /* Uses internal NWay xcvr. */
	HAS_INTR_MITIGATION    = 0x00100,
	IS_ASIX                = 0x00200,
	HAS_8023X              = 0x00400,
	COMET_MAC_ADDR         = 0x00800,
	HAS_PCI_MWI            = 0x01000,
	HAS_PHY_IRQ            = 0x02000,
	HAS_SWAPPED_SEEPROM    = 0x04000,
	NEEDS_FAKE_MEDIA_TABLE = 0x08000,
	COMET_PM               = 0x10000,
};
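
For reference, here is a rough userspace sketch (not driver code) of how
these bits combine for the "Digital DS21142/43 Tulip" entry quoted above.
The flag values and the flag list are copied from the excerpts in this
message; the names flag_names, decode_tbl_flags and ds21142_flags are
made up for illustration and do not exist in the kernel. Working it
through, the entry has HAS_NWAY set and HAS_PNICNWAY clear, which, going
by the `tulip_init_one' snippet Maciej quoted and assuming chip_id is not
PNIC2, would select t21142_lnk_change.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Flag values copied from the tbl_flag enum in tulip.h as quoted above. */
enum tbl_flag {
	HAS_MII                = 0x00001,
	HAS_MEDIA_TABLE        = 0x00002,
	CSR12_IN_SROM          = 0x00004,
	ALWAYS_CHECK_MII       = 0x00008,
	HAS_ACPI               = 0x00010,
	MC_HASH_ONLY           = 0x00020,
	HAS_NWAY               = 0x00040,
	HAS_PNICNWAY           = 0x00080,
	HAS_INTR_MITIGATION    = 0x00100,
	IS_ASIX                = 0x00200,
	HAS_8023X              = 0x00400,
	COMET_MAC_ADDR         = 0x00800,
	HAS_PCI_MWI            = 0x01000,
	HAS_PHY_IRQ            = 0x02000,
	HAS_SWAPPED_SEEPROM    = 0x04000,
	NEEDS_FAKE_MEDIA_TABLE = 0x08000,
	COMET_PM               = 0x10000,
};

/* Hypothetical helper table: one entry per flag, in ascending bit order. */
struct flag_name {
	unsigned int bit;
	const char *name;
};

#define FLAG(f) { f, #f }
static const struct flag_name flag_names[] = {
	FLAG(HAS_MII), FLAG(HAS_MEDIA_TABLE), FLAG(CSR12_IN_SROM),
	FLAG(ALWAYS_CHECK_MII), FLAG(HAS_ACPI), FLAG(MC_HASH_ONLY),
	FLAG(HAS_NWAY), FLAG(HAS_PNICNWAY), FLAG(HAS_INTR_MITIGATION),
	FLAG(IS_ASIX), FLAG(HAS_8023X), FLAG(COMET_MAC_ADDR),
	FLAG(HAS_PCI_MWI), FLAG(HAS_PHY_IRQ), FLAG(HAS_SWAPPED_SEEPROM),
	FLAG(NEEDS_FAKE_MEDIA_TABLE), FLAG(COMET_PM),
};

/*
 * Hypothetical helper: write the space-separated names of all set bits
 * into buf (truncating if buf is too small) and return how many flags
 * were set.
 */
static int decode_tbl_flags(unsigned int flags, char *buf, size_t len)
{
	size_t i;
	int count = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(flag_names) / sizeof(flag_names[0]); i++) {
		if (!(flags & flag_names[i].bit))
			continue;
		if (count > 0)
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, flag_names[i].name, len - strlen(buf) - 1);
		count++;
	}
	return count;
}

/* Flag word of the "Digital DS21142/43 Tulip" entry quoted above. */
static const unsigned int ds21142_flags =
	HAS_MII | HAS_MEDIA_TABLE | ALWAYS_CHECK_MII | HAS_ACPI |
	HAS_NWAY | HAS_INTR_MITIGATION | HAS_PCI_MWI;
```

To try it, call decode_tbl_flags(ds21142_flags, buf, sizeof(buf)) from a
small main() and printf() the buffer. This only tells you which bits the
table entry sets; it says nothing about what the driver does with
HAS_ACPI or HAS_PCI_MWI at runtime.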