lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <241a75e4-2322-8937-2bde-97a383284976@gmx.de>
Date:   Sat, 18 Sep 2021 21:37:17 +0200
From:   Lino Sanfilippo <LinoSanfilippo@....de>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Saravana Kannan <saravanak@...gle.com>,
        "Rafael J. Wysocki" <rafael@...nel.org>, p.rosenberger@...bus.com,
        woojung.huh@...rochip.com, UNGLinuxDriver@...rochip.com,
        vivien.didelot@...il.com, davem@...emloft.net, kuba@...nel.org,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/3] Fix for KSZ DSA switch shutdown

Hi Vladimir,

On 15.09.21 at 07:42, Lino Sanfilippo wrote:
> On 14.09.21 at 20:48, Vladimir Oltean wrote:
>> On Mon, Sep 13, 2021 at 01:01:20PM +0200, Lino Sanfilippo wrote:
>>>>>> Could you post the full kernel output? The picture you've posted is
>>>>>> truncated and only shows a WARN_ON in rpi_firmware_transaction and is
>>>>>> probably a symptom and not the issue (which is above and not shown).
>>>>>>
>>>>>
>>>>> Unfortunately I dont see anything in the kernel log. The console output is all I get,
>>>>> thats why I made the photo.
>>>>
>>>> To clarify, are you saying nothing above this line gets printed? Because
>>>> the part of the log you've posted in the picture is pretty much
>>>> unworkable:
>>>>
>>>> [   99.375389] [<bf0dc56c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863ca0>] (platform_drv_shutdown+0x2c/0x30)
>>>>
>>>> How do you access the device's serial console? Use a program with a
>>>> scrollback buffer like GNU screen or something.
>>>>
>>>
>>> Ah no, this is not over a serial console. This is what I see via hdmi. I do not have a working serial connection yet.
>>> Sorry I know this trace part is not very useful, I will try to get a full dump.
>>
>> Lino, are you going to provide a kernel output so I could look at your new breakage?
>> If you could set up a pstore logger with a ramoops region, you could
>> dump the log after the fact. Or if HDMI is all you have, you could use
>> an HDMI capture card to record it. Or just record the screen you're
>> looking at, as long as you don't have very shaky hands, whatever...
>>
>
> Yes, I will try to get something useful. I have already set up a serial connection
> now. I still see the shutdown stopping with your patch but I have not seen the
> kernel dump any more. I will try further and provide a dump as soon as I am successful.
>

Sorry for the delay. I was finally able to do some tests and get a dump via the serial console.
I tested with the latest Raspberry Pi kernel 5.10.y. Based on commit
4117cba235d24a7c4630dc38cb55cc80a04f5cf3. I applied your patches and got the following result
at shutdown:

raspberrypi login: [   58.754533] ------------[ cut here ]------------
[   58.760053] kernel BUG at drivers/net/phy/mdio_bus.c:651!
[   58.766361] Internal error: Oops - BUG: 0 [#1] SMP ARM
[   58.772376] Modules linked in: 8021q garp at24 tag_ksz tpm_tis_spi ksz9477_spi tpm_tis_core ksz9477 ksz_common tpm rts
[   58.837539] CPU: 3 PID: 1 Comm: systemd-shutdow Tainted: G         C        5.10.63-RP_PURE_510_VLADFIX+ #3
[   58.848388] Hardware name: BCM2711
[   58.852875] PC is at mdiobus_free+0x4c/0x50
[   58.858143] LR is at devm_mdiobus_free+0x1c/0x20
[   58.863853] pc : [<c08c9218>]    lr : [<c08c1898>]    psr: 80000013
[   58.871212] sp : c18fdc38  ip : c18fdc48  fp : c18fdc44
[   58.877505] r10: 00000000  r9 : c0867104  r8 : c18fdc5c
[   58.883823] r7 : 00000013  r6 : c31c8000  r5 : c3a50000  r4 : c379db80
[   58.891442] r3 : c2ab4000  r2 : 00000002  r1 : c379dbc0  r0 : c2ab4000
[   58.899037] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   58.907297] Control: 30c5383d  Table: 03ac92c0  DAC: 55555555
[   58.914139] Process systemd-shutdow (pid: 1, stack limit = 0xff8113c1)
[   58.921774] Stack: (0xc18fdc38 to 0xc18fe000)
[   58.927285] dc20:                                                       c18fdc54 c18fdc48
[   58.936601] dc40: c08c1898 c08c91d8 c18fdc94 c18fdc58 c0866dac c08c1888 c31c819c c3527180
[   58.945921] dc60: c332d200 c1405048 c32f8800 c31c8000 00000000 bf191010 00000000 c32f8800
[   58.955289] dc80: c1095f3c c1aa6454 c18fdcac c18fdc98 c086715c c0866be8 c31c8000 00000000
[   58.964644] dca0: c18fdccc c18fdcb0 c0862c7c c0867128 c1a42e30 c31c8000 c14f7cf0 00000000
[   58.974018] dcc0: c18fdcdc c18fdcd0 c0862d40 c0862b68 c18fdcfc c18fdce0 c08613dc c0862d2c
[   58.983391] dce0: c31c8000 00000a68 c08ba6cc 00000000 c18fdd44 c18fdd00 c085c710 c086130c
[   58.992778] dd00: c0331394 c0332604 60000013 c18fdd74 c3656294 c1405048 c31c8000 c31c8000
[   59.002140] dd20: 00000000 c08ba6cc c160657c c155c018 c1095f3c c1aa6454 c18fdd5c c18fdd48
[   59.011521] dd40: c08ba6a8 c085c58c 00000000 00000000 c18fdd6c c18fdd60 c08ba6e4 c08ba670
[   59.020921] dd60: c18fdd9c c18fdd70 c085bc84 c08ba6d8 c18fdd8c c3656200 c3656394 c1405048
[   59.030334] dd80: c18fdda4 c32f8800 c32f8800 00000003 c18fddbc c18fdda0 c08bab7c c085bc20
[   59.039737] dda0: c32f8b80 c32f8800 00000000 c160657c c18fdddc c18fddc0 bf182554 c08bab4c
[   59.049164] ddc0: c1aa6400 c1a6e810 c1aa6410 c160657c c18fddf4 c18fdde0 bf1825a8 bf18252c
[   59.058602] dde0: c1aa6414 c1a6e810 c18fde04 c18fddf8 c0863dec bf182598 c18fde3c c18fde08
[   59.068057] de00: c085fd9c c0863dcc c18fde3c c1095f2c c024865c 00000000 00000000 620bef00
[   59.077487] de20: c140f510 fee1dead c18fc000 00000058 c18fde4c c18fde40 c0249c84 c085fc0c
[   59.086920] de40: c18fde64 c18fde50 c0249d74 c0249c4c 01234567 00000000 c18fdf94 c18fde68
[   59.096386] de60: c024a018 c0249d64 c18fded4 c31b0c00 00000024 c18fdf58 00000005 c0441cec
[   59.105852] de80: c18fdec4 c18fde90 c0441b30 c049852c 00000000 c18fdea0 c073ad04 00000024
[   59.115330] dea0: c31b0c00 c18fdf58 c18fded4 c31b0c00 00000005 00000000 c18fdf4c c18fdec8
[   59.124821] dec0: c0441cec c0425cb0 c18fded0 c18fded4 00000000 00000005 00000000 00000024
[   59.134317] dee0: c18fdeec 00000005 c0200074 bec45250 00000004 bec45f62 00000010 bec45264
[   59.143792] df00: 00000005 bec4531c 0000000a b6d10040 00000001 c0200e70 ffffe000 c1546a80
[   59.153282] df20: 00000000 c0467268 c18fdf4c c1405048 c31b0c00 bec4528c 00000000 00000000
[   59.162787] df40: c18fdf94 c18fdf50 c0441e6c c0441c50 00000000 00000000 00000000 00000000
[   59.172269] df60: c18fdf94 c1405048 c0331394 c1405048 bec4531c 00000000 00000000 00000000
[   59.181763] df80: 00000058 c0200204 c18fdfa4 c18fdf98 c024a16c c0249f10 00000000 c18fdfa8
[   59.191250] dfa0: c0200040 c024a160 00000000 00000000 fee1dead 28121969 01234567 620bef00
[   59.200735] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
[   59.210245] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38 60000010 fee1dead 00000000 00000000
[   59.219759] Backtrace:
[   59.223546] [<c08c91cc>] (mdiobus_free) from [<c08c1898>] (devm_mdiobus_free+0x1c/0x20)
[   59.232909] [<c08c187c>] (devm_mdiobus_free) from [<c0866dac>] (release_nodes+0x1d0/0x220)
[   59.242551] [<c0866bdc>] (release_nodes) from [<c086715c>] (devres_release_all+0x40/0x60)
[   59.252132]  r10:c1aa6454 r9:c1095f3c r8:c32f8800 r7:00000000 r6:bf191010 r5:00000000
[   59.261338]  r4:c31c8000
[   59.265239] [<c086711c>] (devres_release_all) from [<c0862c7c>] (device_release_driver_internal+0x120/0x1c4)
[   59.276479]  r5:00000000 r4:c31c8000
[   59.281440] [<c0862b5c>] (device_release_driver_internal) from [<c0862d40>] (device_release_driver+0x20/0x24)
[   59.292802]  r7:00000000 r6:c14f7cf0 r5:c31c8000 r4:c1a42e30
[   59.299900] [<c0862d20>] (device_release_driver) from [<c08613dc>] (bus_remove_device+0xdc/0x108)
[   59.310267] [<c0861300>] (bus_remove_device) from [<c085c710>] (device_del+0x190/0x428)
[   59.319748]  r7:00000000 r6:c08ba6cc r5:00000a68 r4:c31c8000
[   59.326896] [<c085c580>] (device_del) from [<c08ba6a8>] (spi_unregister_device+0x44/0x68)
[   59.336583]  r10:c1aa6454 r9:c1095f3c r8:c155c018 r7:c160657c r6:c08ba6cc r5:00000000
[   59.345924]  r4:c31c8000
[   59.349971] [<c08ba664>] (spi_unregister_device) from [<c08ba6e4>] (__unregister+0x18/0x20)
[   59.359870]  r5:00000000 r4:00000000
[   59.364972] [<c08ba6cc>] (__unregister) from [<c085bc84>] (device_for_each_child+0x70/0xb4)
[   59.374899] [<c085bc14>] (device_for_each_child) from [<c08bab7c>] (spi_unregister_controller+0x3c/0x128)
[   59.385979]  r6:00000003 r5:c32f8800 r4:c32f8800
[   59.392086] [<c08bab40>] (spi_unregister_controller) from [<bf182554>] (bcm2835_spi_remove+0x34/0x6c [spi_bcm2835])
[   59.404000]  r7:c160657c r6:00000000 r5:c32f8800 r4:c32f8b80
[   59.411084] [<bf182520>] (bcm2835_spi_remove [spi_bcm2835]) from [<bf1825a8>] (bcm2835_spi_shutdown+0x1c/0x38 [spi_bc)
[   59.423755]  r7:c160657c r6:c1aa6410 r5:c1a6e810 r4:c1aa6400
[   59.430847] [<bf18258c>] (bcm2835_spi_shutdown [spi_bcm2835]) from [<c0863dec>] (platform_drv_shutdown+0x2c/0x30)
[   59.442613]  r5:c1a6e810 r4:c1aa6414
[   59.447635] [<c0863dc0>] (platform_drv_shutdown) from [<c085fd9c>] (device_shutdown+0x19c/0x24c)
[   59.457932] [<c085fc00>] (device_shutdown) from [<c0249c84>] (kernel_restart_prepare+0x44/0x48)
[   59.468135]  r10:00000058 r9:c18fc000 r8:fee1dead r7:c140f510 r6:620bef00 r5:00000000
[   59.477470]  r4:00000000
[   59.481509] [<c0249c40>] (kernel_restart_prepare) from [<c0249d74>] (kernel_restart+0x1c/0x60)
[   59.491653] [<c0249d58>] (kernel_restart) from [<c024a018>] (__do_sys_reboot+0x114/0x1f8)
[   59.501359]  r5:00000000 r4:01234567
[   59.506447] [<c0249f04>] (__do_sys_reboot) from [<c024a16c>] (sys_reboot+0x18/0x1c)
[   59.515628]  r8:c0200204 r7:00000058 r6:00000000 r5:00000000 r4:00000000
[   59.523857] [<c024a154>] (sys_reboot) from [<c0200040>] (ret_fast_syscall+0x0/0x28)
[   59.533038] Exception stack(0xc18fdfa8 to 0xc18fdff0)
[   59.539607] dfa0:                   00000000 00000000 fee1dead 28121969 01234567 620bef00
[   59.549318] dfc0: 00000000 00000000 00000000 00000058 00000fff bec45be8 00000000 00476b80
[   59.559026] dfe0: 00488e3c bec45b68 004734a8 b6e4ca38
[   59.565596] Code: ebfe49f5 e89da800 ebed72a3 e89da800 (e7f001f2)
[   59.573246] ---[ end trace 7d800ce7b5664bb6 ]---
[   59.579413] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[   59.588634] Rebooting in 10 seconds..

The concerning source code line 651 is in my case:

void mdiobus_free(struct mii_bus *bus)
{
	/* For compatibility with error handling in drivers. */
	if (bus->state == MDIOBUS_ALLOCATED) {
		kfree(bus);
		return;
	}

651<	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
	bus->state = MDIOBUS_RELEASED;

	put_device(&bus->dev);
}
EXPORT_SYMBOL(mdiobus_free);

I tested with both versions of your patchset, with the same result. I also tested
with a RP 5.14 kernel (the latest RP kernel) but I did not see the original issue
(i.e. the system hang) here for some reason.

I then tried to get the net-next kernel running on my system but without success so far. So for
now the result with the RP 5.10 is all I can offer. I hope that helps a bit nevertheless.

Regards,
Lino



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ